Emily Black
(Un)Fairness Along the AI Pipeline: Problems and Solutions

Degree Type: Ph.D. in Computer Science
Advisor(s): Matt Fredrikson
Graduated: August 2022

Abstract:
Artificial Intelligence (AI) systems now influence decisions impacting every aspect of people's lives, from the news articles they read to whether or not they receive a loan. While the use of AI may lead to great accuracy and efficiency in making these important decisions, recent news and research reports have shown that AI models can act unfairly: from gender bias in hiring models to racial bias in recidivism prediction systems. This thesis explores new methods for understanding and mitigating fairness issues in AI by considering how choices made throughout the process of creating an AI system, i.e., the modeling pipeline, impact fairness behavior. First, I will show how considering a model's end-to-end pipeline allows us to expand our understanding of unfair model behavior. In particular, my work introduces a connection between AI system stability and fairness by demonstrating how instability in certain parts of the modeling pipeline, namely the learning rule, can lead to unfairness by having important decisions rely on arbitrary modeling choices. Second, I will discuss how considering ML pipelines can help us expand our toolbox of bias mitigation techniques. In a case study investigating equity with respect to income in tax auditing practices, I will demonstrate how interventions made along the AI creation pipeline, even those not related to fairness on their face, can not only be effective for increasing fairness but can often reduce tradeoffs between predictive utility and fairness. Finally, I will close with an overview of the benefits and dangers of the flexibility that the AI modeling pipeline affords practitioners in the creation of their models, including a discussion of the legal repercussions of this flexibility, which I call model multiplicity.

Thesis Committee:
Matt Fredrikson (Chair)
Alexandra Chouldechova
Rayid Ghani
Hoda Heidari
Solon Barocas (Microsoft Research)

Srinivasan Seshan, Head, Computer Science Department
Martial Hebert, Dean, School of Computer Science

Keywords: Artificial Intelligence, Machine Learning, Deep Networks, Neural Networks, Deep Learning, Ethics, Fairness, Accountability, Explainability, Public Policy, machine learning pipeline, AI pipeline, stability, consistency, inconsistency, ensembling, counterfactual explanations, leave-one-out unfairness, tax policy, vertical equity, model multiplicity

CMU-CS-22-121.pdf (9.68 MB) (188 pages)