Shayak Sen
Influence-directed Explanations for Machine Learning
Degree Type: Ph.D. in Computer Science
Advisor(s): Anupam Datta
Graduated: May 2018

Abstract:

Increasingly, decisions and actions affecting people's lives are determined by automated systems processing personal data. Excitement about these systems has been accompanied by serious concerns about their opacity and the threats that they pose to privacy, fairness, and other values. Recognizing these concerns, it is important to make real-world automated decision-making systems accountable for privacy and fairness by enabling them to detect and explain violations of these values. System maintainers may leverage such accounts to repair systems to avoid future violations with minimal impact on the utility goals.

In this dissertation, we provide a basis for explaining how machine learning systems use information. These explanations increase trust in the functioning of the system, allowing us to verify that it not only makes the right decisions but also makes them for justifiable reasons. Further, explanations can be used to support detection of privacy and fairness violations, as well as to explain how they came about. We can then leverage this understanding to repair systems to avoid future violations.

We identify two major challenges to explaining information use in machine learning systems: (i) converged use: machine learning systems typically combine a large number of input features, and (ii) indirect use: these systems can typically infer and use information that is not directly provided to them.

Our approach to explaining how complex machine learning models use information involves answering two questions: (influence) Which factors were influential in determining outcomes? and (interpretation) What do these factors mean? We first present key results measuring the causal influence of factors in machine learning models.
We then examine the following settings: (i) systems with potential indirect use of information, and (ii) convolutional neural networks. For each setting we demonstrate how influence and interpretation combine to account for information use.

Thesis Committee:
Anupam Datta (Chair)
Jaime Carbonell
Mark Fredrikson
Sriram K. Rajamani (Microsoft Research India)
Jeannette M. Wing (Columbia University)

Frank Pfenning, Head, Computer Science Department
Andrew W. Moore, Dean, School of Computer Science

CMU-CS-18-107.pdf (3.06 MB) (134 pages)