Computer Science Thesis Oral
In Person - Reddy Conference Room, Gates Hillman 4405
LESLIE RICE, Ph.D. Candidate, Computer Science Department, Carnegie Mellon University
Methods for robust training and evaluation of deep neural networks
As machine learning systems are deployed in real-world, safety-critical applications, it becomes increasingly important to ensure these systems are robust and trustworthy. The study of machine learning robustness gained significant interest after the brittle nature of deep neural networks was discovered. Intrigue and concern about this behavior have produced a significant body of work on adversarial robustness, which studies a model's performance on worst-case perturbed inputs, known as adversarial examples. In the first chapter of this thesis, we present improvements on adversarial training methods for developing empirically robust deep networks.
First, we show that with certain modifications, adversarial training using the fast gradient sign method can result in models that are significantly more robust than previously thought possible, while incurring a much lower training cost than alternative adversarial training methods. We then discuss our findings on the harmful effects of overfitting during adversarial training, and show that validation-based early stopping can drastically improve the robust test performance of an adversarially trained model. An increasing interest in more natural, non-adversarial settings of robustness has led researchers to alternatively measure robustness in terms of a model's average performance on randomly sampled input corruptions, a notion which also underlies standard data augmentation strategies.
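To make the training procedure concrete, the following is a minimal sketch of adversarial training with a single fast-gradient-sign-method step from a random start, illustrated on logistic regression so the gradients can be written analytically. All function names, the step size, and hyperparameters here are illustrative assumptions, not the thesis's exact configuration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(X, y, w, b, eps, rng):
    """One FGSM step from a random start inside the eps-ball (illustrative)."""
    delta = rng.uniform(-eps, eps, size=X.shape)
    Z = X + delta
    # Gradient of the logistic loss w.r.t. the input: (sigmoid(w.z + b) - y) * w
    grad = (sigmoid(Z @ w + b) - y)[:, None] * w[None, :]
    # Signed-gradient step, then project back into the eps-ball
    delta = np.clip(delta + eps * np.sign(grad), -eps, eps)
    return X + delta

def adversarial_train(X, y, eps=0.2, lr=0.1, epochs=200, seed=0):
    """Train on adversarially perturbed inputs instead of clean ones."""
    rng = np.random.default_rng(seed)
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        X_adv = fgsm_perturb(X, y, w, b, eps, rng)
        p = sigmoid(X_adv @ w + b)
        w -= lr * X_adv.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b
```

The random initialization of `delta` before the signed-gradient step is the kind of modification that distinguishes this from a naive single-step attack; validation-based early stopping would additionally monitor robust accuracy on a held-out set and stop at its peak.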
In the second chapter of this thesis, we generalize the seemingly separate notions of average and worst-case robustness under a unifying framework that allows us to evaluate models on a wide spectrum of robustness levels. For practical use, we introduce a path sampling-based method for accurately approximating this intermediate robustness objective. We show that we can train models to intermediate levels of robustness using this objective, and further explore alternative, more efficient methods for training that bridge the gap between average and worst-case robustness.
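One way to picture such a unifying objective is a q-norm over the perturbation distribution, which recovers the average-case loss at q = 1 and approaches the worst case as q grows. The sketch below is a naive Monte Carlo estimate of that quantity for illustration only; it is not the path-sampling estimator described above, and the function name, sampling distribution, and parameters are assumptions.

```python
import numpy as np

def intermediate_robustness(loss_fn, x, q=1.0, eps=0.5, n_samples=1000, seed=0):
    """Naive Monte Carlo estimate of ( E_delta[ loss(x + delta)^q ] )^(1/q),
    with delta drawn uniformly from the eps-ball (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    deltas = rng.uniform(-eps, eps, size=(n_samples,) + x.shape)
    losses = np.array([loss_fn(x + d) for d in deltas])
    # Log-sum-exp form for numerical stability at large q
    log_lq = q * np.log(losses + 1e-12)
    m = log_lq.max()
    return np.exp((m + np.log(np.mean(np.exp(log_lq - m)))) / q)
```

Sweeping q traces out the spectrum of robustness levels: small q weights all sampled corruptions roughly equally, while large q concentrates on the most damaging ones, so the estimate increases monotonically from the average loss toward the worst-case loss.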
Lastly, we use this metric to analyze and compare deep networks in zero-shot and fine-tuned settings to better understand the effects of large-scale pre-training and fine-tuning on robustness.
J. Zico Kolter (Chair)
Nicholas Carlini (Google Brain)