Special Artificial Intelligence Seminar

Time:
11:00am ET

Location:
In Person and Virtual - ASA Conference Room, Gates Hillman 6115, and Zoom

Speaker:
SANAE LOTFI, Ph.D. Student in Data Science, Center for Data Science, New York University
https://sanaelotfi.github.io/


Are the Marginal Likelihood and PAC-Bayes Bounds the Right Proxies for Generalization?

How do we compare hypotheses that are entirely consistent with our observations? The marginal likelihood, which represents the probability of generating our observations from a prior, provides a distinctive approach to this foundational question. We first highlight the conceptual and practical issues in using the marginal likelihood as a proxy for generalization. In particular, we show that the marginal likelihood can be negatively correlated with generalization and can lead to both underfitting and overfitting in hyperparameter learning. We provide a partial remedy through a conditional marginal likelihood, which we show to be better aligned with generalization and practically valuable for large-scale hyperparameter learning, such as in deep kernel learning.
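As a rough sketch of the two quantities, in our own notation rather than necessarily that of the talk: for data D and model M with parameters w, the marginal likelihood averages the likelihood over the prior, while the conditional marginal likelihood first conditions on an initial portion of the data.

    p(\mathcal{D} \mid \mathcal{M}) = \int p(\mathcal{D} \mid w, \mathcal{M}) \, p(w \mid \mathcal{M}) \, dw

    p(\mathcal{D}_{m+1:n} \mid \mathcal{D}_{1:m}, \mathcal{M}) = \int p(\mathcal{D}_{m+1:n} \mid w, \mathcal{M}) \, p(w \mid \mathcal{D}_{1:m}, \mathcal{M}) \, dw

Intuitively, the second quantity scores the model's predictions on later data after updating on earlier data, which is closer in spirit to held-out validation than to prior-predictive evaluation.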

PAC-Bayes bounds are another expression of Occam's razor: simpler descriptions of the data generalize better. While there has been progress in developing tighter PAC-Bayes bounds for deep neural networks, these bounds tend to be uninformative about why deep learning works. In this talk, I will also present our compression approach, based on quantizing neural network parameters in a linear subspace, which substantially improves on previous results to provide state-of-the-art generalization bounds on a variety of tasks. We use these tight bounds to better understand the role of model size, equivariance, and the implicit biases of optimization for generalization in deep learning. Notably, our work shows that large models can be compressed to a much greater extent than previously known. Finally, I will discuss the connection between the marginal likelihood and PAC-Bayes bounds for model selection.
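For context, PAC-Bayes bounds typically take the following shape (a McAllester-style form; exact constants differ across variants, and this is an illustration rather than the specific bound of the talk). With a prior P over hypotheses fixed before seeing the n training examples, a posterior Q, empirical risk \hat{L}, and population risk L, with probability at least 1 - \delta:

    \mathbb{E}_{h \sim Q}[L(h)] \le \mathbb{E}_{h \sim Q}[\hat{L}(h)] + \sqrt{ \frac{ \mathrm{KL}(Q \,\|\, P) + \ln(2\sqrt{n}/\delta) }{ 2n } }

The KL term is the Occam penalty: a posterior concentrated on networks with short descriptions under the prior, for example heavily quantized parameters in a low-dimensional subspace, keeps KL(Q || P) small and the bound tight, which is the mechanism behind compression-based bounds.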

Sanae Lotfi is a Ph.D. student at NYU, advised by Professor Andrew Gordon Wilson. Sanae works on the foundations of deep learning. Her goal is to understand and quantify generalization in deep learning, and to use this understanding to build more robust and reliable machine learning models. Sanae's Ph.D. research has been recognized with an ICML Outstanding Paper Award and is generously supported by the Microsoft and DeepMind Fellowships, the Meta AI Mentorship Program, and the NYU CDS Fellowship. Prior to joining NYU, Sanae obtained a Master's degree in applied mathematics from Polytechnique Montreal, where she worked on designing stochastic first- and second-order algorithms with compelling theoretical and empirical properties for large-scale optimization.

The AI Seminar is sponsored by SambaNova Systems.

In Person and Zoom Participation. See announcement.

Event Website:
http://www.cs.cmu.edu/~aiseminar/

