Artificial Intelligence Seminar

— 1:00pm

Location:
In Person and Virtual - ET - ASA Conference Room, Gates Hillman 6115 and Zoom

Speaker:
NICHOLAS ROBERTS, Ph.D. Student, Department of Computer Science, University of Wisconsin-Madison
https://nick11roberts.science/

Geometry-Aware Adaptation for Pretrained Models

Machine learning models—including prominent zero-shot models—are often trained on datasets whose labels are only a small proportion of a larger label space. Such spaces are commonly equipped with a metric that relates the labels via distances between them. We propose a simple approach to exploit this information to adapt the trained model to reliably predict new classes—or, in the case of zero-shot prediction, to improve its performance—without any additional training.

Our technique is a drop-in replacement for the standard prediction rule, swapping arg max with the Fréchet mean. We provide a comprehensive theoretical analysis of this approach, studying (i) learning-theoretic results trading off label space diameter, sample complexity, and model dimension, (ii) characterizations of the full range of scenarios in which it is possible to predict any unobserved class, and (iii) an active learning-like next-class selection procedure that obtains optimal training classes when it is not possible to predict the entire range of unobserved classes.
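
As an illustration of the prediction rule described above, here is a minimal sketch, not the speaker's implementation. It assumes the model outputs probabilities over the observed classes only and that a distance matrix over the full label space is available. The function name, argument names, and the toy line-graph metric are all hypothetical; the squared-distance cost follows the usual definition of the Fréchet mean.

```python
import numpy as np

def frechet_mean_predict(probs, dist, observed):
    """Pick the label minimizing the probability-weighted sum of squared distances.

    probs:    (k,) model probabilities over the k observed classes (assumed input)
    dist:     (n, n) distance matrix over the full label space (assumed input)
    observed: length-k list of label indices the model was trained on
    """
    probs = np.asarray(probs, dtype=float)
    dist = np.asarray(dist, dtype=float)
    # cost[y] = sum_i probs[i] * dist[y, observed[i]] ** 2
    cost = (dist[:, observed] ** 2) @ probs
    return int(np.argmin(cost))

# Toy label space (hypothetical): five labels on a line, d(i, j) = |i - j|.
labels = np.arange(5)
dist = np.abs(labels[:, None] - labels[None, :])

# The model has only seen classes 0, 2 and 4.
observed = [0, 2, 4]
probs = [0.55, 0.45, 0.0]

argmax_label = observed[int(np.argmax(probs))]          # restricted to observed classes
frechet_label = frechet_mean_predict(probs, dist, observed)  # may be an unobserved class
```

In this toy case the arg max rule can only return an observed class, while the Fréchet mean rule can land on label 1, which lies between the two most probable observed classes and was never seen in training.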

Empirically, using easily available external metrics, our proposed approach, LOKI, achieves up to a 29.7% relative improvement over SimCLR on ImageNet and scales to hundreds of thousands of classes. When no such metric is available, LOKI can use self-derived metrics from class embeddings and obtains a 10.5% improvement on pre-trained zero-shot models such as CLIP.
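
To make the idea of a self-derived metric concrete, the sketch below builds a distance matrix from class embeddings. The announcement does not say how the metric is derived, so the choice here (Euclidean distance between unit-normalized embeddings) and the function name are assumptions for illustration only.

```python
import numpy as np

def embedding_distance_matrix(class_embeddings):
    """Pairwise distances between class embeddings (hypothetical helper).

    class_embeddings: (n, d) array with one nonzero embedding vector per class.
    Returns an (n, n) symmetric matrix with zeros on the diagonal.
    """
    emb = np.asarray(class_embeddings, dtype=float)
    # Normalize each embedding to unit length so only direction matters.
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    # Euclidean distance between every pair of normalized embeddings.
    diff = emb[:, None, :] - emb[None, :, :]
    return np.linalg.norm(diff, axis=-1)

# Toy embeddings (made up for the example, not taken from any real model).
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
D = embedding_distance_matrix(toy_embeddings)
```

A matrix like this could then stand in for an external metric when none is available, for example by using it as the distance matrix in a Fréchet mean prediction rule.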

Nicholas Roberts is a third-year Ph.D. student at the University of Wisconsin-Madison, advised by Frederic Sala. This past summer, he completed an internship with the Physics of AGI research group at Microsoft Research, led by Sebastien Bubeck, working on large language models. Previously, he completed his M.S. in the Machine Learning Department at CMU, working with Ameet Talwalkar and Zack Lipton.

Nicholas’ research is motivated by the need to democratize machine learning and foundation models to handle the long tail of emerging ML tasks in the sciences and beyond. In order to use these models to solve high-impact problems in the sciences, his work aims to solve two main challenges: (1) determine what additional data to provide them and understand how it interacts with pretraining data, and (2) automate the process of adapting them to new problems. To address these challenges, he is focused on the intersection of data-centric ML (which aims to solve 1) and automated machine learning (AutoML) (which aims to solve 2), or more concisely data-centric AutoML. As a result of these motivating challenges, his work on developing the foundations of data-centric AutoML has a focus on diverse ML tasks that are far afield from standard ML domains. These often include problems related to solving PDEs, protein folding, climate modeling, and beyond.

The AI Seminar is sponsored by SambaNova Systems.

In Person and Zoom Participation. See announcement.

Event Website:
http://www.cs.cmu.edu/~aiseminar/

