Computer Science Speaking Skills Talk

— 12:00pm

In Person - Traffic21 Classroom, Gates Hillman 6121

RUNTIAN ZHAI, Ph.D. Student, Computer Science Department, Carnegie Mellon University

On the Generalization of Representation Learning

Representation learning with big models is at the center of modern machine learning, yet classical theory cannot explain why big models are so successful. In this talk, we will focus on self-supervised learning based on data augmentation, such as contrastive learning and masked language modeling. Previous papers have tried to derive generalization bounds for these methods, but their bounds still contain the model complexity, rendering them vacuous. In contrast, we derive bounds that are completely independent of the model factor. We show that the earlier bounds depended on the model because they used the wrong prior function class. When we use the right prior, a class called the induced RKHS that depends solely on the chosen augmentation, we can derive model-free bounds that hold for an arbitrary encoder, that is, with any architecture and any size.

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement
