Artificial Intelligence Seminar - Xiangxiang Xu

— 1:00pm

Location:
In Person and Virtual - ET - ASA Conference Room, Gates Hillman 6115 and Zoom

Speaker:
XIANGXIANG XU, Postdoctoral Associate, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology
https://xiangxiangxu.mit.edu/


Dependence Induced Representation Learning

Despite the vast progress in deep learning practice, theoretical understanding of learned feature representations remains limited. In this talk, we discuss three fundamental questions from a unified statistical perspective:

  1.  What representations carry useful information?
  2.  How are representations learned from distinct algorithms related?
  3.  Can we separate representation learning from solving specific tasks?

In particular, we formalize representations that extract statistical dependence from data, termed dependence-induced representations. We prove that representations are dependence-induced if and only if they can be learned from specific features defined by Hirschfeld–Gebelein–Rényi (HGR) maximal correlation. This separation theorem highlights the central role of HGR features in representation learning and enables a modular design of learning algorithms. We further present algorithms for learning HGR features and demonstrate how their mathematical structure allows them to simultaneously achieve several design objectives, including minimal sufficiency (Tishby's information bottleneck), information maximization, enforcing uncorrelated features (VICReg), and encoding information at different granularities (Matryoshka representation learning). We then show that, based on HGR features, we can recover the representations learned by a range of existing practices, including cross-entropy or hinge loss minimization, non-negative feature learning, neural density ratio estimation, and their regularized variants. Our development also provides a statistical interpretation of the neural collapse phenomenon observed in deep classifiers. We conclude the talk by discussing implications of our analyses, including hyperparameter tuning during inference.
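For readers unfamiliar with HGR maximal correlation, the sketch below (not part of the talk) illustrates the concept in the simplest setting: for discrete X and Y with a known joint pmf, the maximal correlation is the second-largest singular value of the matrix B[x, y] = P(x, y) / sqrt(P(x) P(y)), and the corresponding singular vectors give the maximally correlated feature pair. The function name `hgr_maximal_correlation` and the example distribution are illustrative only; the algorithms discussed in the talk learn such features with neural networks from samples rather than from a known pmf.

import numpy as np

def hgr_maximal_correlation(p_xy: np.ndarray):
    """Return the HGR maximal correlation and optimal features for a discrete joint pmf."""
    p_x = p_xy.sum(axis=1)  # marginal of X
    p_y = p_xy.sum(axis=0)  # marginal of Y
    # Canonical dependence matrix; its largest singular value is always 1,
    # attained by the constant features, so the maximal correlation is the
    # second singular value.
    b = p_xy / np.sqrt(np.outer(p_x, p_y))
    u, s, vt = np.linalg.svd(b)
    rho = s[1]                    # HGR maximal correlation
    f_x = u[:, 1] / np.sqrt(p_x)  # optimal feature of X (zero mean, unit variance under p_x)
    g_y = vt[1, :] / np.sqrt(p_y) # optimal feature of Y
    return rho, f_x, g_y

# Example (hypothetical): a doubly symmetric binary source with crossover 0.2.
p = np.array([[0.4, 0.1],
              [0.1, 0.4]])
rho, f, g = hgr_maximal_correlation(p)
print(rho)  # 0.6 for this joint pmf

For high-dimensional or continuous data, the SVD above is not directly computable, which is one motivation for the sample-based, neural-network approaches to learning HGR features discussed in the abstract.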

— 

Xiangxiang Xu received the B.Eng. and Ph.D. degrees in electronic engineering from Tsinghua University, Beijing, China, in 2014 and 2020, respectively. He is a postdoctoral associate in the Department of EECS at MIT. His research focuses on information theory, statistical learning, representation learning, and their applications in understanding and developing learning algorithms. He is a recipient of the 2016 IEEE PES Student Prize Paper Award in Honor of T. Burke Hayes and the 2024 ITA (Information Theory and Applications) Workshop Sand Award. 

In Person and Zoom Participation.  See announcement.

Event Website:
http://www.cs.cmu.edu/~aiseminar/

