Artificial Intelligence Seminar - Robin Jia

Time:
1:00pm ET

Location:
In Person and Virtual: ASA Conference Room, Gates Hillman 6115, and Zoom

Speaker:
ROBIN JIA, Assistant Professor, Thomas Lord Department of Computer Science, University of Southern California
https://robinjia.github.io/


Auditing, understanding, and leveraging large language models

The rise of large language models offers opportunities both to study these complex systems scientifically and to apply them in novel ways. In this talk, I will describe my group’s recent work along these lines. First, I will discuss data watermarks, a statistically rigorous technique for auditing a language model’s training data using only black-box model queries. Then, I will investigate how language models memorize training data: drawing on results from two complementary benchmarks, I will demonstrate the viability of localizing memorized data to a sparse subset of neurons. Next, I will provide a mechanistic account of how pre-trained language models use Fourier features to solve arithmetic problems, and of the critical role pre-training plays in these mechanisms. Finally, I will show how to leverage the complementary strengths of large language models and symbolic solvers to handle complex planning tasks.

— 

Robin Jia is an Assistant Professor of Computer Science at the University of Southern California. He received his Ph.D. in Computer Science from Stanford University, where he was advised by Percy Liang. He has also spent time as a visiting researcher at Facebook AI Research, working with Luke Zettlemoyer and Douwe Kiela. He is interested broadly in natural language processing and machine learning, with a focus on scientifically understanding NLP models in order to improve their reliability. Robin’s work has received best paper awards at ACL and EMNLP. 

In Person and Zoom Participation. See announcement.