SCS Faculty Candidate - Mengzhou Xia
April 10, 2025, 10:00am — 12:00pm ET
Location: In Person and Virtual - Newell-Simon 4305 and Zoom
Speaker: Mengzhou Xia, Ph.D. Student, Department of Computer Science, Princeton University
https://xiamengzhou.github.io/

Advancing the Pareto Frontier of Open Language Models

Large language models (LLMs) have reshaped AI by enabling breakthroughs in language understanding, reasoning, and diverse applications. However, their massive computational demands and the proprietary nature of leading models hinder broad accessibility and customization. My work addresses these challenges by optimizing the use of existing compute, data, and models to push the Pareto frontier for LLM training. In doing so, it not only produces stronger language models but also offers universal approaches that support effective customization and advance our scientific understanding of model training.

First, I will discuss how structured pruning can be leveraged to pre-train compact, high-performing models at a fraction of the usual pre-training cost, demonstrating its effectiveness in pushing the Pareto frontier for general-purpose pre-training. Next, I will turn to the post-training phase to explore the critical role of data in shaping model behavior, presenting principled data optimization techniques that enhance models' capabilities, safety, and transparency, showing that "less is more" when it comes to constructing effective training datasets. Finally, I will introduce novel post-training approaches that more effectively align language models with desired behaviors and objectives. By revealing gaps in the reasoning abilities of even proprietary models, I outline future directions for building AI systems with enhanced reasoning capabilities, focusing on broader data synthesis through agentic processes and enabling advanced applications.

Mengzhou Xia is a final-year PhD student at Princeton University, advised by Prof. Danqi Chen.
She develops efficient training methods for compact, capable, and open language models by optimizing the use of compute, data, and existing models, enabling easy and effective customization. Her open-source models are widely used in the community. Mengzhou is a recipient of the 2024 Apple Scholars in AI/ML PhD Fellowship and the 2022 Bloomberg Data Science PhD Fellowship. She was also named a 2024 EECS Rising Star at MIT.

Faculty Hosts: Max Simchowitz, Chris Donahue
Joint Machine Learning Department and Computer Science Department

In Person and Zoom Participation. See announcement.
Attendance at this talk is restricted to members of the SCS community and relevant CMU stakeholders.