SCS Faculty Candidate
April 18, 2024, 10:00am - 12:00pm
Location: In Person and Virtual - ET - Newell-Simon 4305 and Zoom
Speaker: LIANMIN ZHENG, Ph.D. Student, Department of Electrical Engineering and Computer Sciences, University of California, Berkeley
https://lmzheng.net/

Scalable and Efficient Systems for Large Language Models

Large Language Models (LLMs) have been driving recent breakthroughs in AI. These advancements would not have been possible without the support of scalable and efficient infrastructure systems. In this talk, I will introduce several systems I have designed and built to support the entire model lifecycle, from training to deployment to evaluation. First, I will present Alpa, a system for scalable model-parallel training that automatically generates execution plans unifying inter- and intra-operator parallelism. Next, I will discuss SGLang, an efficient deployment system covering both the frontend programming interface and backend runtime optimizations for high-performance inference. Finally, I will complete the model lifecycle by presenting our model evaluation efforts, including the crowdsourced live benchmark platform, Chatbot Arena, and the automatic evaluation pipeline, LLM-as-a-Judge. Together, these projects have laid a solid foundation for large language model systems and have been widely adopted by leading LLM developers and companies. I will conclude by outlining some future directions, such as a programmatic and composable software stack for using LLMs and further improvements with synthetic data.

Lianmin Zheng is a Ph.D. student in the EECS department at UC Berkeley, advised by Ion Stoica and Joseph E. Gonzalez. His research interests include machine learning systems, large language models, compilers, and distributed systems. He builds full-stack, scalable, and efficient systems to advance the development of AI.
He co-founded LMSYS.org, where he leads impactful open-source large language model projects such as Vicuna and Chatbot Arena, which have received millions of downloads and served millions of users. He has received a Meta Ph.D. Fellowship, an IEEE Micro Best Paper Award, and an a16z open-source AI grant.

Faculty Host: Tianqi Chen
In Person and Zoom Participation. See announcement.