Computer Science Thesis Oral

April 3, 2024 4:00pm — 6:00pm

Location:
In Person and Virtual - ET - Traffic21 Classroom 6501 and Zoom

Speaker:
GIULIO ZHE ZHOU , Ph.D. Candidate, Computer Science Department, Carnegie Mellon University
http://giuliozhou.com/

Building reliable and transparent machine learning systems using structured intermediate representations

Machine learning (ML) is increasingly used to drive complex applications such as web-scale search, content recommendation, autonomous vehicles, and language-based digital assistants. In recent years, these systems have become predominantly data-driven, often underpinned by deep learning models that learn complex functions end-to-end from large amounts of available data. But their purely data-driven nature also makes the learned solutions opaque, sample inefficient, and brittle.

To improve reliability, production solutions often take the form of ML systems that leverage the strengths of deep learning models while handling auxiliary functions such as planning, validation, decision logic, and policy compliance using other components of the system. However, because these methods are often applied post-hoc on fully trained, blackbox deep learning models, their ability to improve system reliability and transparency is limited.

In this thesis, we study how to build more reliable and transparent ML systems using ML models with structured intermediate representations (StructIRs). Compared to non-structured representations such as neural network activations, StructIRs are directly obtained by optimizing a well-defined objective and are structurally constrained (e.g., to normalized embeddings or compilable code) while remaining sufficiently expressive for downstream tasks. They can thus make the resulting ML system more reliable and transparent by increasing modularity and making modeling assumptions explicit.

We explore the role of StructIRs in three different ML systems. In our first work, we use simple probability distributions parameterized by neural networks to build an effective ML-driven datacenter storage policy. In our second work, we show that grounding text generation in a well-structured vector embedding space enables effective transformation of high-level text attributes such as tense and sentiment with simple, interpretable vector arithmetic. In our final work, we conduct human subject studies showing that the stationarity assumptions behind bandit-based recommender systems do not hold in practice, demonstrating the importance of validating the assumptions and structures underlying ML systems.

Thesis Committee:

David G. Andersen (Chair)
J. Zico Kolter
Zachary Lipton
Byron Wallace (Northeastern University)

in Person and Zoom Participation. See announcement.

For More Information:
In Person and Virtual - ET

Add event to Google
Add event to iCal

About Main page

Admissions Main page

Academics Main page

People Main page

Research Main page

Computer Science Thesis Oral

April 3, 2024 4:00pm — 6:00pm