Thesis Oral Defense - Shiva Kaul

— 4:00pm

Location:
In Person - Mauldin Auditorium, Newell-Simon 1305

Speaker:
SHIVA KAUL, Ph.D. Candidate, Computer Science Department, Carnegie Mellon University
https://www.cs.cmu.edu/~skkaul/


Classical Improvements to Modern Machine Learning

The following two dilemmas of modern foundation models concern statistical accuracy and computational efficiency, respectively:

  1. Can language models be trusted to rigorously answer important scientific questions? (Specifically, causal questions from evidence-based medicine that are currently answered through meta-analysis.)
  2. Can Transformers and RNNs be replaced by faster state-space models (which are linear across time / sequence length) without sacrificing expressive power?

I present solutions to both. For (1), I adapt conformal prediction to meta-analysis, which may be thought of as a regression from treatment and population features (e.g. "800mg of amiodarone for atrial fibrillation patients") to treatment effect (e.g. "60% chance of reversion to normal heart rhythm"). By using conformal prediction to safely incorporate untrusted data (i.e. observational studies and other background information), this complex regression problem can be satisfactorily addressed even with a small number of randomized controlled trials. The main technical challenges are computationally simplifying full conformal prediction (which is necessary due to the small number of trials) and handling noisy observations (due to the limited number of participants in each trial). 
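
As a rough illustration of the full conformal prediction mentioned above, the sketch below applies the textbook procedure to a one-feature least-squares regression. It is not the method developed in the thesis: the function names, the linear model, the candidate grid, and the toy numbers are all assumptions made for illustration, and the sketch ignores untrusted data and noisy observations entirely.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b


def full_conformal_set(xs, ys, x_new, candidates, alpha=0.1):
    """Return the candidate outcomes kept by full conformal prediction."""
    kept = []
    for y in candidates:
        # Refit on the data augmented with the hypothesised pair (x_new, y).
        aug_x = list(xs) + [x_new]
        aug_y = list(ys) + [y]
        a, b = fit_line(aug_x, aug_y)
        # Nonconformity score: absolute residual under the refitted model.
        scores = [abs(yi - (a + b * xi)) for xi, yi in zip(aug_x, aug_y)]
        # p-value: fraction of points at least as nonconforming as the new one.
        p = sum(s >= scores[-1] for s in scores) / len(scores)
        if p > alpha:
            kept.append(y)
    return kept


# Toy data (made up): ten "trials" with a roughly linear feature/effect relation.
toy_x = [float(i) for i in range(10)]
toy_noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.0, 0.1]
toy_y = [2.0 * x + e for x, e in zip(toy_x, toy_noise)]
grid = [0.5 * i for i in range(-20, 61)]  # candidate effects from -10 to 30
kept = full_conformal_set(toy_x, toy_y, 5.0, grid, alpha=0.1)
print(min(kept), max(kept))
```

The model is refit once per candidate value, which is the computational burden the abstract refers to; split conformal prediction avoids the refits but holds out data, which is costly when only a few trials are available.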

For (2), I present a general scheme by which nonlinearity across time can be replaced by nonlinearity along depth. This involves stacking linear systems with interposed local corrections. This scheme is fast and constructive, involves no additional parameters, provably converges even in the worst case, and empirically converges quickly. It can be used to practically develop fast sequence models and to theoretically understand the power of depth.
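
To make the idea of trading nonlinearity across time for nonlinearity along depth concrete, here is a minimal sketch using a standard fixed-point iteration on a scalar tanh RNN. This is an assumed stand-in, not the scheme presented in the thesis: each layer applies a linear operation across time (here just a one-step shift) followed by a pointwise nonlinearity, and the names `rnn_sequential`, `depth_layer`, and `rnn_by_depth` are hypothetical.

```python
import math


def rnn_sequential(xs, w, u):
    """Reference nonlinear RNN: h_t = tanh(w*h_{t-1} + u*x_t), with h_0 = 0."""
    h, out = 0.0, []
    for x in xs:
        h = math.tanh(w * h + u * x)
        out.append(h)
    return out


def depth_layer(prev, xs, w, u):
    """One layer: shift the previous layer's states by one step (linear
    across time), then apply a pointwise nonlinearity at each time step."""
    shifted = [0.0] + prev[:-1]
    return [math.tanh(w * s + u * x) for s, x in zip(shifted, xs)]


def rnn_by_depth(xs, w, u, depth):
    """Stack `depth` layers, starting from all-zero states."""
    states = [0.0] * len(xs)
    for _ in range(depth):
        states = depth_layer(states, xs, w, u)
    return states


xs = [0.5, -1.0, 0.25, 0.8, -0.3]
exact = rnn_sequential(xs, w=0.9, u=0.7)
stacked = rnn_by_depth(xs, w=0.9, u=0.7, depth=len(xs))
print(max(abs(a - b) for a, b in zip(exact, stacked)))
```

In this toy iteration, layer k reproduces the first k states of the sequential RNN exactly, so a depth equal to the sequence length always suffices. That is only the worst-case bound for this simple iteration; the abstract's claim of fast empirical convergence concerns the thesis's own scheme.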

Both of these solutions exemplify a broader thesis of developing syntheses between classical machine learning techniques (such as meta-analytic averaging or linear dynamical systems) and modern approaches (such as deep nonlinear regression or Transformers). The high-level goal is to combine the safety and tractability of classical approaches with the accuracy of modern ones through a close (and sometimes surprising) examination of their technical relationship. 

Thesis Committee: 

Geoffrey Gordon (Chair)
Zachary Lipton
Aditi Raghunathan
Ryan Tibshirani (University of California, Berkeley)

Event Website:
https://csd.cmu.edu/calendar/thesis-oral-defense-shiva-kaul

