Computer Science Thesis Proposal

Wednesday, September 26, 2018 - 11:00am to 12:00pm


Reddy Conference Room 4405 Gates Hillman Centers



Differentiable Optimization-Based Inference for Machine Learning

Speaker: Brandon Amos

Location: GHC 4405

This thesis presents machine learning models, paradigms, and primitive operations that involve using optimization as part of the inference procedure. We show why these techniques provide useful modeling tools that subsume many well-known standard operations. We then discuss and propose solutions to challenges that arise when doing inference and learning in these models.

The first portion describes the input-convex neural network (ICNN) architecture, which helps make inference and learning in deep energy-based models and structured prediction more tractable. These are scalar-valued (potentially deep) neural networks with constraints on the network parameters such that the output of the network is a convex function of (some of) the inputs. The networks allow for efficient inference via optimization over some inputs to the network given others, and can be applied to settings including structured prediction, data imputation, and reinforcement learning. We lay the basic groundwork for these models, propose methods for inference, optimization, and learning, and analyze their representational power. We show that many existing neural network architectures can be made input-convex with a minor modification, and develop specialized optimization algorithms tailored to this setting.
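As an illustration of the parameter constraint described above, the following is a minimal sketch of a fully input-convex network in plain Python. It assumes one common construction: weights acting on hidden activations are kept non-negative, the activation (ReLU) is convex and non-decreasing, and direct "passthrough" connections from the input are unconstrained. The class name TinyICNN, the layer sizes, and the helper midpoint_gap are hypothetical and chosen only for this example; the sketch does not cover learning or the specialized inference algorithms.

```python
import random

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def relu(v):
    """Elementwise ReLU, a convex and non-decreasing activation."""
    return [max(0.0, x) for x in v]

def rand_matrix(rows, cols, rng, nonneg=False):
    """Gaussian random matrix; entries are made non-negative if requested."""
    M = [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]
    return [[abs(x) for x in row] for row in M] if nonneg else M

class TinyICNN:
    """Scalar-valued network f(y) that is convex in its input y.

    z1 = relu(A0 y + b0)
    z2 = relu(W1 z1 + A1 y + b1)   with W1 >= 0 elementwise
    f  = W2 z2 + A2 y + b2         with W2 >= 0 elementwise
    """

    def __init__(self, dim, hidden, seed=0):
        rng = random.Random(seed)
        # Passthrough weights from the input: unconstrained.
        self.A0 = rand_matrix(hidden, dim, rng)
        self.A1 = rand_matrix(hidden, dim, rng)
        self.A2 = rand_matrix(1, dim, rng)
        # Weights on hidden activations: constrained to be non-negative.
        self.W1 = rand_matrix(hidden, hidden, rng, nonneg=True)
        self.W2 = rand_matrix(1, hidden, rng, nonneg=True)
        self.b0 = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
        self.b1 = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
        self.b2 = rng.gauss(0.0, 1.0)

    def __call__(self, y):
        z1 = relu([a + b for a, b in zip(matvec(self.A0, y), self.b0)])
        pre2 = [w + a + b for w, a, b in
                zip(matvec(self.W1, z1), matvec(self.A1, y), self.b1)]
        z2 = relu(pre2)
        return matvec(self.W2, z2)[0] + matvec(self.A2, y)[0] + self.b2

def midpoint_gap(f, a, b):
    """Return (f(a) + f(b)) / 2 - f((a + b) / 2); non-negative if f is convex."""
    mid = [(x + y) / 2.0 for x, y in zip(a, b)]
    return 0.5 * (f(a) + f(b)) - f(mid)

net = TinyICNN(dim=3, hidden=4)
gap = midpoint_gap(net, [1.0, -2.0, 0.5], [-1.5, 0.3, 2.0])
```

Convexity follows because each hidden unit is a non-negative combination of convex functions plus an affine term, passed through a convex non-decreasing activation. Because the output is convex in y, inference by minimizing over y is a convex problem.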

The next portion describes OptNet, a network architecture that integrates optimization problems as individual layers in larger end-to-end trainable deep networks. These layers encode constraints and complex dependencies between the hidden states that traditional convolutional and fully-connected layers often cannot capture. We show how to differentiate exactly through these layers and develop a highly efficient solver that exploits fast GPU-based operations within a primal-dual interior point method and provides backpropagation gradients at virtually no additional cost on top of the solve.
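To make the idea of differentiating through an optimization layer concrete, here is a small sketch under simplifying assumptions: it handles only an equality-constrained quadratic program, minimize (1/2) z'Qz + p'z subject to Az = b, so the solution comes from a single linear KKT system rather than an interior point method, and it runs in plain Python rather than on a GPU. The function names (solve, kkt_matrix, qp_forward, qp_backward) and the example data are hypothetical. The backward pass applies the implicit function theorem to the KKT conditions and reuses the same KKT matrix as the forward pass, which loosely mirrors why gradients add little cost once the problem has been solved.

```python
def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(row) + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def kkt_matrix(Q, A):
    """Build the symmetric KKT matrix [[Q, A^T], [A, 0]]."""
    n, m = len(Q), len(A)
    top = [list(Q[i]) + [A[j][i] for j in range(m)] for i in range(n)]
    bottom = [list(A[j]) + [0.0] * m for j in range(m)]
    return top + bottom

def qp_forward(Q, p, A, b):
    """Return z* solving: min 0.5 z'Qz + p'z  subject to  Az = b."""
    n = len(Q)
    sol = solve(kkt_matrix(Q, A), [-pi for pi in p] + list(b))
    return sol[:n]  # drop the dual variables

def qp_backward(Q, A, grad_z):
    """Given dL/dz*, return (dL/dp, dL/db) by implicit differentiation.

    One extra linear solve against the same KKT matrix gives both gradients.
    """
    n, m = len(Q), len(A)
    d = solve(kkt_matrix(Q, A), list(grad_z) + [0.0] * m)
    grad_p = [-di for di in d[:n]]
    grad_b = d[n:]
    return grad_p, grad_b

# Example data (hypothetical): two variables, one equality constraint.
Q = [[2.0, 0.0], [0.0, 1.0]]
p = [1.0, -1.0]
A = [[1.0, 1.0]]
b = [1.0]

z_star = qp_forward(Q, p, A, b)
# Downstream loss L(z) = 0.5 * ||z||^2, so dL/dz* = z*.
grad_p, grad_b = qp_backward(Q, A, z_star)
```

A practical implementation would factor the KKT matrix once and reuse the factorization in the backward pass; this sketch rebuilds and re-solves it for brevity. Inequality constraints, which OptNet layers also support, are outside the scope of this example.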

We propose to continue studying these optimization-based models in two research directions and one engineering direction:

  1. We will study the use of optimization and deep energy-based methods in combinatorial and discrete output spaces.
  2. We will study the use of control as a differentiable policy class that can be used with reinforcement learning.
  3. We will make a cvxpy PyTorch layer to enable quick prototyping of the optimization-based layers this thesis studies.

Thesis Committee:
J. Zico Kolter (Chair)
Barnabas Poczos
Jeff Schneider
Nando de Freitas (DeepMind)
Vladlen Koltun (Intel Labs)
