Computer Science Speaking Skills Talk

— 2:00pm

Location:
Virtual Presentation - ET - Remote Access - Zoom

Speaker:
ANDERS ØLAND, Ph.D. Student, Computer Science Department, Carnegie Mellon University
http://andersoland.com/

Deep Layer-wise Learning

We introduce a simple method for layer-wise training of deep neural networks that matches the results achieved with regular full-network backprop on a number of standard computer vision benchmarks. Our method significantly reduces the memory footprint during training (by 70% in most of our experiments), making it well suited to settings with limited memory, e.g., embedded systems and mobile or edge devices.
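For readers unfamiliar with the general idea, the following is a minimal sketch of greedy layer-wise training in PyTorch: each block is optimized against a temporary auxiliary head and then frozen, so only one block's activations and gradients need to be held in memory at a time. The block sizes, auxiliary head, data, and hyperparameters are illustrative assumptions, not the specific method presented in this talk.

```python
import torch
import torch.nn as nn

# Illustrative greedy layer-wise training: each block is trained with a small
# auxiliary classifier head, then frozen; only one block plus its head is ever
# in the optimizer, which is what keeps the training-time memory footprint low.
# All sizes, data, and hyperparameters below are placeholder assumptions.

torch.manual_seed(0)
X = torch.randn(512, 784)            # stand-in for flattened images
y = torch.randint(0, 10, (512,))     # stand-in for class labels

blocks = [nn.Sequential(nn.Linear(784, 256), nn.ReLU()),
          nn.Sequential(nn.Linear(256, 256), nn.ReLU()),
          nn.Sequential(nn.Linear(256, 256), nn.ReLU())]

features = X
for i, block in enumerate(blocks):
    head = nn.Linear(256, 10)        # temporary auxiliary classifier for this block
    opt = torch.optim.Adam(list(block.parameters()) + list(head.parameters()), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for epoch in range(5):           # short inner loop per block
        opt.zero_grad()
        loss = loss_fn(head(block(features)), y)
        loss.backward()
        opt.step()
    print(f"block {i}: final aux loss {loss.item():.3f}")
    with torch.no_grad():            # freeze: the next block trains on
        features = block(features)   # detached features only
```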

As a secondary contribution, we show that partition-wise training enables both larger batch sizes and better model-parallelization schemes. When training deep nets on the ImageNet dataset, we achieved speedups of 30-55% (279% in one severely memory-constrained instance). Additionally, we show that training examples become increasingly redundant as training progresses, so that a significant portion of the data can be omitted.
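The abstract does not say how redundant examples are identified; purely as a hypothetical illustration, the sketch below stops revisiting examples once their loss falls below a fixed threshold. The selection rule, threshold, model, and data are all assumptions, not the criterion used in the work being presented.

```python
import torch
import torch.nn as nn

# Hypothetical illustration of exploiting growing redundancy: examples the
# model already handles well (loss below a threshold) are dropped from the
# active training set, shrinking the amount of data visited per epoch.

torch.manual_seed(0)
X = torch.randn(1024, 20)
y = (X.sum(dim=1) > 0).long()

model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss(reduction="none")   # per-example losses

active = torch.arange(len(X))                     # indices still worth training on
for epoch in range(20):
    opt.zero_grad()
    losses = loss_fn(model(X[active]), y[active])
    losses.mean().backward()
    opt.step()
    keep = losses.detach() > 0.05                 # drop examples that are "learned"
    active = active[keep]
    if len(active) == 0:
        break
    print(f"epoch {epoch}: {len(active)} examples remain")
```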

Furthermore, we observe and analyze the effect of implicit interlayer regularization, i.e., that depth itself acts as a regularizer. This phenomenon poses a challenge for layer-wise training on certain benchmarks, since the networks we train are very shallow. However, we find that interlayer regularization can be efficiently simulated in a few simple steps. Our work thus also has theoretical implications, adding to the understanding of how and what deep neural networks learn.

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement. 


