Computer Science Thesis Oral

— 4:00pm

Location:
In Person - Traffic21 Classroom, Gates Hillman 6501

Speaker:
SHILPA ANNA GEORGE, Ph.D. Candidate, Computer Science Department, Carnegie Mellon University
https://a4anna.github.io/

Low-Bandwidth Remote Sensing of Rare Events

Remote sensing enables knowledge discovery from live data collected by unmanned probes. Planetary exploration, drone surveillance, and underwater sensing are three examples of domains in which remote sensing plays a central role. Near real-time knowledge acquisition of a rare target during such missions is challenging due to three extremes: low bandwidth, novelty of the target, and class imbalance. We call learning under these extreme conditions Live Learning. This is a new capability at the intersection of edge computing and machine learning.

Live Learning aims to learn a model for a rare target from unlabeled data captured on distributed probes that are reachable only over a low-bandwidth network. The main contribution of this thesis is the design, implementation, and evaluation of Hawk, an interactive, model-agnostic live learning system that enables discovery of rare novel phenomena from a stream of extremely skewed, unlabeled visual data captured on weakly-connected remote sensing probes. Hawk is designed to optimize the use of two critical resources: (a) the network bandwidth from the remote source to the human expert, and (b) the expert’s labeling bandwidth. Live Learning embodies a new semi-supervised learning algorithm that trains models on-the-fly to discover instances of a target from very few initial labeled examples. We show the effectiveness of Hawk through extensive validation on three very demanding publicly-available datasets from the domains mentioned above. Each of these datasets was released within the past few years and has been used in recent ML research publications in its domain.

Our experiments show that even at bandwidths as low as 12 kbps and a base rate of 0.1%, a team of 7 probes is able to use Hawk to discover up to 87% of the event instances that could have been discovered using a brute-force model. Such a model is created from advance knowledge, transmission, and labeling of all mission data. Our results show a 1.5X to 2X improvement in recall when Live Learning in Hawk is combined with recent Few Shot Learning algorithms such as SnaTCHer. Our results also show how the use of Diversity Sampling can further improve recall in Hawk.

Thesis Committee

  • Mahadev Satyanarayanan (Chair)
  • Deva Ramanan
  • Ameet Talwalkar
  • Padmanabhan Pillai (Intel)

Additional Information

In Person and Zoom Participation. See announcement.

