Computer Science Speaking Skills Talk

— 1:30pm

ANANYA JOSHI, Ph.D. Student, Computer Science Department, Carnegie Mellon University

Real-Time Point Outlier Detection in Epidemiological Streams

Irregularities in epidemiological data streams are common and adversely impact forecasting applications, public trust in data, and public health decision-making. In response, the Delphi Research Group at Carnegie Mellon University, which curates and publishes dozens of indicator streams for various illnesses at the US county level, has emphasized detecting and communicating interesting data irregularities quickly. This setting is challenging for popular off-the-shelf real-time outlier or anomaly detection methods because public health data is often noisy, non-stationary, and subject to revision, and sometimes has limited history, as in the case of COVID-19 indicator streams.

FlaSH, part of our data quality alerting system, addresses these challenges to produce a tailored, ranked daily list of data flags from over 1 million recent data points. In this talk, I introduce nuances of public health data that make outlier detection difficult, explain how they influence the design of FlaSH, and step through our resulting point outlier detection method for flagging “interesting” points. We compare this method to standard real-time outlier detection methods and show examples of important irregularities that the FlaSH system flagged for human review.

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement.
