Daniel B. Neill
Detection of Spatial and Spatio-Temporal Clusters
Degree Type: Ph.D. in Computer Science
Advisor(s): Andrew Moore
Graduated: August 2006

Abstract:
This thesis develops a general and powerful statistical framework for the automatic detection of spatial and space-time clusters. Our "generalized spatial scan" framework is a flexible, model-based framework for accurate and computationally efficient cluster detection in diverse application domains. Through the development of the "fast spatial scan" algorithm and new Bayesian cluster detection methods, we can now detect clusters hundreds or thousands of times faster than previous approaches. More timely detection of emerging clusters (with high detection power and low false positive rates) was made possible by the development of "expectation-based" scan statistics, which learn baseline models from past data and then detect regions that are anomalous given these expectations.

These cluster detection methods were applied to two real-world problem domains: the early detection of emerging disease epidemics, and the detection of clusters of activity in fMRI brain imaging data. One major contribution of this work is the development of the SSS system for nationwide disease surveillance, currently used in daily practice by several state and local health departments. This system receives data (including emergency department records and medication sales) from over 20,000 stores and hospitals nationwide, automatically detects emerging clusters of disease, and reports these results to public health officials. Through retrospective case studies and semi-synthetic testing, we have shown that our system can detect outbreaks significantly faster than previous disease surveillance methods.
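As a rough illustration of the "expectation-based" idea described in the abstract, the sketch below learns per-cell baselines from past counts and then scores every rectangular region of a small grid by how far its current count exceeds its expected count. It is a minimal sketch under stated assumptions, not the thesis's implementation. The baseline model (a plain mean over past days), the function names, and the brute-force search over rectangles are all illustrative choices. The score used is one commonly cited expectation-based Poisson form, C log(C/B) + B - C when the count C exceeds the baseline B. The exhaustive search is exactly the cost that a faster scan algorithm would aim to avoid, and significance testing is omitted.

```python
import math
from itertools import product


def ebp_score(count, baseline):
    """Expectation-based Poisson log-likelihood ratio (assumed form).

    Returns 0 unless the observed count exceeds the expected baseline.
    """
    if baseline <= 0 or count <= baseline:
        return 0.0
    return count * math.log(count / baseline) + baseline - count


def baselines_from_history(history):
    """Per-cell expected count as the mean over past days.

    `history` is a list of 2-D grids (lists of lists). This simple mean is a
    hypothetical stand-in for a learned baseline model.
    """
    days = len(history)
    rows, cols = len(history[0]), len(history[0][0])
    return [
        [sum(day[r][c] for day in history) / days for c in range(cols)]
        for r in range(rows)
    ]


def best_rectangle(counts, baselines):
    """Brute-force scan: score every axis-aligned rectangle of the grid.

    Returns (score, (r0, r1, c0, c1)) for the highest-scoring region, with
    inclusive row and column bounds, or (0.0, None) if nothing is anomalous.
    """
    rows, cols = len(counts), len(counts[0])
    best = (0.0, None)
    for r0, c0 in product(range(rows), range(cols)):
        for r1, c1 in product(range(r0, rows), range(c0, cols)):
            cells = [
                (r, c)
                for r in range(r0, r1 + 1)
                for c in range(c0, c1 + 1)
            ]
            region_count = sum(counts[r][c] for r, c in cells)
            region_baseline = sum(baselines[r][c] for r, c in cells)
            score = ebp_score(region_count, region_baseline)
            if score > best[0]:
                best = (score, (r0, r1, c0, c1))
    return best
```

To use it, pass a list of past daily grids to `baselines_from_history`, then call `best_rectangle` with the current day's grid and the resulting baselines. A region is reported only if its count exceeds its expectation.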
Thesis Committee:
Andrew Moore (Chair)
Tom Mitchell
Jeff Schneider
Gregory Cooper (University of Pittsburgh)
Andrew Lawson (University of South Carolina)

Jeannette Wing, Head, Computer Science Department
Randy Bryant, Dean, School of Computer Science

Keywords: Cluster detection, data mining, algorithms, biosurveillance, fMRI

CMU-CS-06-142.pdf (2.62 MB) (158 pages)