Christos Faloutsos
Fredkin University Professor of Computer Science
Website, CMU Scholars Page, ORCiD
Office: 7003 Gates and Hillman Centers
Email: christos@andrew.cmu.edu
Phone: (412) 268-1457
Department: Computer Science Department
Administrative Support Person: Oliver Moss
Research Interests: Artificial Intelligence, Machine Learning, Systems, Databases, Distributed Systems
Advisees: Jeremy Lee, Catalina Vajiac, Saranya Vijayakumar
CSD Courses Taught: 15826 - Fall, 2024

There are two main focus areas: graph mining and stream mining. In the first, the goal is to find patterns in large graphs, so that we can spot anomalies, communities, and regularities. Graphs appear in many settings: as document-term bipartite graphs in information retrieval, as web pages or blogs linking to each other, as customer-product recommendations, as protein-protein regulatory networks, as computer-network traffic, and many more. Our emphasis is on scalability, so that we can handle graphs with thousands and millions of nodes. Research directions include time-evolving graphs, where we have been using 'tensors' to find patterns, as well as graphs where the nodes and/or the edges have attributes.

The second research area focuses on streams, which are semi-infinite numerical time series. This setting also has numerous applications, like sensor data monitoring, motion capture data, automatic alerts in the 'self-*' PetaByte storage system, chlorine-level monitoring in drinking water, and several more. The emphasis is on developing algorithms that inspect every measurement only once and then discard it, since we cannot afford to store the huge volume of historical data.

The common threads in both areas are power laws and the existence of self-similarity. Real graphs have skewed, Zipf-like degree distributions, and consist of communities-within-communities. Similarly, real sensor measurements are often bursty, but still self-similar, with bursts within bursts.
We use or develop tools that exploit exactly these power laws and self-similarity, to find better patterns and anomalies than standard tools would find.

Keywords: Database Management Systems, Data Mining, Graphs, Social Networks, Network Security.

Publications
Preprint 4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on Relational DBs 2024 Wang M, Gan Q, Wipf D, Cai Z, Li N, Tang J, Zhang Y, Zhang Z, Mao Z, Song Y, Wang Y, Li J, Zhang H, Yang G, Qin X, Lei C, Zhang M, Zhang W, Faloutsos C, Zhang Z
Conference A Flexible Forecasting Stack 2024 • Proceedings of the VLDB Endowment • 17(12):3883-3892 Januschowski T, Wang Y, Gasthaus J, Rangapuram S, Türkmen C, Zschiegner J, Stella L, Bohlke-Schneider M, Maddix D, Benidis K, Alexandrov A, Faloutsos C, Schelter S
Conference DATALORE: Can a Large Language Model Find All Lost Scrolls in a Data Repository? 2024 • Proceedings - International Conference on Data Engineering • 00:5170-5176 Lou Y, Lei C, Qin X, Wang Z, Faloutsos C, Anubhai R, Rangwala H
Chapter DIFFFIND: Discovering Differential Equations from Time Series 2024 • Lecture Notes in Computer Science • 14650:175-187 Posam L, Shekhar S, Lee M-C, Faloutsos C
Preprint EBV: Electronic Bee-Veterinarian for Principled Mining and Forecasting of Honeybee Time Series 2024 Hossain MS, Faloutsos C, Baer B, Kim H, Tsotras VJ