Special CyLab Seminar - Mihai Christodorescu

April 10, 2025, 1:00pm - 2:00pm
Location: In Person - Newell-Simon 3305

Speaker: MIHAI CHRISTODORESCU, Research Scientist, Google
https://www.linkedin.com/in/mihaichristodorescu

Can We Use Language Models for Security Reviews Yet?

Large Language Models (LLMs) have shown success in text generation and, increasingly, in code generation and related tasks. We investigate applying LLMs to code comprehension, specifically for security and privacy code reviews. While existing models trained on clean code repositories (like GitHub and Stack Overflow) perform well at summarizing and explaining standard code, enabling AI assistance in code reviews, they falter when faced with the unusual or adversarial code typical in security analysis. Using counterfactual testing, we evaluated LLM understanding of programming concepts and found significant gaps, particularly concerning data flow and control flow. To address this, we developed a framework that automatically generates a synthetic dataset of mutated code. This reward-guided approach systematically creates diverse and realistically obfuscated samples. We demonstrate that incorporating this synthetic data into training significantly improves code ML model robustness and performance on obfuscated code, paving the way for more reliable AI tools in security-critical domains.

Dr. Mihai Christodorescu is a Research Scientist at Google, where he focuses on software security and privacy, especially for the mobile domain. His research interests are in fundamental approaches to computer security and privacy problems by combining methods from multiple domains, from programming languages, to machine learning, behavioral modeling, and formal methods. Most recently, he focused on translating progress in user authentication to software service authentication and on designing cryptographic techniques to allow users to disclose their personal data in flexible ways. He received his Ph.D. in Computer Sciences from the University of Wisconsin–Madison in 2007. Dr. Christodorescu holds 25 patents and has published more than 35 papers in several international conferences and journals, including the IEEE Symposium on Security and Privacy (S&P), the ACM Conference on Computer and Communications Security (CCS), the USENIX Security Symposium, the Annual Computer Security Applications Conference (ACSAC), and many more.

Faculty Host: Limin Jia
Event Website: https://www.cylab.cmu.edu/events/