VASC Seminar - Niv Cohen

February 24, 2025 3:30pm–4:30pm

Location:
In Person - Newell-Simon 3305

Speaker:
NIV COHEN, Postdoctoral Researcher, Computer Science and Engineering, Tandon School of Engineering, New York University
https://nivc.github.io/

Discovering and Erasing Undesired Concepts

The rapid growth of generative models allows an ever-increasing variety of capabilities. Yet, these models may also produce undesired content such as unsafe or misleading images, private information, or copyrighted material.

In this talk, I will discuss practical methods to prevent undesired generations. First, I will show how the challenge of avoiding undesired generations manifested itself in a simple Capture-the-Flag LLM setting, where even our top defense strategy was breached. Next, I will demonstrate a similar vulnerability in state-of-the-art concept erasure methods for Text-to-Image models. Finally, I will distinguish between erasure through Guidance-Based Avoidance and Destruction-Based Removal methods. I will discuss the trade-offs of each approach and their behavior in various settings.

—

Niv Cohen is a postdoctoral researcher at New York University hosted by Prof. Chinmay Hegde. He received a BSc in mathematics with physics as part of the Technion Excellence Program. He received his PhD in computer science from the Hebrew University of Jerusalem, advised by Prof. Yedid Hoshen. Niv was awarded the Israeli data science scholarship for outstanding postdoctoral fellows (VATAT). He is interested in anomaly detection, representation learning, and AI safety for Vision & Language models.

Event Website:
https://www.ri.cmu.edu/event/discovering-and-erasing-undesired-concepts/
