Wednesday, December 4, 2024 - 4:00pm
Event Calendar Category
Other LIDS Events
Speaker Name
Vinith Suriyakumar
Affiliation
LIDS
Building and Room Number
32-D650
Building and Room Number
LIDS Lounge
“Unstable Unlearning: The Hidden Risk of Concept Resurgence in Diffusion Models”
Text-to-image diffusion models rely on massive, web-scale datasets. Training them from scratch is computationally expensive, and as a result, developers often prefer to make incremental updates to existing models. These updates often compose fine-tuning steps (to learn new concepts or improve model performance) with "unlearning" steps (to "forget" existing concepts, such as copyrighted works or explicit content). In this work, we demonstrate a critical and previously unknown vulnerability that arises in this paradigm: even under benign, non-adversarial conditions, fine-tuning a text-to-image diffusion model on seemingly unrelated images can cause it to "relearn" concepts that were previously "unlearned." Our findings underscore the fragility of composing incremental model updates, and raise serious new concerns about current approaches to ensuring the safety and alignment of text-to-image diffusion models. This is joint work with Rohan Alur, Ayush Sekhari, Manish Raghavan, and Ashia Wilson.
Vinith Suriyakumar is a fourth-year PhD student advised by Dr. Ashia Wilson and Dr. Marzyeh Ghassemi. His research focuses on building algorithms and using rigorous empirical studies to address and understand the security, privacy, and safety concerns around the use of generative models. Currently, he is working on issues surrounding copyright, unlearning, encrypted training/inference, and backdoors. His previous research has been published at NeurIPS, ICML, ICLR, and FAccT and has won multiple oral awards and a Best Paper Award.
About LIDS and Stats Tea Talks
Tea talks are informal 20-minute talks for sharing ideas and creating awareness about topics that could be of interest to the LIDS community.
Each session is followed by light refreshments. Email lids_stats_teas[at]mit[dot]edu for information about LIDS & Stats Tea Talks.
Kind regards,
LIDS & Stats Tea Talks Committee
Maison Clouatre, Subham Saha, Ashkan Soleymani, Jia Wan