Search Results for author: Stefano Ermon

Found 299 papers, 181 papers with code

Bridging the Gap Between f-GANs and Wasserstein GANs

1 code implementation ICML 2020 Jiaming Song, Stefano Ermon

Generative adversarial network (GAN) variants approximately minimize divergences between the model and the data distribution using a discriminator.

Density Ratio Estimation Image Generation +1

Exploring Diffusion Transformer Designs via Grafting

1 code implementation 5 Jun 2025 Keshigeyan Chandrasegaran, Michael Poli, Daniel Y. Fu, Dongjun Kim, Lea M. Hadzic, Manling Li, Agrim Gupta, Stefano Massaroli, Azalia Mirhoseini, Juan Carlos Niebles, Stefano Ermon, Li Fei-Fei

Designing model architectures requires decisions such as selecting operators (e.g., attention, convolution) and configurations (e.g., depth, width).

Reviving Any-Subset Autoregressive Models with Principled Parallel Sampling and Speculative Decoding

1 code implementation 29 Apr 2025 Gabe Guo, Stefano Ermon

In arbitrary-order language models, it is an open question how to sample tokens in parallel from the correct joint distribution.

Code Generation Density Estimation +3

Sharpe Ratio-Guided Active Learning for Preference Optimization in RLHF

no code implementations 28 Mar 2025 Syrine Belakaria, Joshua Kazdan, Charles Marx, Chris Cundy, Willie Neiswanger, Sanmi Koyejo, Barbara E. Engelhardt, Stefano Ermon

In this work, we propose an active learning approach to efficiently select prompt and preference pairs using a risk assessment strategy based on the Sharpe Ratio.

Active Learning

Preference-Guided Diffusion for Multi-Objective Offline Optimization

no code implementations 21 Mar 2025 Yashas Annadani, Syrine Belakaria, Stefano Ermon, Stefan Bauer, Barbara E Engelhardt

Offline multi-objective optimization aims to identify Pareto-optimal solutions given a dataset of designs and their objective values.

Diversity

Inductive Moment Matching

2 code implementations 10 Mar 2025 Linqi Zhou, Stefano Ermon, Jiaming Song

Diffusion models and Flow Matching generate high-quality samples but are slow at inference, and distilling them into few-step models often leads to instability and extensive tuning.

Data Unlearning in Diffusion Models

1 code implementation 2 Mar 2025 Silas Alberti, Kenan Hasanaliyev, Manav Shah, Stefano Ermon

Existing concept unlearning techniques require an anchor prompt/class/distribution to guide unlearning, which is not available in the data unlearning setting.

Machine Unlearning Memorization

FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users

no code implementations 26 Feb 2025 Anikait Singh, Sheryl Hsu, Kyle Hsu, Eric Mitchell, Stefano Ermon, Tatsunori Hashimoto, Archit Sharma, Chelsea Finn

Overall, FSPO achieves an 87% Alpaca Eval winrate on average in generating responses that are personalized to synthetic users and a 72% winrate with real human users in open-ended question answering.

In-Context Learning Meta-Learning +1

Training-Free Safe Denoisers for Safe Use of Diffusion Models

no code implementations 11 Feb 2025 Mingyu Kim, Dongjun Kim, Amman Yusuf, Stefano Ermon, Mi Jung Park

There is growing concern over the safety of powerful diffusion models (DMs), as they are often misused to produce inappropriate, not-safe-for-work (NSFW) content or generate copyrighted material or data of individuals who wish to be forgotten.

Image Generation Negation +1

TFG-Flow: Training-free Guidance in Multimodal Generative Flow

1 code implementation 24 Jan 2025 Haowei Lin, Shanda Li, Haotian Ye, Yiming Yang, Stefano Ermon, Yitao Liang, Jianzhu Ma

Given an unconditional generative model and a predictor for a target property (e.g., a classifier), the goal of training-free guidance is to generate samples with desirable target properties without additional training.

Drug Design

Humanity's Last Exam

no code implementations24 Jan 2025 Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Chen Bo Calvin Zhang, Mohamed Shaaban, John Ling, Sean Shi, Michael Choi, Anish Agrawal, Arnav Chopra, Adam Khoja, Ryan Kim, Richard Ren, Jason Hausenloy, Oliver Zhang, Mantas Mazeika, Dmitry Dodonov, Tung Nguyen, Jaeho Lee, Daron Anderson, Mikhail Doroshenko, Alun Cennyth Stokes, Mobeen Mahmood, Oleksandr Pokutnyi, Oleg Iskra, Jessica P. Wang, John-Clark Levin, Mstyslav Kazakov, Fiona Feng, Steven Y. Feng, Haoran Zhao, Michael Yu, Varun Gangal, Chelsea Zou, Zihan Wang, Serguei Popov, Robert Gerbicz, Geoff Galgon, Johannes Schmitt, Will Yeadon, Yongki Lee, Scott Sauers, Alvaro Sanchez, Fabian Giska, Marc Roth, Søren Riis, Saiteja Utpala, Noah Burns, Gashaw M. Goshu, Mohinder Maheshbhai Naiya, Chidozie Agu, Zachary Giboney, Antrell Cheatom, Francesco Fournier-Facio, Sarah-Jane Crowson, Lennart Finke, Zerui Cheng, Jennifer Zampese, Ryan G. Hoerr, Mark Nandor, Hyunwoo Park, Tim Gehrunger, Jiaqi Cai, Ben McCarty, Alexis C Garretson, Edwin Taylor, Damien Sileo, Qiuyu Ren, Usman Qazi, Lianghui Li, Jungbae Nam, John B. Wydallis, Pavel Arkhipov, Jack Wei Lun Shi, Aras Bacho, Chris G. Willcocks, Hangrui Cao, Sumeet Motwani, Emily de Oliveira Santos, Johannes Veith, Edward Vendrow, Doru Cojoc, Kengo Zenitani, Longke Tang, Yuqi Li, Joshua Vendrow, Natanael Wildner Fraga, Vladyslav Kuchkin, Andrey Pupasov Maksimov, Pierre Marion, Denis Efremov, Jayson Lynch, Kaiqu Liang, Aleksandar Mikov, Andrew Gritsevskiy, Julien Guillod, Gözdenur Demir, Dakotah Martinez, Ben Pageler, Kevin Zhou, Saeed Soori, Ori Press, Henry Tang, Paolo Rissone, Sean R. 
Green, Lina Brüssel, Moon Twayana, Aymeric Dieuleveut, Joseph Marvin Imperial, Ameya Prabhu, Jinzhou Yang, Nick Crispino, Arun Rao, Dimitri Zvonkine, Gabriel Loiseau, Mikhail Kalinin, Marco Lukas, Ciprian Manolescu, Nate Stambaugh, Subrata Mishra, Tad Hogg, Carlo Bosio, Brian P Coppola, Julian Salazar, Jaehyeok Jin, Rafael Sayous, Stefan Ivanov, Philippe Schwaller, Shaipranesh Senthilkuma, Andres M Bran, Andres Algaba, Kelsey Van den Houte, Lynn Van Der Sypt, Brecht Verbeken, David Noever, Alexei Kopylov, Benjamin Myklebust, Bikun Li, Lisa Schut, Evgenii Zheltonozhskii, Qiaochu Yuan, Derek Lim, Richard Stanley, Tong Yang, John Maar, Julian Wykowski, Martí Oller, Anmol Sahu, Cesare Giulio Ardito, Yuzheng Hu, Ariel Ghislain Kemogne Kamdoum, Alvin Jin, Tobias Garcia Vilchis, Yuexuan Zu, Martin Lackner, James Koppel, Gongbo Sun, Daniil S. Antonenko, Steffi Chern, Bingchen Zhao, Pierrot Arsene, Joseph M Cavanagh, Daofeng Li, Jiawei Shen, Donato Crisostomi, Wenjin Zhang, Ali Dehghan, Sergey Ivanov, David Perrella, Nurdin Kaparov, Allen Zang, Ilia Sucholutsky, Arina Kharlamova, Daniil Orel, Vladislav Poritski, Shalev Ben-David, Zachary Berger, Parker Whitfill, Michael Foster, Daniel Munro, Linh Ho, Shankar Sivarajan, Dan Bar Hava, Aleksey Kuchkin, David Holmes, Alexandra Rodriguez-Romero, Frank Sommerhage, Anji Zhang, Richard Moat, Keith Schneider, Zakayo Kazibwe, Don Clarke, Dae Hyun Kim, Felipe Meneguitti Dias, Sara Fish, Veit Elser, Tobias Kreiman, Victor Efren Guadarrama Vilchis, Immo Klose, Ujjwala Anantheswaran, Adam Zweiger, Kaivalya Rawal, Jeffery Li, Jeremy Nguyen, Nicolas Daans, Haline Heidinger, Maksim Radionov, Václav Rozhoň, Vincent Ginis, Christian Stump, Niv Cohen, Rafał Poświata, Josef Tkadlec, Alan Goldfarb, Chenguang Wang, Piotr Padlewski, Stanislaw Barzowski, Kyle Montgomery, Ryan Stendall, Jamie Tucker-Foltz, Jack Stade, T. 
Ryan Rogers, Tom Goertzen, Declan Grabb, Abhishek Shukla, Alan Givré, John Arnold Ambay, Archan Sen, Muhammad Fayez Aziz, Mark H Inlow, Hao He, Ling Zhang, Younesse Kaddar, Ivar Ängquist, Yanxu Chen, Harrison K Wang, Kalyan Ramakrishnan, Elliott Thornley, Antonio Terpin, Hailey Schoelkopf, Eric Zheng, Avishy Carmi, Ethan D. L. Brown, Kelin Zhu, Max Bartolo, Richard Wheeler, Martin Stehberger, Peter Bradshaw, JP Heimonen, Kaustubh Sridhar, Ido Akov, Jennifer Sandlin, Yury Makarychev, Joanna Tam, Hieu Hoang, David M. Cunningham, Vladimir Goryachev, Demosthenes Patramanis, Michael Krause, Andrew Redenti, David Aldous, Jesyin Lai, Shannon Coleman, Jiangnan Xu, Sangwon Lee, Ilias Magoulas, Sandy Zhao, Ning Tang, Michael K. Cohen, Orr Paradise, Jan Hendrik Kirchner, Maksym Ovchynnikov, Jason O. Matos, Adithya Shenoy, Michael Wang, Yuzhou Nie, Anna Sztyber-Betley, Paolo Faraboschi, Robin Riblet, Jonathan Crozier, Shiv Halasyamani, Shreyas Verma, Prashant Joshi, Eli Meril, Ziqiao Ma, Jérémy Andréoletti, Raghav Singhal, Jacob Platnick, Volodymyr Nevirkovets, Luke Basler, Alexander Ivanov, Seri Khoury, Nils Gustafsson, Marco Piccardo, Hamid Mostaghimi, Qijia Chen, Virendra Singh, Tran Quoc Khánh, Paul Rosu, Hannah Szlyk, Zachary Brown, Himanshu Narayan, Aline Menezes, Jonathan Roberts, William Alley, Kunyang Sun, Arkil Patel, Max Lamparth, Anka Reuel, Linwei Xin, Hanmeng Xu, Jacob Loader, Freddie Martin, Zixuan Wang, Andrea Achilleos, Thomas Preu, Tomek Korbak, Ida Bosio, Fereshteh Kazemi, Ziye Chen, Biró Bálint, Eve J. Y. Lo, Jiaqi Wang, Maria Inês S. Nunes, Jeremiah Milbauer, M Saiful Bari, ZiHao Wang, Behzad Ansarinejad, Yewen Sun, Stephane Durand, Hossam Elgnainy, Guillaume Douville, Daniel Tordera, George Balabanian, Hew Wolff, Lynna Kvistad, Hsiaoyun Milliron, Ahmad Sakor, Murat Eron, Andrew Favre D. O., Shailesh Shah, Xiaoxiang Zhou, Firuz Kamalov, Sherwin Abdoli, Tim Santens, Shaul Barkan, Allison Tee, Robin Zhang, Alessandro Tomasiello, G. 
Bruno De Luca, Shi-Zhuo Looi, Vinh-Kha Le, Noam Kolt, Jiayi Pan, Emma Rodman, Jacob Drori, Carl J Fossum, Niklas Muennighoff, Milind Jagota, Ronak Pradeep, Honglu Fan, Jonathan Eicher, Michael Chen, Kushal Thaman, William Merrill, Moritz Firsching, Carter Harris, Stefan Ciobâcă, Jason Gross, Rohan Pandey, Ilya Gusev, Adam Jones, Shashank Agnihotri, Pavel Zhelnov, Mohammadreza Mofayezi, Alexander Piperski, David K. Zhang, Kostiantyn Dobarskyi, Roman Leventov, Ignat Soroko, Joshua Duersch, Vage Taamazyan, Andrew Ho, Wenjie Ma, William Held, Ruicheng Xian, Armel Randy Zebaze, Mohanad Mohamed, Julian Noah Leser, Michelle X Yuan, Laila Yacar, Johannes Lengler, Katarzyna Olszewska, Claudio Di Fratta, Edson Oliveira, Joseph W. Jackson, Andy Zou, Muthu Chidambaram, Timothy Manik, Hector Haffenden, Dashiell Stander, Ali Dasouqi, Alexander Shen, Bita Golshani, David Stap, Egor Kretov, Mikalai Uzhou, Alina Borisovna Zhidkovskaya, Nick Winter, Miguel Orbegozo Rodriguez, Robert Lauff, Dustin Wehr, Colin Tang, Zaki Hossain, Shaun Phillips, Fortuna Samuele, Fredrik Ekström, Angela Hammon, Oam Patel, Faraz Farhidi, George Medley, Forough Mohammadzadeh, Madellene Peñaflor, Haile Kassahun, Alena Friedrich, Rayner Hernandez Perez, Daniel Pyda, Taom Sakal, Omkar Dhamane, Ali Khajegili Mirabadi, Eric Hallman, Kenchi Okutsu, Mike Battaglia, Mohammad Maghsoudimehrabani, Alon Amit, Dave Hulbert, Roberto Pereira, Simon Weber, Handoko, Anton Peristyy, Stephen Malina, Mustafa Mehkary, Rami Aly, Frank Reidegeld, Anna-Katharina Dick, Cary Friday, Mukhwinder Singh, Hassan Shapourian, Wanyoung Kim, Mariana Costa, Hubeyb Gurdogan, Harsh Kumar, Chiara Ceconello, Chao Zhuang, Haon Park, Micah Carroll, Andrew R. Tawfeek, Stefan Steinerberger, Daattavya Aggarwal, Michael Kirchhof, Linjie Dai, Evan Kim, Johan Ferret, Jainam Shah, Yuzhou Wang, Minghao Yan, Krzysztof Burdzy, Lixin Zhang, Antonio Franca, Diana T. 
Pham, Kang Yong Loh, Joshua Robinson, Abram Jackson, Paolo Giordano, Philipp Petersen, Adrian Cosma, Jesus Colino, Colin White, Jacob Votava, Vladimir Vinnikov, Ethan Delaney, Petr Spelda, Vit Stritecky, Syed M. Shahid, Jean-Christophe Mourrat, Lavr Vetoshkin, Koen Sponselee, Renas Bacho, Zheng-Xin Yong, Florencia de la Rosa, Nathan Cho, Xiuyu Li, Guillaume Malod, Orion Weller, Guglielmo Albani, Leon Lang, Julien Laurendeau, Dmitry Kazakov, Fatimah Adesanya, Julien Portier, Lawrence Hollom, Victor Souza, Yuchen Anna Zhou, Julien Degorre, Yiğit Yalın, Gbenga Daniel Obikoya, Rai, Filippo Bigi, M. C. Boscá, Oleg Shumar, Kaniuar Bacho, Gabriel Recchia, Mara Popescu, Nikita Shulga, Ngefor Mildred Tanwie, Thomas C. H. Lux, Ben Rank, Colin Ni, Matthew Brooks, Alesia Yakimchyk, Huanxu, Liu, Stefano Cavalleri, Olle Häggström, Emil Verkama, Joshua Newbould, Hans Gundlach, Leonor Brito-Santana, Brian Amaro, Vivek Vajipey, Rynaa Grover, Ting Wang, Yosi Kratish, Wen-Ding Li, Sivakanth Gopi, Andrea Caciolai, Christian Schroeder de Witt, Pablo Hernández-Cámara, Emanuele Rodolà, Jules Robins, Dominic Williamson, Brad Raynor, Hao Qi, Ben Segev, Jingxuan Fan, Sarah Martinson, Erik Y. Wang, Kaylie Hausknecht, Michael P. Brenner, Mao Mao, Christoph Demian, Peyman Kassani, Xinyu Zhang, David Avagian, Eshawn Jessica Scipio, Alon Ragoler, Justin Tan, Blake Sims, Rebeka Plecnik, Aaron Kirtland, Omer Faruk Bodur, D. P. Shinde, Yan Carlos Leyva Labrador, Zahra Adoul, Mohamed Zekry, Ali Karakoc, Tania C. B. Santos, Samir Shamseldeen, Loukmane Karim, Anna Liakhovitskaia, Nate Resman, Nicholas Farina, Juan Carlos Gonzalez, Gabe Maayan, Earth Anderson, Rodrigo De Oliveira Pena, Elizabeth Kelley, Hodjat Mariji, Rasoul Pouriamanesh, Wentao Wu, Ross Finocchio, Ismail Alarab, Joshua Cole, Danyelle Ferreira, Bryan Johnson, Mohammad Safdari, Liangti Dai, Siriphan Arthornthurasuk, Isaac C. 
McAlister, Alejandro José Moyano, Alexey Pronin, Jing Fan, Angel Ramirez-Trinidad, Yana Malysheva, Daphiny Pottmaier, Omid Taheri, Stanley Stepanic, Samuel Perry, Luke Askew, Raúl Adrián Huerta Rodríguez, Ali M. R. Minissi, Ricardo Lorena, Krishnamurthy Iyer, Arshad Anil Fasiludeen, Ronald Clark, Josh Ducey, Matheus Piza, Maja Somrak, Eric Vergo, Juehang Qin, Benjámin Borbás, Eric Chu, Jack Lindsey, Antoine Jallon, I. M. J. McInnis, Evan Chen, Avi Semler, Luk Gloor, Tej Shah, Marc Carauleanu, Pascal Lauer, Tran Đuc Huy, Hossein Shahrtash, Emilien Duc, Lukas Lewark, Assaf Brown, Samuel Albanie, Brian Weber, Warren S. Vaz, Pierre Clavier, Yiyang Fan, Gabriel Poesia Reis e Silva, Long, Lian, Marcus Abramovitch, Xi Jiang, Sandra Mendoza, Murat Islam, Juan Gonzalez, Vasilios Mavroudis, Justin Xu, Pawan Kumar, Laxman Prasad Goswami, Daniel Bugas, Nasser Heydari, Ferenc Jeanplong, Thorben Jansen, Antonella Pinto, Archimedes Apronti, Abdallah Galal, Ng Ze-An, Ankit Singh, Tong Jiang, Joan of Arc Xavier, Kanu Priya Agarwal, Mohammed Berkani, Gang Zhang, Zhehang Du, Benedito Alves de Oliveira Junior, Dmitry Malishev, Nicolas Remy, Taylor D. 
Hartman, Tim Tarver, Stephen Mensah, Gautier Abou Loume, Wiktor Morak, Farzad Habibi, Sarah Hoback, Will Cai, Javier Gimenez, Roselynn Grace Montecillo, Jakub Łucki, Russell Campbell, Asankhaya Sharma, Khalida Meer, Shreen Gul, Daniel Espinosa Gonzalez, Xavier Alapont, Alex Hoover, Gunjan Chhablani, Freddie Vargus, Arunim Agarwal, Yibo Jiang, Deepakkumar Patil, David Outevsky, Kevin Joseph Scaria, Rajat Maheshwari, Abdelkader Dendane, Priti Shukla, Ashley Cartwright, Sergei Bogdanov, Niels Mündler, Sören Möller, Luca Arnaboldi, Kunvar Thaman, Muhammad Rehan Siddiqi, Prajvi Saxena, Himanshu Gupta, Tony Fruhauff, Glen Sherman, Mátyás Vincze, Siranut Usawasutsakorn, Dylan Ler, Anil Radhakrishnan, Innocent Enyekwe, Sk Md Salauddin, Jiang Muzhen, Aleksandr Maksapetyan, Vivien Rossbach, Chris Harjadi, Mohsen Bahaloohoreh, Claire Sparrow, Jasdeep Sidhu, Sam Ali, Song Bian, John Lai, Eric Singer, Justine Leon Uro, Greg Bateman, Mohamed Sayed, Ahmed Menshawy, Darling Duclosel, Dario Bezzi, Yashaswini Jain, Ashley Aaron, Murat Tiryakioglu, Sheeshram Siddh, Keith Krenek, Imad Ali Shah, Jun Jin, Scott Creighton, Denis Peskoff, Zienab EL-Wasif, Ragavendran P V, Michael Richmond, Joseph McGowan, Tejal Patwardhan, Hao-Yu Sun, Ting Sun, Nikola Zubić, Samuele Sala, Stephen Ebert, Jean Kaddour, Manuel Schottdorf, Dianzhuo Wang, Gerol Petruzella, Alex Meiburg, Tilen Medved, Ali ElSheikh, S Ashwin Hebbar, Lorenzo Vaquero, Xianjun Yang, Jason Poulos, Vilém Zouhar, Sergey Bogdanik, Mingfang Zhang, Jorge Sanz-Ros, David Anugraha, Yinwei Dai, Anh N. Nhu, Xue Wang, Ali Anil Demircali, Zhibai Jia, Yuyin Zhou, Juncheng Wu, Mike He, Nitin Chandok, Aarush Sinha, Gaoxiang Luo, Long Le, Mickaël Noyé, Michał Perełkiewicz, Ioannis Pantidis, Tianbo Qi, Soham Sachin Purohit, Letitia Parcalabescu, Thai-Hoa Nguyen, Genta Indra Winata, Edoardo M. Ponti, Hanchen Li, Kaustubh Dhole, Jongee Park, Dario Abbondanza, Yuanli Wang, Anupam Nayak, Diogo M. Caetano, Antonio A. W. L. 
Wong, Maria del Rio-Chanona, Dániel Kondor, Pieter Francois, Ed Chalstrey, Jakob Zsambok, Dan Hoyer, Jenny Reddish, Jakob Hauser, Francisco-Javier Rodrigo-Ginés, Suchandra Datta, Maxwell Shepherd, Thom Kamphuis, Qizheng Zhang, Hyunjun Kim, Ruiji Sun, Jianzhu Yao, Franck Dernoncourt, Satyapriya Krishna, Sina Rismanchian, Bonan Pu, Francesco Pinto, Yingheng Wang, Kumar Shridhar, Kalon J. Overholt, Glib Briia, Hieu Nguyen, David, Soler Bartomeu, Tony CY Pang, Adam Wecker, Yifan Xiong, Fanfei Li, Lukas S. Huber, Joshua Jaeger, Romano De Maddalena, Xing Han Lù, Yuhui Zhang, Claas Beger, Patrick Tser Jern Kon, Sean Li, Vivek Sanker, Ming Yin, Yihao Liang, Xinlu Zhang, Ankit Agrawal, Li S. Yifei, Zechen Zhang, Mu Cai, Yasin Sonmez, Costin Cozianu, Changhao Li, Alex Slen, Shoubin Yu, Hyun Kyu Park, Gabriele Sarti, Marcin Briański, Alessandro Stolfo, Truong An Nguyen, Mike Zhang, Yotam Perlitz, Jose Hernandez-Orallo, Runjia Li, Amin Shabani, Felix Juefei-Xu, Shikhar Dhingra, Orr Zohar, My Chiffon Nguyen, Alexander Pondaven, Abdurrahim Yilmaz, Xuandong Zhao, Chuanyang Jin, Muyan Jiang, Stefan Todoran, Xinyao Han, Jules Kreuer, Brian Rabern, Anna Plassart, Martino Maggetti, Luther Yap, Robert Geirhos, Jonathon Kean, Dingsu Wang, Sina Mollaei, Chenkai Sun, Yifan Yin, Shiqi Wang, Rui Li, Yaowen Chang, Anjiang Wei, Alice Bizeul, Xiaohan Wang, Alexandre Oliveira Arrais, Kushin Mukherjee, Jorge Chamorro-Padial, Jiachen Liu, Xingyu Qu, Junyi Guan, Adam Bouyamourn, Shuyu Wu, Martyna Plomecka, Junda Chen, Mengze Tang, Jiaqi Deng, Shreyas Subramanian, Haocheng Xi, Haoxuan Chen, Weizhi Zhang, Yinuo Ren, Haoqin Tu, SeJong Kim, Yushun Chen, Sara Vera Marjanović, Junwoo Ha, Grzegorz Luczyna, Jeff J. Ma, Zewen Shen, Dawn Song, Cedegao E. Zhang, Zhun Wang, Gaël Gendron, Yunze Xiao, Leo Smucker, Erica Weng, Kwok Hao Lee, Zhe Ye, Stefano Ermon, Ignacio D. Lopez-Miguel, Theo Knights, Anthony Gitter, Namkyu Park, Boyi Wei, Hongzheng Chen, Kunal Pai, Ahmed Elkhanany, Han Lin, Philipp D. 
Siedler, Jichao Fang, Ritwik Mishra, Károly Zsolnai-Fehér, Xilin Jiang, Shadab Khan, Jun Yuan, Rishab Kumar Jain, Xi Lin, Mike Peterson, Zhe Wang, Aditya Malusare, Maosen Tang, Isha Gupta, Ivan Fosin, Timothy Kang, Barbara Dworakowska, Kazuki Matsumoto, Guangyao Zheng, Gerben Sewuster, Jorge Pretel Villanueva, Ivan Rannev, Igor Chernyavsky, Jiale Chen, Deepayan Banik, Ben Racz, Wenchao Dong, Jianxin Wang, Laila Bashmal, Duarte V. Gonçalves, Wei Hu, Kaushik Bar, Ondrej Bohdal, Atharv Singh Patlan, Shehzaad Dhuliawala, Caroline Geirhos, Julien Wist, Yuval Kansal, Bingsen Chen, Kutay Tire, Atak Talay Yücel, Brandon Christof, Veerupaksh Singla, Zijian Song, Sanxing Chen, Jiaxin Ge, Kaustubh Ponkshe, Isaac Park, Tianneng Shi, Martin Q. Ma, Joshua Mak, Sherwin Lai, Antoine Moulin, Zhuo Cheng, Zhanda Zhu, Ziyi Zhang, Vaidehi Patil, Ketan Jha, Qiutong Men, Jiaxuan Wu, Tianchi Zhang, Bruno Hebling Vieira, Alham Fikri Aji, Jae-Won Chung, Mohammed Mahfoud, Ha Thi Hoang, Marc Sperzel, Wei Hao, Kristof Meding, Sihan Xu, Vassilis Kostakos, Davide Manini, Yueying Liu, Christopher Toukmaji, Eunmi Yu, Arif Engin Demircali, Zhiyi Sun, Ivan Dewerpe, Hongsen Qin, Roman Pflugfelder, James Bailey, Johnathan Morris, Ville Heilala, Sybille Rosset, Zishun Yu, Peter E. Chen, Woongyeong Yeo, Eeshaan Jain, Sreekar Chigurupati, Julia Chernyavsky, Sai Prajwal Reddy, Subhashini Venugopalan, Hunar Batra, Core Francisco Park, Hieu Tran, Guilherme Maximiano, Genghan Zhang, Yizhuo Liang, Hu Shiyu, Rongwu Xu, Rui Pan, Siddharth Suresh, Ziqi Liu, Samaksh Gulati, Songyang Zhang, Peter Turchin, Christopher W. Bartlett, Christopher R. Scotese, Phuong M. 
Cao, Aakaash Nattanmai, Gordon McKellips, Anish Cheraku, Asim Suhail, Marvin Deng, Kavin Jindel, Jay Paek, Kasper Halevy, Allen Baranov, Michael Liu, Advaith Avadhanam, David Zhang, Vincent Cheng, Brad Ma, Evan Fu, Liam Do, Joshua Lass, Surya Sunkari, Vishruth Bharath, Violet Ai, James Leung, Rishit Agrawal, Kevin Chen, Tejas Kalpathi, Ziqi Xu, Gavin Wang, Tyler Xiao, Erik Maung, Sam Lee, Ryan Yang, Roy Yue, Ben Zhao, Julia Yoon, Sunny Sun, Aryan Singh, Ethan Luo, Clark Peng, Tyler Osbey, Taozhi Wang, Daryl Echeazu, Timothy Wu, Spandan Patel, Vidhi Kulkarni, Vijaykaarti Sundarapandiyan, Ashley Zhang, Andrew Le, Zafir Nasim, Srikar Yalam, Ritesh Kasamsetty, Soham Samal, Hubert Yang, David Sun, Nihar Shah, Abhijeet Saha, Alex Zhang, Leon Nguyen, Laasya Nagumalli, Kaixin Wang, Alan Zhou, Aidan Wu, Jason Luo, Anwith Telluri, Summer Yue, Alexandr Wang, Dan Hendrycks

However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities.

Humanity's Last Exam Language Modeling +4

Personalized Preference Fine-tuning of Diffusion Models

no code implementations CVPR 2025 Meihua Dang, Anikait Singh, Linqi Zhou, Stefano Ermon, Jiaming Song

With PPD, a diffusion model learns the individual preferences of a population of users in a few-shot way, enabling generalization to unseen users.

Efficient Scaling of Diffusion Transformers for Text-to-Image Generation

no code implementations 16 Dec 2024 Hao Li, Shamit Lal, Zhiheng Li, Yusheng Xie, Ying Wang, Yang Zou, Orchid Majumder, R. Manmatha, Zhuowen Tu, Stefano Ermon, Stefano Soatto, Ashwin Swaminathan

We empirically study the scaling properties of various Diffusion Transformers (DiTs) for text-to-image generation by performing extensive and rigorous ablations, including training scaled DiTs ranging from 0.3B up to 8B parameters on datasets up to 600M images.

Text to Image Generation Text-to-Image Generation

Non-Myopic Multi-Objective Bayesian Optimization

1 code implementation 11 Dec 2024 Syrine Belakaria, Alaleh Ahmadianshalchi, Barbara Engelhardt, Stefano Ermon, Janardhan Rao Doppa

We address this challenge by using hypervolume improvement (HVI) as our scalarization approach, which allows us to use a lower-bound on the Bellman equation to approximate the finite-horizon using a batch expected hypervolume improvement (EHVI) acquisition function (AF) for MOO.

Bayesian Optimization Experimental Design

Self-Refining Diffusion Samplers: Enabling Parallelization via Parareal Iterations

1 code implementation 11 Dec 2024 Nikil Roashan Selvam, Amil Merchant, Stefano Ermon

In diffusion models, samples are generated through an iterative refinement process, requiring hundreds of sequential model evaluations.

Convolutional Differentiable Logic Gate Networks

1 code implementation 7 Nov 2024 Felix Petersen, Hilde Kuehne, Christian Borgelt, Julian Welzel, Stefano Ermon

With the increasing inference cost of machine learning models, there is a growing interest in models with fast and efficient inference.

TrAct: Making First-layer Pre-Activations Trainable

no code implementations 31 Oct 2024 Felix Petersen, Christian Borgelt, Stefano Ermon

Thus, we propose the conceptual procedure of (i) a gradient descent step on first layer activations to construct an activation proposal, and (ii) finding the optimal weights of the first layer, i.e., those weights which minimize the squared distance to the activation proposal.

$f$-PO: Generalizing Preference Optimization with $f$-divergence Minimization

1 code implementation 29 Oct 2024 Jiaqi Han, Mingjian Jiang, Yuxuan Song, Jure Leskovec, Stefano Ermon, Minkai Xu

Preference optimization has made significant progress recently, with numerous methods developed to align language models with human preferences.

Language Modeling Language Modelling

Energy-Based Diffusion Language Models for Text Generation

no code implementations 28 Oct 2024 Minkai Xu, Tomas Geffner, Karsten Kreis, Weili Nie, Yilun Xu, Jure Leskovec, Stefano Ermon, Arash Vahdat

Unfortunately, these models still underperform the autoregressive counterparts, with the performance gap increasing when reducing the number of sampling steps.

Language Modeling Language Modelling +1

TabDiff: a Multi-Modal Diffusion Model for Tabular Data Generation

1 code implementation 27 Oct 2024 Juntong Shi, Minkai Xu, Harper Hua, Hengrui Zhang, Stefano Ermon, Jure Leskovec

In this paper, we introduce TabDiff, a joint diffusion framework that models all multi-modal distributions of tabular data in one model.

Imputation Tabular Data Generation

Newton Losses: Using Curvature Information for Learning with Differentiable Algorithms

1 code implementation 24 Oct 2024 Felix Petersen, Christian Borgelt, Tobias Sutter, Hilde Kuehne, Oliver Deussen, Stefano Ermon

Instead of training the neural network with second-order techniques, we only utilize the loss function's second-order information to replace it by a Newton Loss, while training the network with gradient descent.

Improving Vector-Quantized Image Modeling with Latent Consistency-Matching Diffusion

no code implementations 18 Oct 2024 Bac Nguyen, Chieh-Hsin Lai, Yuhta Takida, Naoki Murata, Toshimitsu Uesaka, Stefano Ermon, Yuki Mitsufuji

By embedding discrete representations into a continuous latent space, we can leverage continuous-space latent diffusion models to handle generative modeling of discrete data.

Conditional Image Generation Machine Translation +1

Geometric Trajectory Diffusion Models

1 code implementation 16 Oct 2024 Jiaqi Han, Minkai Xu, Aaron Lou, Haotian Ye, Stefano Ermon

In this work, we propose geometric trajectory diffusion models (GeoTDM), the first diffusion model for modeling the temporal distribution of 3D geometric trajectories.

Protein Design

Generalizing Stochastic Smoothing for Differentiation and Gradient Estimation

no code implementations 10 Oct 2024 Felix Petersen, Christian Borgelt, Aashwin Mishra, Stefano Ermon

We deal with the problem of gradient estimation for stochastic differentiable relaxations of algorithms, operators, simulators, and other non-differentiable functions.

Pose Estimation

Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis

1 code implementation 9 Oct 2024 Bohan Zeng, Ling Yang, Siyu Li, Jiaming Liu, Zixiang Zhang, Juanxi Tian, Kaixin Zhu, Yongzhen Guo, Fu-Yun Wang, Minkai Xu, Stefano Ermon, Wentao Zhang

Then we propose a geometry-aware 4D transition network to realize a complex scene-level 4D transition based on the plan, which involves expressive geometrical object deformation.

Video Generation

G2D2: Gradient-guided Discrete Diffusion for image inverse problem solving

no code implementations 9 Oct 2024 Naoki Murata, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Bac Nguyen, Stefano Ermon, Yuki Mitsufuji

Recent literature has effectively utilized diffusion models trained on continuous variables as priors for solving inverse problems.

Image Generation Motion Generation

Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation

no code implementations 3 Oct 2024 Rohin Manvi, Anikait Singh, Stefano Ermon

We further demonstrate that 50-75% of samples can be pruned early in generation with minimal degradation in performance.

GSM8K Math

Calibrated Probabilistic Forecasts for Arbitrary Sequences

no code implementations 27 Sep 2024 Charles Marx, Volodymyr Kuleshov, Stefano Ermon

Real-world data streams can change unpredictably due to distribution shifts, feedback loops and adversarial actors, which challenges the validity of forecasts.

Decision Making valid

TFG: Unified Training-Free Guidance for Diffusion Models

1 code implementation 24 Sep 2024 Haotian Ye, Haowei Lin, Jiaqi Han, Minkai Xu, Sheng Liu, Yitao Liang, Jianzhu Ma, James Zou, Stefano Ermon

Given an unconditional diffusion model and a predictor for a target property of interest (e.g., a classifier), the goal of training-free guidance is to generate samples with desirable target properties without additional training.

CPSample: Classifier Protected Sampling for Guarding Training Data During Diffusion

no code implementations 11 Sep 2024 Joshua Kazdan, Hao Sun, Jiaqi Han, Felix Petersen, Stefano Ermon

Diffusion models have a tendency to exactly replicate their training data, especially when trained on small datasets.

Unsupervised Anomaly Detection Using Diffusion Trend Analysis

no code implementations 12 Jul 2024 Eunwoo Kim, Un Yang, Cheol Lae Roh, Stefano Ermon

Conventional anomaly detection techniques based on reconstruction via denoising diffusion models are widely used due to their ability to identify anomaly locations and shapes with high performance.

Denoising Unsupervised Anomaly Detection

Consistency Flow Matching: Defining Straight Flows with Velocity Consistency

1 code implementation 2 Jul 2024 Ling Yang, Zixiang Zhang, Zhilong Zhang, Xingchao Liu, Minkai Xu, Wentao Zhang, Chenlin Meng, Stefano Ermon, Bin Cui

Additionally, we propose a multi-segment training approach for Consistency-FM to enhance expressiveness, achieving a better trade-off between sampling quality and speed.

Image Generation

Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization

1 code implementation 1 Jul 2024 Siyi Gu, Minkai Xu, Alexander Powers, Weili Nie, Tomas Geffner, Karsten Kreis, Jure Leskovec, Arash Vahdat, Stefano Ermon

AliDiff shifts the target-conditioned chemical distribution towards regions with higher binding affinity and structural rationality, specified by user-defined reward functions, via the preference optimization approach.

Avg Drug Design

Changen2: Multi-Temporal Remote Sensing Generative Change Foundation Model

1 code implementation 26 Jun 2024 Zhuo Zheng, Stefano Ermon, Dongjun Kim, Liangpei Zhang, Yanfei Zhong

Changen2 is a generative change foundation model that can be trained at scale via self-supervision, and can produce change supervisory signals from unlabeled single-temporal images.

Change Detection Time Series

TorchSpatial: A Location Encoding Framework and Benchmark for Spatial Representation Learning

2 code implementations 21 Jun 2024 Nemin Wu, Qian Cao, Zhangyu Wang, Zeping Liu, Yanlin Qi, Jielu Zhang, Joshua Ni, Xiaobai Yao, Hongxu Ma, Lan Mu, Stefano Ermon, Tanuja Ganu, Akshay Nambi, Ni Lao, Gengchen Mai

To fill this gap, we propose TorchSpatial, a learning framework and benchmark for location (point) encoding, which is one of the most fundamental data types of spatial representation learning.

Fairness Geographic Question Answering +5

ExPLoRA: Parameter-Efficient Extended Pre-Training to Adapt Vision Transformers under Domain Shifts

no code implementations 16 Jun 2024 Samar Khanna, Medhanie Irgau, David B. Lobell, Stefano Ermon

An under-explored question of PEFT is in extending the pre-training phase without supervised labels; that is, can we adapt a pre-trained foundation model to a new domain via efficient self-supervised pre-training on this new domain?

parameter-efficient fine-tuning Transfer Learning +1

State-Free Inference of State-Space Models: The Transfer Function Approach

1 code implementation 10 May 2024 Rom N. Parnichkun, Stefano Massaroli, Alessandro Moro, Jimmy T. H. Smith, Ramin Hasani, Mathias Lechner, Qi An, Christopher Ré, Hajime Asama, Stefano Ermon, Taiji Suzuki, Atsushi Yamashita, Michael Poli

We approach designing a state-space model for deep learning applications through its dual representation, the transfer function, and uncover a highly efficient sequence parallel inference algorithm that is state-free: unlike other proposed algorithms, state-free inference does not incur any significant memory or computational cost with an increase in state size.

Language Modeling Language Modelling +1

Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data

1 code implementation 22 Apr 2024 Fahim Tajwar, Anikait Singh, Archit Sharma, Rafael Rafailov, Jeff Schneider, Tengyang Xie, Stefano Ermon, Chelsea Finn, Aviral Kumar

Our main finding is that, in general, approaches that use on-policy sampling or attempt to push down the likelihood on certain responses (i. e., employ a "negative gradient") outperform offline and maximum likelihood objectives.

Contrastive Learning Reinforcement Learning (RL)

Disentangling Length from Quality in Direct Preference Optimization

1 code implementation 28 Mar 2024 Ryan Park, Rafael Rafailov, Stefano Ermon, Chelsea Finn

A number of approaches have been developed to control those biases in the classical RLHF literature, but the problem remains relatively under-explored for Direct Alignment Algorithms such as Direct Preference Optimization (DPO).

reinforcement-learning Reinforcement Learning

Mechanistic Design and Scaling of Hybrid Architectures

1 code implementation 26 Mar 2024 Michael Poli, Armin W Thomas, Eric Nguyen, Pragaash Ponnusamy, Björn Deiseroth, Kristian Kersting, Taiji Suzuki, Brian Hie, Stefano Ermon, Christopher Ré, Ce Zhang, Stefano Massaroli

The development of deep learning architectures is a resource-demanding process, due to a vast design space, long prototyping times, and high compute costs associated with at-scale model training and evaluation.

Mamba

Contextualized Diffusion Models for Text-Guided Image and Video Generation

1 code implementation 26 Feb 2024 Ling Yang, Zhilong Zhang, Zhaochen Yu, Jingwei Liu, Minkai Xu, Stefano Ermon, Bin Cui

To address this issue, we propose a novel and general contextualized diffusion model (ContextDiff) by incorporating the cross-modal context encompassing interactions and alignments between text condition and visual sample into forward and reverse processes.

Text to Image Generation Text-to-Image Generation +3

Uncertainty Quantification for Forward and Inverse Problems of PDEs via Latent Global Evolution

2 code implementations 13 Feb 2024 Tailin Wu, Willie Neiswanger, Hongtao Zheng, Stefano Ermon, Jure Leskovec

Deep learning-based surrogate models have demonstrated remarkable advantages over classical solvers in terms of speed, often achieving speedups of 10 to 1000 times over traditional partial differential equation (PDE) solvers.

Decision Making Deep Learning +1

Large Language Models are Geographically Biased

1 code implementation 5 Feb 2024 Rohin Manvi, Samar Khanna, Marshall Burke, David Lobell, Stefano Ermon

Initially, we demonstrate that LLMs are capable of making accurate zero-shot geospatial predictions in the form of ratings that show strong monotonic correlation with ground truth (Spearman's $\rho$ of up to 0.89).

Fairness
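
Spearman's $\rho$, the rank-correlation measure cited in the entry above, can be illustrated with a minimal sketch. The function names `ranks` and `spearman_rho` are hypothetical, the inputs are made up, and the closed-form formula used here is valid only when neither sequence contains ties; it is not the paper's evaluation code.

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[int]:
    """Return 1-based ranks. Assumes all values are distinct (no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1)).

    This shortcut formula holds only for tie-free data (an assumption
    of this sketch).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Hypothetical ratings versus ground truth: any monotonic relationship,
# linear or not, gives a rho of exactly 1.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
truth = [2.0, 4.0, 5.0, 9.0, 20.0]
rho = spearman_rho(predicted, truth)
```

Because only ranks enter the computation, the measure rewards getting the ordering right rather than matching the scale of the ground truth.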

Segment Any Change

1 code implementation 2 Feb 2024 Zhuo Zheng, Yanfei Zhong, Liangpei Zhang, Stefano Ermon

Visual foundation models have achieved remarkable results in zero-shot image classification and segmentation, but zero-shot change detection remains an open problem.

Change Detection image-classification +2

Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs

1 code implementation 22 Jan 2024 Ling Yang, Zhaochen Yu, Chenlin Meng, Minkai Xu, Stefano Ermon, Bin Cui

In this paper, we propose a brand new training-free text-to-image generation/editing framework, namely Recaption, Plan and Generate (RPG), harnessing the powerful chain-of-thought reasoning ability of multimodal LLMs to enhance the compositionality of text-to-image diffusion models.

Diffusion Personalization Tuning Free Large Language Model +1

Equivariant Graph Neural Operator for Modeling 3D Dynamics

1 code implementation 19 Jan 2024 Minkai Xu, Jiaqi Han, Aaron Lou, Jean Kossaifi, Arvind Ramanathan, Kamyar Azizzadenesheli, Jure Leskovec, Stefano Ermon, Anima Anandkumar

Comprehensive experiments in multiple domains, including particle simulations, human motion capture, and molecular dynamics, demonstrate the significantly superior performance of EGNO against existing methods, thanks to the equivariant temporal modeling.

Operator learning

Equivariant Flow Matching with Hybrid Probability Transport

1 code implementation 12 Dec 2023 Yuxuan Song, Jingjing Gong, Minkai Xu, Ziyao Cao, Yanyan Lan, Stefano Ermon, Hao Zhou, Wei-Ying Ma

The generation of 3D molecules requires simultaneously deciding the categorical features (atom types) and continuous features (atom coordinates).

DiffusionSat: A Generative Foundation Model for Satellite Imagery

1 code implementation 6 Dec 2023 Samar Khanna, Patrick Liu, Linqi Zhou, Chenlin Meng, Robin Rombach, Marshall Burke, David Lobell, Stefano Ermon

Our method outperforms previous state-of-the-art methods for satellite image generation and is the first large-scale generative foundation model for satellite imagery.

Crop Yield Prediction Image Generation +1

Manifold Preserving Guided Diffusion

no code implementations 28 Nov 2023 Yutong He, Naoki Murata, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Dongjun Kim, Wei-Hsiang Liao, Yuki Mitsufuji, J. Zico Kolter, Ruslan Salakhutdinov, Stefano Ermon

Despite the recent advancements, conditional image generation still faces challenges of cost, generalizability, and the need for task-specific training.

Conditional Image Generation

DreamPropeller: Supercharge Text-to-3D Generation with Parallel Sampling

1 code implementation CVPR 2024 Linqi Zhou, Andy Shih, Chenlin Meng, Stefano Ermon

Recent methods such as Score Distillation Sampling (SDS) and Variational Score Distillation (VSD) using 2D diffusion models for text-to-3D generation have demonstrated impressive generation quality.

3D Generation Text to 3D

Diffusion Model Alignment Using Direct Preference Optimization

2 code implementations CVPR 2024 Bram Wallace, Meihua Dang, Rafael Rafailov, Linqi Zhou, Aaron Lou, Senthil Purushwalkam, Stefano Ermon, Caiming Xiong, Shafiq Joty, Nikhil Naik

Large language models (LLMs) are fine-tuned using human comparison data with Reinforcement Learning from Human Feedback (RLHF) methods to make them better aligned with users' preferences.

model Text-to-Image Generation

Calibration by Distribution Matching: Trainable Kernel Calibration Metrics

1 code implementation NeurIPS 2023 Charles Marx, Sofian Zalouk, Stefano Ermon

Calibration ensures that probabilistic forecasts meaningfully capture uncertainty by requiring that predicted probabilities align with empirical frequencies.

Decision Making regression

Generative Fractional Diffusion Models

1 code implementation 26 Oct 2023 Gabriel Nobis, Maximilian Springenberg, Marco Aversa, Michael Detzel, Rembert Daems, Roderick Murray-Smith, Shinichi Nakajima, Sebastian Lapuschkin, Stefano Ermon, Tolga Birdal, Manfred Opper, Christoph Knochenhauer, Luis Oala, Wojciech Samek

To ensure tractable inference and learning, we employ a recently popularized Markov approximation of fBM (MA-fBM) and derive its reverse-time model, resulting in generative fractional diffusion models (GFDM).

Diversity

Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution

4 code implementations 25 Oct 2023 Aaron Lou, Chenlin Meng, Stefano Ermon

Experimentally, we test our Score Entropy Discrete Diffusion models (SEDD) on standard language modeling tasks.

Denoising Language Modeling +1

GeoLLM: Extracting Geospatial Knowledge from Large Language Models

1 code implementation 10 Oct 2023 Rohin Manvi, Samar Khanna, Gengchen Mai, Marshall Burke, David Lobell, Stefano Ermon

With GeoLLM, we observe that GPT-3.5 outperforms Llama 2 and RoBERTa by 19% and 51% respectively, suggesting that the performance of our method scales well with the size of the model and its pretraining dataset.

The Role of Linguistic Priors in Measuring Compositional Generalization of Vision-Language Models

no code implementations 4 Oct 2023 Chenwei Wu, Li Erran Li, Stefano Ermon, Patrick Haffner, Rong Ge, Zaiwei Zhang

Compositionality is a common property in many modalities including natural languages and images, but the compositional generalization of multi-modal models is not well-understood.

Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion

2 code implementations 1 Oct 2023 Dongjun Kim, Chieh-Hsin Lai, Wei-Hsiang Liao, Naoki Murata, Yuhta Takida, Toshimitsu Uesaka, Yutong He, Yuki Mitsufuji, Stefano Ermon

Consistency Models (CM) (Song et al., 2023) accelerate score-based diffusion model sampling at the cost of sample quality but lack a natural way to trade-off quality for speed.

Ranked #1 on Image Generation on ImageNet 64x64 (NFE metric)

Denoising Image Generation

SSIF: Learning Continuous Image Representation for Spatial-Spectral Super-Resolution

no code implementations 30 Sep 2023 Gengchen Mai, Ni Lao, Weiwei Sun, Yuchi Ma, Jiaming Song, Chenlin Meng, Hongxu Ma, Jinmeng Rao, Ziyuan Li, Stefano Ermon

Existing digital sensors capture images at fixed spatial and spectral resolutions (e.g., RGB, multispectral, and hyperspectral images), and each combination requires bespoke machine learning models.

Spectral Super-Resolution Super-Resolution

Denoising Diffusion Bridge Models

4 code implementations 29 Sep 2023 Linqi Zhou, Aaron Lou, Samar Khanna, Stefano Ermon

However, for many applications such as image editing, the model input comes from a distribution that is not random noise.

Denoising Image Generation

Sphere2Vec: A General-Purpose Location Representation Learning over a Spherical Surface for Large-Scale Geospatial Predictions

1 code implementation 30 Jun 2023 Gengchen Mai, Yao Xuan, Wenyun Zuo, Yutong He, Jiaming Song, Stefano Ermon, Krzysztof Janowicz, Ni Lao

So when applied to large-scale real-world GPS coordinate datasets, which require distance metric learning on the spherical surface, both types of models can fail due to the map projection distortion problem (2D) and the spherical-to-Euclidean distance approximation error (3D).

image-classification Image Classification +4

HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution

4 code implementations NeurIPS 2023 Eric Nguyen, Michael Poli, Marjan Faizi, Armin Thomas, Callum Birch-Sykes, Michael Wornow, Aman Patel, Clayton Rabideau, Stefano Massaroli, Yoshua Bengio, Stefano Ermon, Stephen A. Baccus, Chris Ré

Leveraging Hyena's new long-range capabilities, we present HyenaDNA, a genomic foundation model pretrained on the human reference genome with context lengths of up to 1 million tokens at the single nucleotide-level - an up to 500x increase over previous dense attention-based models.

4k In-Context Learning +2

SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking

no code implementations 8 Jun 2023 Chris Cundy, Stefano Ermon

This allows us to minimize a variety of divergences between the distribution of sequences generated by an autoregressive model and sequences from a dataset, including divergences with weight on OOD generated sequences.

Imitation Learning Text Generation

GEO-Bench: Toward Foundation Models for Earth Monitoring

1 code implementation NeurIPS 2023 Alexandre Lacoste, Nils Lehmann, Pau Rodriguez, Evan David Sherwin, Hannah Kerner, Björn Lütjens, Jeremy Andrew Irvin, David Dao, Hamed Alemohammad, Alexandre Drouin, Mehmet Gunturkun, Gabriel Huang, David Vazquez, Dava Newman, Yoshua Bengio, Stefano Ermon, Xiao Xiang Zhu

Recent progress in self-supervision has shown that pre-training large neural networks on vast amounts of unsupervised data can lead to substantial increases in generalization to downstream tasks.

On the Equivalence of Consistency-Type Models: Consistency Models, Consistent Diffusion Models, and Fokker-Planck Regularization

no code implementations 1 Jun 2023 Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Naoki Murata, Yuki Mitsufuji, Stefano Ermon

The emergence of various notions of "consistency" in diffusion models has garnered considerable attention and helped achieve improved sample quality, likelihood estimation, and accelerated sampling.

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

29 code implementations NeurIPS 2023 Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn

Existing methods for gaining such steerability collect human labels of the relative quality of model generations and fine-tune the unsupervised LM to align with these preferences, often with reinforcement learning from human feedback (RLHF).

Language Modeling Language Modelling +4

MADiff: Offline Multi-agent Learning with Diffusion Models

1 code implementation 27 May 2023 Zhengbang Zhu, Minghuan Liu, Liyuan Mao, Bingyi Kang, Minkai Xu, Yong Yu, Stefano Ermon, Weinan Zhang

Offline reinforcement learning (RL) aims to learn policies from pre-existing datasets without further interactions, making it a challenging task.

Offline RL Q-Learning +2

Parallel Sampling of Diffusion Models

1 code implementation NeurIPS 2023 Andy Shih, Suneel Belkhale, Stefano Ermon, Dorsa Sadigh, Nima Anari

Instead of reducing the number of denoising steps (trading quality for speed), in this paper we explore an orthogonal approach: can we run the denoising steps in parallel (trading compute for speed)?

Denoising Image Generation

Geometric Latent Diffusion Models for 3D Molecule Generation

2 code implementations 2 May 2023 Minkai Xu, Alexander Powers, Ron Dror, Stefano Ermon, Jure Leskovec

Generative models, especially diffusion models (DMs), have achieved promising results for generating feature-rich geometries and advancing foundational science problems such as molecule design.

3D Molecule Generation Unconditional Molecule Generation +1

CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations

2 code implementations 1 May 2023 Gengchen Mai, Ni Lao, Yutong He, Jiaming Song, Stefano Ermon

To directly leverage the abundant geospatial information associated with images in pre-training, fine-tuning, and inference stages, we present Contrastive Spatial Pre-Training (CSP), a self-supervised learning framework for geo-tagged images.

Contrastive Learning image-classification +2

MUDiff: Unified Diffusion for Complete Molecule Generation

no code implementations 28 Apr 2023 Chenqing Hua, Sitao Luan, Minkai Xu, Rex Ying, Jie Fu, Stefano Ermon, Doina Precup

Our model is a promising approach for designing stable and diverse molecules and can be applied to a wide range of tasks in molecular modeling.

3D geometry Drug Discovery +1

MERMAIDE: Learning to Align Learners using Model-Based Meta-Learning

no code implementations 10 Apr 2023 Arundhati Banerjee, Soham Phade, Stefano Ermon, Stephan Zheng

We then show that our model-based meta-learning approach is cost-effective in intervening on bandit agents with unseen explore-exploit strategies.

Meta-Learning

Reflected Diffusion Models

1 code implementation 10 Apr 2023 Aaron Lou, Stefano Ermon

To incorporate data constraints in a principled manner, we present Reflected Diffusion Models, which instead reverse a reflected stochastic differential equation evolving on the support of the data.

Ranked #8 on Image Generation on ImageNet 32x32 (bpd metric)

Image Generation

Ideal Abstractions for Decision-Focused Learning

no code implementations 29 Mar 2023 Michael Poli, Stefano Massaroli, Stefano Ermon, Bryan Wilder, Eric Horvitz

We present a methodology for formulating simplifying abstractions in machine learning systems by identifying and harnessing the utility structure of decisions.

Decision Making Management

End-to-End Diffusion Latent Optimization Improves Classifier Guidance

1 code implementation ICCV 2023 Bram Wallace, Akash Gokul, Stefano Ermon, Nikhil Naik

Classifier guidance -- using the gradients of an image classifier to steer the generations of a diffusion model -- has the potential to dramatically expand the creative control over image generation and editing.

Denoising Image Generation

GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation

1 code implementation ICCV 2023 Can Qin, Ning Yu, Chen Xing, Shu Zhang, Zeyuan Chen, Stefano Ermon, Yun Fu, Caiming Xiong, Ran Xu

Empirical results show that GlueNet can be trained efficiently and enables various capabilities beyond previous state-of-the-art models: 1) multilingual language models such as XLM-Roberta can be aligned with existing T2I models, allowing for the generation of high-quality images from captions beyond English; 2) GlueNet can align multi-modal encoders such as AudioCLIP with the Stable Diffusion model, enabling sound-to-image generation; 3) it can also upgrade the current text encoder of the latent diffusion model for challenging case generation.

Decoder Image Generation

Offline Imitation Learning with Suboptimal Demonstrations via Relaxed Distribution Matching

no code implementations 5 Mar 2023 Lantao Yu, Tianhe Yu, Jiaming Song, Willie Neiswanger, Stefano Ermon

In this case, a well-known issue is the distribution shift between the learned policy and the behavior policy that collects the offline data.

continuous-control Continuous Control +1

Hyena Hierarchy: Towards Larger Convolutional Language Models

7 code implementations 21 Feb 2023 Michael Poli, Stefano Massaroli, Eric Nguyen, Daniel Y. Fu, Tri Dao, Stephen Baccus, Yoshua Bengio, Stefano Ermon, Christopher Ré

Recent advances in deep learning have relied heavily on the use of large Transformers due to their ability to learn at scale.

2k 8k +3

Long Horizon Temperature Scaling

1 code implementation 7 Feb 2023 Andy Shih, Dorsa Sadigh, Stefano Ermon

LHTS is compatible with all likelihood-based models, and optimizes for the long horizon likelihood of samples.

Multiple-choice

GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration

1 code implementation 30 Jan 2023 Naoki Murata, Koichi Saito, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Yuki Mitsufuji, Stefano Ermon

Pre-trained diffusion models have been successfully used as priors in a variety of linear inverse problems, where the goal is to reconstruct a signal from noisy linear measurements.

Denoising Image Deblurring

Extreme Q-Learning: MaxEnt RL without Entropy

4 code implementations 5 Jan 2023 Divyansh Garg, Joey Hejna, Matthieu Geist, Stefano Ermon

Using EVT, we derive our Extreme Q-Learning framework and consequently online and, for the first time, offline MaxEnt Q-learning algorithms that do not explicitly require access to a policy or its entropy.

D4RL Deep Reinforcement Learning +3

Building Coverage Estimation with Low-resolution Remote Sensing Imagery

no code implementations 4 Jan 2023 Enci Liu, Chenlin Meng, Matthew Kolodner, Eun Jee Sung, Sihang Chen, Marshall Burke, David Lobell, Stefano Ermon

In this paper, we propose a method for estimating building coverage using only publicly available low-resolution satellite imagery that is more frequently updated.

quantile regression

Deep Latent State Space Models for Time-Series Generation

1 code implementation 24 Dec 2022 Linqi Zhou, Michael Poli, Winnie Xu, Stefano Massaroli, Stefano Ermon

Methods based on ordinary differential equations (ODEs) are widely used to build generative models of time-series.

State Space Models Time Series +2

Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models

1 code implementation 3 Nov 2022 Muyang Li, Ji Lin, Chenlin Meng, Stefano Ermon, Song Han, Jun-Yan Zhu

With about $1\%$-area edits, SIGE accelerates DDPM by $3.0\times$ on NVIDIA RTX 3090 and $4.6\times$ on Apple M1 Pro GPU, Stable Diffusion by $7.2\times$ on 3090, and GauGAN by $5.6\times$ on 3090 and $5.2\times$ on M1 Pro GPU.

GPU

Concrete Score Matching: Generalized Score Matching for Discrete Data

no code implementations 2 Nov 2022 Chenlin Meng, Kristy Choi, Jiaming Song, Stefano Ermon

To this end, we propose an analogous score function called the "Concrete score", a generalization of the (Stein) score for discrete settings.

Density Estimation

LMPriors: Pre-Trained Language Models as Task-Specific Priors

no code implementations 22 Oct 2022 Kristy Choi, Chris Cundy, Sanjari Srivastava, Stefano Ermon

Particularly in low-data regimes, an outstanding challenge in machine learning is developing principled techniques for augmenting our models with suitable priors.

Causal Inference Common Sense Reasoning +4

FP-Diffusion: Improving Score-based Diffusion Models by Enforcing the Underlying Score Fokker-Planck Equation

1 code implementation 9 Oct 2022 Chieh-Hsin Lai, Yuhta Takida, Naoki Murata, Toshimitsu Uesaka, Yuki Mitsufuji, Stefano Ermon

Score-based generative models (SGMs) learn a family of noise-conditional score functions corresponding to the data density perturbed with increasingly large amounts of noise.

Denoising

Exploration via Planning for Information about the Optimal Trajectory

2 code implementations 6 Oct 2022 Viraj Mehta, Ian Char, Joseph Abbate, Rory Conlin, Mark D. Boyer, Stefano Ermon, Jeff Schneider, Willie Neiswanger

In this work, we develop a method that allows us to plan for exploration while taking both the task and the current knowledge about the dynamics into account.

Reinforcement Learning (RL)

On Distillation of Guided Diffusion Models

2 code implementations CVPR 2023 Chenlin Meng, Robin Rombach, Ruiqi Gao, Diederik P. Kingma, Stefano Ermon, Jonathan Ho, Tim Salimans

For standard diffusion models trained on the pixel-space, our approach is able to generate images visually comparable to that of the original model using as few as 4 sampling steps on ImageNet 64x64 and CIFAR-10, achieving FID/IS scores comparable to that of the original model while being up to 256 times faster to sample from.

Denoising Image Generation +1

Generalizing Bayesian Optimization with Decision-theoretic Entropies

no code implementations 4 Oct 2022 Willie Neiswanger, Lantao Yu, Shengjia Zhao, Chenlin Meng, Stefano Ermon

Bayesian optimization (BO) is a popular method for efficiently inferring optima of an expensive black-box function via a sequence of queries.

Bayesian Optimization Decision Making +1

Towards General-Purpose Representation Learning of Polygonal Geometries

1 code implementation 29 Sep 2022 Gengchen Mai, Chiyu Jiang, Weiwei Sun, Rui Zhu, Yao Xuan, Ling Cai, Krzysztof Janowicz, Stefano Ermon, Ni Lao

For the spatial domain approach, we propose ResNet1D, a 1D CNN-based polygon encoder, which uses circular padding to achieve loop origin invariance on simple polygons.

Relation Prediction Representation Learning
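
The link between circular padding and loop origin invariance mentioned in the entry above can be illustrated with a toy sketch. It is not the paper's ResNet1D: the helper names `circular_conv1d`, `encode`, and `roll` are hypothetical, the input is a single made-up coordinate channel, and global max pooling is assumed as the final aggregation step.

```python
from typing import List, Sequence


def circular_conv1d(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """1D cross-correlation with wrap-around (circular) padding.

    The output has the same length as the input; indices past either
    end wrap around, so the sequence is treated as a closed loop.
    """
    n, k = len(signal), len(kernel)
    half = k // 2
    return [
        sum(kernel[j] * signal[(i + j - half) % n] for j in range(k))
        for i in range(n)
    ]


def encode(signal: Sequence[float], kernel: Sequence[float]) -> float:
    """Circular convolution followed by global max pooling (assumed here)."""
    return max(circular_conv1d(signal, kernel))


def roll(seq: Sequence[float], shift: int) -> List[float]:
    """Start the closed loop at a different vertex."""
    shift %= len(seq)
    return list(seq[shift:]) + list(seq[:shift])


# One coordinate channel of a hypothetical simple polygon's boundary.
xs = [0.0, 4.0, 5.0, 2.0, -1.0]
kernel = [1.0, -2.0, 1.0]

# Changing the start vertex only rotates the feature map, so the pooled
# value is the same for every choice of loop origin.
codes = [encode(roll(xs, s), kernel) for s in range(len(xs))]
```

With zero padding instead of wrap-around, the features near the two ends of the list would depend on where the loop was cut, and the pooled value could change with the start vertex.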

ButterflyFlow: Building Invertible Layers with Butterfly Matrices

no code implementations 28 Sep 2022 Chenlin Meng, Linqi Zhou, Kristy Choi, Tri Dao, Stefano Ermon

Normalizing flows model complex probability distributions using maps obtained by composing invertible layers.

Density Estimation

Multipoint-BAX: A New Approach for Efficiently Tuning Particle Accelerator Emittance via Virtual Objectives

no code implementations 10 Sep 2022 Sara A. Miskovich, Willie Neiswanger, William Colocho, Claudio Emma, Jacqueline Garrahan, Timothy Maxwell, Christopher Mayes, Stefano Ermon, Auralee Edelen, Daniel Ratner

Traditional black-box optimizers such as Bayesian optimization are slow and inefficient when dealing with such objectives as they must acquire the full series of measurements, but return only the emittance, with each query.

Bayesian Optimization

A General Recipe for Likelihood-free Bayesian Optimization

1 code implementation 27 Jun 2022 Jiaming Song, Lantao Yu, Willie Neiswanger, Stefano Ermon

To extend BO to a broader class of models and utilities, we propose likelihood-free BO (LFBO), an approach based on likelihood-free inference.

Bayesian Optimization

Modular Conformal Calibration

no code implementations 23 Jun 2022 Charles Marx, Shengjia Zhao, Willie Neiswanger, Stefano Ermon

We introduce a versatile class of algorithms for recalibration in regression that we call Modular Conformal Calibration (MCC).

regression

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

6 code implementations 9 Jun 2022 Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ambrose Slone, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew Dai, Andrew La, Andrew Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakaş, B. Ryan Roberts, Bao Sheng Loe, Barret Zoph, Bartłomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, Benjamin Inden, Benno Stein, Berk Ekmekci, Bill Yuchen Lin, Blake Howald, Bryan Orinion, Cameron Diao, Cameron Dour, Catherine Stinson, Cedrick Argueta, César Ferri Ramírez, Chandan Singh, Charles Rathkopf, Chenlin Meng, Chitta Baral, Chiyu Wu, Chris Callison-Burch, Chris Waites, Christian Voigt, Christopher D. Manning, Christopher Potts, Cindy Ramirez, Clara E. Rivera, Clemencia Siro, Colin Raffel, Courtney Ashcraft, Cristina Garbacea, Damien Sileo, Dan Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Daniel Freeman, Daniel Khashabi, Daniel Levy, Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, Danqi Chen, Daphne Ippolito, Dar Gilboa, David Dohan, David Drakard, David Jurgens, Debajyoti Datta, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Derek Tam, Dieuwke Hupkes, Diganta Misra, Dilyar Buzan, Dimitri Coelho Mollo, Diyi Yang, Dong-Ho Lee, Dylan Schrader, Ekaterina Shutova, Ekin Dogus Cubuk, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Ellie Pavlick, Emanuele Rodola, Emma Lam, Eric Chu, Eric Tang, Erkut Erdem, Ernie Chang, Ethan A. Chi, Ethan Dyer, Ethan Jerzak, Ethan Kim, Eunice Engefu Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fatemeh Siar, Fernando Martínez-Plumed, Francesca Happé, Francois Chollet, Frieda Rong, Gaurav Mishra, Genta Indra Winata, Gerard de Melo, Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, Gloria Wang, Gonzalo Jaimovitch-López, Gregor Betz, Guy Gur-Ari, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hannaneh Hajishirzi, Harsh Mehta, Hayden Bogar, Henry Shevlin, Hinrich Schütze, Hiromu Yakura, Hongming Zhang, Hugh Mee Wong, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, Jacob Hilton, Jaehoon Lee, Jaime Fernández Fisac, James B. Simon, James Koppel, James Zheng, James Zou, Jan Kocoń, Jana Thompson, Janelle Wingfield, Jared Kaplan, Jarema Radom, Jascha Sohl-Dickstein, Jason Phang, Jason Wei, Jason Yosinski, Jekaterina Novikova, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesse Engel, Jesujoba Alabi, Jiacheng Xu, Jiaming Song, Jillian Tang, Joan Waweru, John Burden, John Miller, John U. Balis, Jonathan Batchelder, Jonathan Berant, Jörg Frohberg, Jos Rozen, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Guerr, Joseph Jones, Joshua B. Tenenbaum, Joshua S. Rule, Joyce Chua, Kamil Kanclerz, Karen Livescu, Karl Krauth, Karthik Gopalakrishnan, Katerina Ignatyeva, Katja Markert, Kaustubh D. Dhole, Kevin Gimpel, Kevin Omondi, Kory Mathewson, Kristen Chiafullo, Ksenia Shkaruta, Kumar Shridhar, Kyle McDonell, Kyle Richardson, Laria Reynolds, Leo Gao, Li Zhang, Liam Dugan, Lianhui Qin, Lidia Contreras-Ochando, Louis-Philippe Morency, Luca Moschella, Lucas Lam, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros Colón, Luke Metz, Lütfi Kerem Şenel, Maarten Bosma, Maarten Sap, Maartje ter Hoeve, Maheen Farooqi, Manaal Faruqui, Mantas Mazeika, Marco Baturan, Marco Marelli, Marco Maru, Maria Jose Ramírez Quintana, Marie Tolkiehn, Mario Giulianelli, Martha Lewis, Martin Potthast, Matthew L. Leavitt, Matthias Hagen, Mátyás Schubert, Medina Orduna Baitemirova, Melody Arnaud, Melvin McElrath, Michael A. Yee, Michael Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michał Swędrowski, Michele Bevilacqua, Michihiro Yasunaga, Mihir Kale, Mike Cain, Mimee Xu, Mirac Suzgun, Mitch Walker, Mo Tiwari, Mohit Bansal, Moin Aminnaseri, Mor Geva, Mozhdeh Gheini, Mukund Varma T, Nanyun Peng, Nathan A. Chi, Nayeon Lee, Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, Nick Doiron, Nicole Martinez, Nikita Nangia, Niklas Deckers, Niklas Muennighoff, Nitish Shirish Keskar, Niveditha S. Iyer, Noah Constant, Noah Fiedel, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Omer Levy, Owain Evans, Pablo Antonio Moreno Casares, Parth Doshi, Pascale Fung, Paul Pu Liang, Paul Vicol, Pegah Alipoormolabashi, Peiyuan Liao, Percy Liang, Peter Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Pouya Pezeshkpour, Priti Oli, Qiaozhu Mei, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Etta Rudolph, Raefer Gabriel, Rahel Habacker, Ramon Risco, Raphaël Millière, Rhythm Garg, Richard Barnes, Rif A. Saurous, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Roman Sitelew, Ronan LeBras, Rosanne Liu, Rowan Jacobs, Rui Zhang, Ruslan Salakhutdinov, Ryan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Rylan Yang, Sahib Singh, Saif M. Mohammad, Sajant Anand, Sam Dillavou, Sam Shleifer, Sam Wiseman, Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, Sanghyun Han, Sanjeev Kwatra, Sarah A. Rous, Sarik Ghazarian, Sayan Ghosh, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, Shadi Hamdan, Sharon Zhou, Shashank Srivastava, Sherry Shi, Shikhar Singh, Shima Asaadi, Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, Shyam Upadhyay, Shyamolima Debnath, Siamak Shakeri, Simon Thormeyer, Simone Melzi, Siva Reddy, Sneha Priscilla Makini, Soo-Hwan Lee, Spencer Torene, Sriharsha Hatwar, Stanislas Dehaene, Stefan Divic, Stefano Ermon, Stella Biderman, Stephanie Lin, Stephen Prasad, Steven T. Piantadosi, Stuart M. Shieber, Summer Misherghi, Svetlana Kiritchenko, Swaroop Mishra, Tal Linzen, Tal Schuster, Tao Li, Tao Yu, Tariq Ali, Tatsu Hashimoto, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Tiberius Nkinyili, Timo Schick, Timofei Kornev, Titus Tunduny, Tobias Gerstenberg, Trenton Chang, Trishala Neeraj, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Vera Demberg, Victoria Nyamai, Vikas Raunak, Vinay Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, Vivek Srikumar, William Fedus, William Saunders, William Zhang, Wout Vossen, Xiang Ren, Xiaoyu Tong, Xinran Zhao, Xinyi Wu, Xudong Shen, Yadollah Yaghoobzadeh, Yair Lakretz, Yangqiu Song, Yasaman Bahri, Yejin Choi, Yichi Yang, Yiding Hao, Yifu Chen, Yonatan Belinkov, Yu Hou, Yufang Hou, Yuntao Bai, Zachary Seid, Zhuoye Zhao, Zijian Wang, Zijie J. Wang, ZiRui Wang, Ziyi Wu

BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models.

Common Sense Reasoning Math +1

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

13 code implementations 27 May 2022 Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré

We also extend FlashAttention to block-sparse attention, yielding an approximate attention algorithm that is faster than any existing approximate attention method.

16k 4k +4

Training and Inference on Any-Order Autoregressive Models the Right Way

1 code implementation 26 May 2022 Andy Shih, Dorsa Sadigh, Stefano Ermon

Conditional inference on arbitrary subsets of variables is a core problem in probabilistic inference with important applications such as masked language modeling and image inpainting.

Image Inpainting Language Modeling +2

Self-Similarity Priors: Neural Collages as Differentiable Fractal Representations

no code implementations 15 Apr 2022 Michael Poli, Winnie Xu, Stefano Massaroli, Chenlin Meng, Kuno Kim, Stefano Ermon

We investigate how to leverage the representations produced by Neural Collages in various tasks, including data compression and generation.

Data Compression

Tracking Urbanization in Developing Regions with Remote Sensing Spatial-Temporal Super-Resolution

no code implementations 4 Apr 2022 Yutong He, William Zhang, Chenlin Meng, Marshall Burke, David B. Lobell, Stefano Ermon

Automated tracking of urban development in areas where construction information is not available became possible with recent advancements in machine learning and remote sensing.

Image Super-Resolution Object Tracking +2

Generative Modeling Helps Weak Supervision (and Vice Versa)

1 code implementation 22 Mar 2022 Benedikt Boecking, Nicholas Roberts, Willie Neiswanger, Stefano Ermon, Frederic Sala, Artur Dubrawski

The model outperforms baseline weak supervision label models on a number of multiclass image classification datasets, improves the quality of generated images, and further improves end-model performance through data augmentation with synthetic samples.

Data Augmentation image-classification +1

Dual Diffusion Implicit Bridges for Image-to-Image Translation

1 code implementation 16 Mar 2022 Xuan Su, Jiaming Song, Chenlin Meng, Stefano Ermon

Image translation with DDIBs relies on two diffusion models trained independently on each domain, and is a two-step process: DDIBs first obtain latent encodings for source images with the source diffusion model, and then decode such encodings using the target model to construct target images.

Image-to-Image Translation Translation

GeoDiff: a Geometric Diffusion Model for Molecular Conformation Generation

2 code implementations ICLR 2022 Minkai Xu, Lantao Yu, Yang Song, Chence Shi, Stefano Ermon, Jian Tang

GeoDiff treats each atom as a particle and learns to directly reverse the diffusion process (i.e., transforming from a noise distribution to stable conformations) as a Markov chain.

Drug Discovery

LISA: Learning Interpretable Skill Abstractions from Language

1 code implementation 28 Feb 2022 Divyansh Garg, Skanda Vaidyanath, Kuno Kim, Jiaming Song, Stefano Ermon

Learning policies that effectively utilize language instructions in complex, multi-task environments is an important problem in sequential decision-making.

Imitation Learning Quantization +1

Imitation Learning by Estimating Expertise of Demonstrators

1 code implementation 2 Feb 2022 Mark Beliaev, Andy Shih, Stefano Ermon, Dorsa Sadigh, Ramtin Pedarsani

In this work, we show that unsupervised learning over demonstrator expertise can lead to a consistent boost in the performance of imitation learning algorithms.

continuous-control Continuous Control +1

Denoising Diffusion Restoration Models

1 code implementation 27 Jan 2022 Bahjat Kawar, Michael Elad, Stefano Ermon, Jiaming Song

Many interesting tasks in image restoration can be cast as linear inverse problems.

Colorization Deblurring +4

Conditional Imitation Learning for Multi-Agent Games

no code implementations 5 Jan 2022 Andy Shih, Stefano Ermon, Dorsa Sadigh

In this work, we study the problem of conditional multi-agent imitation learning, where we have access to joint trajectory demonstrations at training time, and we must interact with and adapt to new partners at test time.

Imitation Learning Tensor Decomposition

IS-COUNT: Large-scale Object Counting from Satellite Images with Covariate-based Importance Sampling

1 code implementation 16 Dec 2021 Chenlin Meng, Enci Liu, Willie Neiswanger, Jiaming Song, Marshall Burke, David Lobell, Stefano Ermon

We show empirically that the proposed framework achieves strong performance on estimating the number of buildings in the United States and Africa, cars in Kenya, brick kilns in Bangladesh, and swimming pools in the U.S., while requiring as few as 0.01% of satellite images compared to an exhaustive approach.

Object Object Counting +3

Quantifying and Understanding Adversarial Examples in Discrete Input Spaces

no code implementations 12 Dec 2021 Volodymyr Kuleshov, Evgenii Nikishin, Shantanu Thakoor, Tingfung Lau, Stefano Ermon

In this work, we seek to understand and extend adversarial examples across domains in which inputs are discrete, particularly across new domains, such as computational biology.

Attribute Sentiment Analysis

An Experimental Design Perspective on Model-Based Reinforcement Learning

2 code implementations 9 Dec 2021 Viraj Mehta, Biswajit Paria, Jeff Schneider, Stefano Ermon, Willie Neiswanger

In particular, we leverage ideas from Bayesian optimal experimental design to guide the selection of state-action queries for efficient learning.

continuous-control Continuous Control +5

A Unified Framework for Multi-distribution Density Ratio Estimation

no code implementations 7 Dec 2021 Lantao Yu, Yujia Jin, Stefano Ermon

Binary density ratio estimation (DRE), the problem of estimating the ratio $p_1/p_2$ given their empirical samples, provides the foundation for many state-of-the-art machine learning algorithms such as contrastive representation learning and covariate shift adaptation.

Density Ratio Estimation Representation Learning +1
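
As a rough illustration of binary DRE from samples, the sketch below uses the standard classifier-based approach, not necessarily the method of the paper above: a logistic regression is trained to tell samples of $p_1$ from samples of $p_2$, and with equal sample sizes its logit estimates $\log(p_1/p_2)$. The function names `fit_log_ratio` and `density_ratio`, the 1-D linear logit, and the plain gradient-descent settings are all assumptions made for this example.

```python
import numpy as np

def fit_log_ratio(x1, x2, lr=0.5, steps=3000):
    """Fit log(p1(x)/p2(x)) ~ a*x + b with logistic regression (1-D inputs).

    Assumes equal sample sizes, so the classifier logit equals the log ratio.
    """
    x = np.concatenate([x1, x2])
    y = np.concatenate([np.ones(len(x1)), np.zeros(len(x2))])
    feats = np.stack([x, np.ones_like(x)], axis=1)  # columns: x, bias
    w = np.zeros(2)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-feats @ w))        # P(sample came from p1 | x)
        w -= lr * feats.T @ (p - y) / len(y)        # gradient step on the log loss
    return w[0], w[1]

def density_ratio(x, a, b):
    """Estimated p1(x)/p2(x) from the fitted logit."""
    return np.exp(a * x + b)
```

For instance, with $p_1 = \mathcal{N}(1, 1)$ and $p_2 = \mathcal{N}(0, 1)$ the true log ratio is $x - 1/2$, so the fitted slope and intercept should land near 1 and -0.5 given enough samples.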

BCD Nets: Scalable Variational Approaches for Bayesian Causal Discovery

1 code implementation NeurIPS 2021 Chris Cundy, Aditya Grover, Stefano Ermon

We propose Bayesian Causal Discovery Nets (BCD Nets), a variational inference framework for estimating a distribution over DAGs characterizing a linear-Gaussian SEM.

Causal Discovery Stochastic Optimization +1

HyperSPNs: Compact and Expressive Probabilistic Circuits

1 code implementation NeurIPS 2021 Andy Shih, Dorsa Sadigh, Stefano Ermon

Probabilistic circuits (PCs) are a family of generative models which allows for the computation of exact likelihoods and marginals of its probability distributions.

Density Estimation

D2C: Diffusion-Decoding Models for Few-Shot Conditional Generation

1 code implementation NeurIPS 2021 Abhishek Sinha, Jiaming Song, Chenlin Meng, Stefano Ermon

Conditional generative models of high-dimensional images have many applications, but supervision signals from conditions to images can be expensive to acquire.

Conditional Image Generation Image Manipulation +1

Reliable Decisions with Threshold Calibration

no code implementations NeurIPS 2021 Roshni Sahoo, Shengjia Zhao, Alyssa Chen, Stefano Ermon

We propose a stronger notion of calibration called threshold calibration, which is exactly the condition required to ensure that decision loss is predicted accurately for threshold decisions.

Scheduling

Density Ratio Estimation via Infinitesimal Classification

2 code implementations 22 Nov 2021 Kristy Choi, Chenlin Meng, Yang Song, Stefano Ermon

We then estimate the instantaneous rate of change of the bridge distributions indexed by time (the "time score") -- a quantity defined analogously to data (Stein) scores -- with a novel time score matching objective.

Classification Density Ratio Estimation +1

Solving Inverse Problems in Medical Imaging with Score-Based Generative Models

1 code implementation NeurIPS Workshop Deep_Invers 2021 Yang Song, Liyue Shen, Lei Xing, Stefano Ermon

These measurements are typically synthesized from images using a fixed physical model of the measurement process, which hinders the generalization capability of models to unknown measurement processes.

Computed Tomography (CT)

Estimating High Order Gradients of the Data Distribution by Denoising

no code implementations NeurIPS 2021 Chenlin Meng, Yang Song, Wenzhe Li, Stefano Ermon

By leveraging Tweedie's formula on higher order moments, we generalize denoising score matching to estimate higher order derivatives.

Audio Synthesis Denoising +2
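
For reference, the first-order form of Tweedie's formula that ordinary denoising score matching rests on can be written as below, assuming additive Gaussian noise of scale $\sigma$. The symbols $x$, $\tilde{x}$, $\sigma$, and $p_\sigma$ are notation chosen here rather than taken from the listing, and the paper's higher-order generalization is not reproduced.

```latex
\tilde{x} = x + \sigma \epsilon, \quad \epsilon \sim \mathcal{N}(0, I)
\quad\Longrightarrow\quad
\mathbb{E}[x \mid \tilde{x}] = \tilde{x} + \sigma^{2} \, \nabla_{\tilde{x}} \log p_{\sigma}(\tilde{x})
```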

SustainBench: Benchmarks for Monitoring the Sustainable Development Goals with Machine Learning

1 code implementation 8 Nov 2021 Christopher Yeh, Chenlin Meng, Sherrie Wang, Anne Driscoll, Erik Rozi, Patrick Liu, Jihyeon Lee, Marshall Burke, David B. Lobell, Stefano Ermon

Our goals for SustainBench are to (1) lower the barriers to entry for the machine learning community to contribute to measuring and achieving the SDGs; (2) provide standard benchmarks for evaluating machine learning models on tasks across a variety of SDGs; and (3) encourage the development of novel machine learning methods where improved model performance facilitates progress towards the SDGs.

BIG-bench Machine Learning

Equivariant Neural Network for Factor Graphs

no code implementations 29 Sep 2021 Fan-Yun Sun, Jonathan Kuck, Hao Tang, Stefano Ermon

Several indices used in a factor graph data structure can be permuted without changing the underlying probability distribution.

Inductive Bias

H-Entropy Search: Generalizing Bayesian Optimization with a Decision-theoretic Uncertainty Measure

no code implementations 29 Sep 2021 Willie Neiswanger, Lantao Yu, Shengjia Zhao, Chenlin Meng, Stefano Ermon

For special cases of the loss and design space, we develop gradient-based methods to efficiently optimize our proposed family of acquisition functions, and demonstrate that the resulting BO procedure shows strong empirical performance on a diverse set of optimization tasks.

Bayesian Optimization

An Experimental Design Perspective on Exploration in Reinforcement Learning

no code implementations ICLR 2022 Viraj Mehta, Biswajit Paria, Jeff Schneider, Willie Neiswanger, Stefano Ermon

In particular, we leverage ideas from Bayesian optimal experimental design to guide the selection of state-action queries for efficient learning.

continuous-control Continuous Control +4

Sphere2Vec: Self-Supervised Location Representation Learning on Spherical Surfaces

no code implementations 29 Sep 2021 Gengchen Mai, Yao Xuan, Wenyun Zuo, Yutong He, Stefano Ermon, Jiaming Song, Krzysztof Janowicz, Ni Lao

Location encoding is valuable for a multitude of tasks where both the absolute positions and local contexts (image, text, and other types of metadata) of spatial objects are needed for accurate predictions.

image-classification Image Classification +2

Provably Calibrated Regression Under Distribution Drift

no code implementations 29 Sep 2021 Shengjia Zhao, Yusuke Tashiro, Danny Tse, Stefano Ermon

Accurate uncertainty quantification is a key building block of trustworthy machine learning systems.

Prediction regression +3

Mind Your Bits and Errors: Prioritizing the Bits that Matter in Variational Autoencoders

no code implementations 29 Sep 2021 Rui Shu, Stefano Ermon

In this work, we consider the task of image generative modeling with variational autoencoders and posit that the nature of high-dimensional image data distributions poses an intrinsic challenge.

On the Opportunities and Risks of Foundation Models

2 code implementations 16 Aug 2021 Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.

Transfer Learning

SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations

2 code implementations ICLR 2022 Chenlin Meng, Yutong He, Yang Song, Jiaming Song, Jiajun Wu, Jun-Yan Zhu, Stefano Ermon

The key challenge is balancing faithfulness to the user input (e.g., hand-drawn colored strokes) and realism of the synthesized image.

Denoising Image Generation

Calibrating Predictions to Decisions: A Novel Approach to Multi-Class Calibration

no code implementations NeurIPS 2021 Shengjia Zhao, Michael P. Kim, Roshni Sahoo, Tengyu Ma, Stefano Ermon

In this work, we introduce a new notion, decision calibration, that requires the predicted distribution and true distribution to be "indistinguishable" to a set of downstream decision-makers.

Decision Making

Multi-Agent Imitation Learning with Copulas

no code implementations 10 Jul 2021 Hongwei Wang, Lantao Yu, Zhangjie Cao, Stefano Ermon

Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions, which is essential for understanding physical, social, and team-play systems.

Imitation Learning

CSDI: Conditional Score-based Diffusion Models for Probabilistic Time Series Imputation

5 code implementations NeurIPS 2021 Yusuke Tashiro, Jiaming Song, Yang Song, Stefano Ermon

In this paper, we propose Conditional Score-based Diffusion models for Imputation (CSDI), a novel time series imputation method that utilizes score-based diffusion models conditioned on observed data.

Audio Synthesis Image Generation +4

Featurized Density Ratio Estimation

1 code implementation 5 Jul 2021 Kristy Choi, Madeline Liao, Stefano Ermon

Density ratio estimation serves as an important technique in the unsupervised machine learning toolbox.

Data Augmentation Density Ratio Estimation +1

IQ-Learn: Inverse soft-Q Learning for Imitation

5 code implementations NeurIPS 2021 Divyansh Garg, Shuvam Chakraborty, Chris Cundy, Jiaming Song, Matthieu Geist, Stefano Ermon

In many sequential decision-making problems (e.g., robotics control, game playing, sequential prediction), human or expert data is available containing useful information about the task.

Atari Games Continuous Control +4

Spatial-Temporal Super-Resolution of Satellite Imagery via Conditional Pixel Synthesis

1 code implementation NeurIPS 2021 Yutong He, Dingjie Wang, Nicholas Lai, William Zhang, Chenlin Meng, Marshall Burke, David B. Lobell, Stefano Ermon

High-resolution satellite imagery has proven useful for a broad range of tasks, including measurement of global human population, local economic livelihoods, and biodiversity, among many others.

Object Counting Super-Resolution

Temporal Predictive Coding For Model-Based Planning In Latent Space

3 code implementations 14 Jun 2021 Tung Nguyen, Rui Shu, Tuan Pham, Hung Bui, Stefano Ermon

High-dimensional observations are a major challenge in the application of model-based reinforcement learning (MBRL) to real-world environments.

Model-based Reinforcement Learning Representation Learning

D2C: Diffusion-Denoising Models for Few-shot Conditional Generation

3 code implementations 12 Jun 2021 Abhishek Sinha, Jiaming Song, Chenlin Meng, Stefano Ermon

Conditional generative models of high-dimensional images have many applications, but supervision signals from conditions to images can be expensive to acquire.

Conditional Image Generation Denoising +2

Improving Compositionality of Neural Networks by Decoding Representations to Inputs

no code implementations NeurIPS 2021 Mike Wu, Noah Goodman, Stefano Ermon

In traditional software programs, it is easy to trace program logic from variables back to input, apply assertion statements to block erroneous behavior, and compose programs together.

Fairness Out-of-Distribution Detection

Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mutual Information

1 code implementation 19 Apr 2021 Willie Neiswanger, Ke Alexander Wang, Stefano Ermon

Given such an $\mathcal{A}$, and a prior distribution over $f$, we refer to the problem of inferring the output of $\mathcal{A}$ using $T$ evaluations as Bayesian Algorithm Execution (BAX).

Bayesian Optimization Experimental Design +2

On the Critical Role of Conventions in Adaptive Human-AI Collaboration

1 code implementation ICLR 2021 Andy Shih, Arjun Sawhney, Jovana Kondic, Stefano Ermon, Dorsa Sadigh

Humans can quickly adapt to new partners in collaborative tasks (e.g., playing basketball), because they understand which fundamental skills of the task (e.g., how to dribble, how to shoot) carry over across new partners.

Hybrid Mutual Information Lower-bound Estimators for Representation Learning

no code implementations ICLR Workshop Neural_Compression 2021 Abhishek Sinha, Jiaming Song, Stefano Ermon

We illustrate that with one set of representations, the hybrid approach is able to achieve good performance on multiple downstream tasks such as classification, reconstruction, and generation.

Representation Learning

Anytime Sampling for Autoregressive Models via Ordered Autoencoding

1 code implementation ICLR 2021 Yilun Xu, Yang Song, Sahaj Garg, Linyuan Gong, Rui Shu, Aditya Grover, Stefano Ermon

Experimentally, we demonstrate in several image and audio generation tasks that sample quality degrades gracefully as we reduce the computational budget for sampling.

Audio Generation Computational Efficiency

Neural Network Compression for Noisy Storage Devices

no code implementations 15 Feb 2021 Berivan Isik, Kristy Choi, Xin Zheng, Tsachy Weissman, Stefano Ermon, H.-S. Philip Wong, Armin Alaghi

Compression and efficient storage of neural network (NN) parameters is critical for applications that run on resource-constrained devices.

Neural Network Compression

Negative Data Augmentation

2 code implementations ICLR 2021 Abhishek Sinha, Kumar Ayush, Jiaming Song, Burak Uzkent, Hongxia Jin, Stefano Ermon

Empirically, models trained with our method achieve improved conditional/unconditional image generation along with improved anomaly detection capabilities.

Action Recognition Anomaly Detection +10

Maximum Likelihood Training of Score-Based Diffusion Models

3 code implementations NeurIPS 2021 Yang Song, Conor Durkan, Iain Murray, Stefano Ermon

Score-based diffusion models synthesize samples by reversing a stochastic process that diffuses data to noise, and are trained by minimizing a weighted combination of score matching losses.

Ranked #10 on Image Generation on ImageNet 32x32 (bpd metric)

Data Augmentation Image Generation

H-divergence: A Decision-Theoretic Discrepancy Measure for Two Sample Tests

no code implementations 1 Jan 2021 Shengjia Zhao, Abhishek Sinha, Yutong He, Aidan Perreault, Jiaming Song, Stefano Ermon

Based on ideas from decision theory, we investigate a new class of discrepancies that are based on the optimal decision loss.

Vocal Bursts Valence Prediction

Non-Markovian Predictive Coding For Planning In Latent Space

no code implementations 1 Jan 2021 Tung Nguyen, Rui Shu, Tuan Pham, Hung Bui, Stefano Ermon

High-dimensional observations are a major challenge in the application of model-based reinforcement learning (MBRL) to real-world environments.

Model-based Reinforcement Learning Representation Learning

Understanding Classifiers with Generative Models

no code implementations 1 Jan 2021 Laëtitia Shao, Yang Song, Stefano Ermon

Although deep neural networks are effective on supervised learning tasks, they have been shown to be brittle.

Two-sample testing

Privacy-Constrained Policies via Mutual Information Regularized Policy Gradients

no code implementations 30 Dec 2020 Chris Cundy, Rishi Desai, Stefano Ermon

We consider the task of training a policy that maximizes reward while minimizing disclosure of certain sensitive state variables through the actions.

Decision Making Sequential Decision Making

PiRank: Scalable Learning To Rank via Differentiable Sorting

1 code implementation NeurIPS 2021 Robin Swezey, Aditya Grover, Bruno Charron, Stefano Ermon

A key challenge with machine learning approaches for ranking is the gap between the performance metrics of interest and the surrogate loss functions that can be optimized with gradient-based methods.

Learning-To-Rank

Score-Based Generative Modeling through Stochastic Differential Equations

15 code implementations ICLR 2021 Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, Ben Poole

Combined with multiple architectural improvements, we achieve record-breaking performance for unconditional image generation on CIFAR-10 with an Inception score of 9.89 and FID of 2.20, a competitive likelihood of 2.99 bits/dim, and demonstrate high fidelity generation of 1024 x 1024 images for the first time from a score-based generative model.

Colorization Density Estimation +2

Efficient Conditional Pre-training for Transfer Learning

no code implementations 20 Nov 2020 Shuvam Chakraborty, Burak Uzkent, Kumar Ayush, Kumar Tanmay, Evan Sheehan, Stefano Ermon

Finally, we improve standard ImageNet pre-training by 1-3% by tuning available models on our subsets and pre-training on a dataset filtered from a larger scale dataset.

Transfer Learning

Geography-Aware Self-Supervised Learning

1 code implementation ICCV 2021 Kumar Ayush, Burak Uzkent, Chenlin Meng, Kumar Tanmay, Marshall Burke, David Lobell, Stefano Ermon

Contrastive learning methods have significantly narrowed the gap between supervised and unsupervised learning on computer vision tasks.

Ranked #6 on Semantic Segmentation on SpaceNet 1 (using extra training data)

Contrastive Learning image-classification +5

Autoregressive Score Matching

no code implementations NeurIPS 2020 Chenlin Meng, Lantao Yu, Yang Song, Jiaming Song, Stefano Ermon

To increase flexibility, we propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores), which need not be normalized.

Density Estimation Image Denoising +1

Probabilistic Circuits for Variational Inference in Discrete Graphical Models

1 code implementation NeurIPS 2020 Andy Shih, Stefano Ermon

Inference in discrete graphical models with variational methods is difficult because of the inability to re-parameterize gradients of the Evidence Lower Bound (ELBO).

Variational Inference

Imitation with Neural Density Models

no code implementations NeurIPS 2021 Kuno Kim, Akshat Jindal, Yang Song, Jiaming Song, Yanan Sui, Stefano Ermon

We propose a new framework for Imitation Learning (IL) via density estimation of the expert's occupancy measure followed by Maximum Occupancy Entropy Reinforcement Learning (RL) using the density as a reward.

Density Estimation Imitation Learning +3

Denoising Diffusion Implicit Models

29 code implementations ICLR 2021 Jiaming Song, Chenlin Meng, Stefano Ermon

Denoising diffusion probabilistic models (DDPMs) have achieved high quality image generation without adversarial training, yet they require simulating a Markov chain for many steps to produce a sample.

Denoising Image Generation

Understanding Classifier Mistakes with Generative Models

no code implementations 5 Oct 2020 Laëtitia Shao, Yang Song, Stefano Ermon

From this observation, we develop a detection criterion for samples on which a classifier is likely to fail at test time.

Two-sample testing

Privacy Preserving Recalibration under Domain Shift

no code implementations 21 Aug 2020 Rachel Luo, Shengjia Zhao, Jiaming Song, Jonathan Kuck, Stefano Ermon, Silvio Savarese

In an extensive empirical study, we find that our algorithm improves calibration on domain-shift benchmarks under the constraints of differential privacy.

Privacy Preserving

Multi-label Contrastive Predictive Coding

no code implementations NeurIPS 2020 Jiaming Song, Stefano Ermon

We demonstrate that the proposed approach is able to lead to better mutual information estimation, gain empirical improvements in unsupervised representation learning, and beat a current state-of-the-art knowledge distillation method over 10 out of 13 tasks.

Knowledge Distillation Multi-class Classification +5

Efficient Learning of Generative Models via Finite-Difference Score Matching

1 code implementation NeurIPS 2020 Tianyu Pang, Kun Xu, Chongxuan Li, Yang Song, Stefano Ermon, Jun Zhu

Several machine learning applications involve the optimization of higher-order derivatives (e.g., gradients of gradients) during training, which can be expensive with respect to memory and computation even with automatic differentiation.

Unsupervised Calibration under Covariate Shift

no code implementations 29 Jun 2020 Anusri Pampari, Stefano Ermon

A probabilistic model is said to be calibrated if its predicted probabilities match the corresponding empirical frequencies.

Decision Making Domain Adaptation +1
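
The calibration definition above can be checked numerically with a binned estimate, sketched here for binary labels. This is the common expected-calibration-error recipe, not the paper's method for covariate shift, and the function name `expected_calibration_error`, the equal-width binning, and the default of 10 bins are assumptions made for illustration.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted average gap between mean predicted probability and
    empirical frequency of the positive label, over equal-width bins."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Assign each prediction to one of n_bins equal-width bins on [0, 1];
    # a probability of exactly 1.0 is clipped into the last bin.
    bin_ids = np.minimum((probs * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        gap = abs(probs[mask].mean() - labels[mask].mean())
        ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return ece
```

A model that predicts 0.25 on four examples of which exactly one is positive has zero error under this estimate, while one that predicts 0.9 on examples that are all negative has an error of 0.9.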

Experience Replay with Likelihood-free Importance Weights

1 code implementation 23 Jun 2020 Samarth Sinha, Jiaming Song, Animesh Garg, Stefano Ermon

The use of past experiences to accelerate temporal difference (TD) learning of value functions, or experience replay, is a key component in deep reinforcement learning.

Deep Reinforcement Learning OpenAI Gym +2

A Framework for Sample Efficient Interval Estimation with Control Variates

1 code implementation 18 Jun 2020 Shengjia Zhao, Christopher Yeh, Stefano Ermon

We consider the problem of estimating confidence intervals for the mean of a random variable, where the goal is to produce the smallest possible interval for a given number of samples.

regression

Individual Calibration with Randomized Forecasting

no code implementations ICML 2020 Shengjia Zhao, Tengyu Ma, Stefano Ermon

We show that calibration for individual samples is possible in the regression setup if the predictions are randomized, i.e., outputting randomized credible intervals.

Decision Making Fairness +1

Improved Techniques for Training Score-Based Generative Models

9 code implementations NeurIPS 2020 Yang Song, Stefano Ermon

Score-based generative models can produce high quality image samples comparable to GANs, without requiring adversarial optimization.

Image Generation

Predicting Livelihood Indicators from Community-Generated Street-Level Imagery

1 code implementation 15 Jun 2020 Jihyeon Lee, Dylan Grosz, Burak Uzkent, Sicheng Zeng, Marshall Burke, David Lobell, Stefano Ermon

Major decisions from governments and other large organizations rely on measurements of the populace's well-being, but making such measurements at a broad scale is expensive and thus infrequent in much of the developing world.

Efficient Poverty Mapping using Deep Reinforcement Learning

no code implementations 7 Jun 2020 Kumar Ayush, Burak Uzkent, Kumar Tanmay, Marshall Burke, David Lobell, Stefano Ermon

The combination of high-resolution satellite imagery and machine learning has proven useful in many sustainability-related tasks, including poverty prediction, infrastructure measurement, and forest monitoring.

Deep Reinforcement Learning object-detection +3

Evaluating the Disentanglement of Deep Generative Models through Manifold Topology

1 code implementation ICLR 2021 Sharon Zhou, Eric Zelikman, Fred Lu, Andrew Y. Ng, Gunnar Carlsson, Stefano Ermon

Learning disentangled representations is regarded as a fundamental task for improving the generalization, robustness, and interpretability of generative models.

Disentanglement

Farmland Parcel Delineation Using Spatio-temporal Convolutional Networks

no code implementations 11 Apr 2020 Han Lin Aung, Burak Uzkent, Marshall Burke, David Lobell, Stefano Ermon

Satellite imaging can be a scalable and cost-effective way to perform farm parcel delineation and collect this valuable data.

Segmentation

Diversity can be Transferred: Output Diversification for White- and Black-box Attacks

1 code implementation NeurIPS 2020 Yusuke Tashiro, Yang Song, Stefano Ermon

Adversarial attacks often involve random perturbations of the inputs drawn from uniform or Gaussian distributions, e.g., to initialize optimization-based white-box attacks or generate update directions in black-box attacks.

Diversity

Training Deep Energy-Based Models with f-Divergence Minimization

1 code implementation ICML 2020 Lantao Yu, Yang Song, Jiaming Song, Stefano Ermon

Experimental results demonstrate the superiority of f-EBM over contrastive divergence, as well as the benefits of training EBMs using f-divergences other than KL.

Gaussianization Flows

3 code implementations 4 Mar 2020 Chenlin Meng, Yang Song, Jiaming Song, Stefano Ermon

Iterative Gaussianization is a fixed-point iteration procedure that can transform any continuous random vector into a Gaussian one.

Predictive Coding for Locally-Linear Control

1 code implementation ICML 2020 Rui Shu, Tung Nguyen, Yin-Lam Chow, Tuan Pham, Khoat Than, Mohammad Ghavamzadeh, Stefano Ermon, Hung H. Bui

High-dimensional observations and unknown dynamics are major challenges when applying optimal control to many real-world decision making tasks.

Decision Making Decoder

Permutation Invariant Graph Generation via Score-Based Generative Modeling

1 code implementation 2 Mar 2020 Chenhao Niu, Yang Song, Jiaming Song, Shengjia Zhao, Aditya Grover, Stefano Ermon

In particular, we design a permutation equivariant, multi-channel graph neural network to model the gradient of the data distribution at the input graph (a.k.a. the score function).

Graph Generation Graph Neural Network

Learning When and Where to Zoom with Deep Reinforcement Learning

2 code implementations CVPR 2020 Burak Uzkent, Stefano Ermon

While high resolution images contain semantically more useful information than their lower resolution counterparts, processing them is computationally more expensive, and in some applications, e.g., remote sensing, they can be much more expensive to acquire.

Deep Reinforcement Learning reinforcement-learning +1

Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving

1 code implementation 10 Feb 2020 Yang Song, Chenlin Meng, Renjie Liao, Stefano Ermon

Feedforward computation, such as evaluating a neural network or sampling from an autoregressive model, is ubiquitous in machine learning.
