Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition

no code implementations 21 Mar 2024 Sihyun Yu, Weili Nie, De-An Huang, Boyi Li, Jinwoo Shin, Anima Anandkumar

To tackle this issue, we propose content-motion latent diffusion model (CMD), a novel efficient extension of pretrained image diffusion models for video generation.

Video Generation

T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching

1 code implementation 21 Feb 2024 Zizheng Pan, Bohan Zhuang, De-An Huang, Weili Nie, Zhiding Yu, Chaowei Xiao, Jianfei Cai, Anima Anandkumar

Sampling from diffusion probabilistic models (DPMs) is often expensive for high-quality image generation and typically requires many steps with a large model.

Image Generation

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

no code implementations 6 Oct 2023 Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri, Rao Kotamarthi, Venkatram Vishwanath, Arvind Ramanathan, Sam Foreman, Kyle Hippe, Troy Arcomano, Romit Maulik, Maxim Zvyagin, Alexander Brace, Bin Zhang, Cindy Orozco Bohorquez, Austin Clyde, Bharat Kale, Danilo Perez-Rivera, Heng Ma, Carla M. Mann, Michael Irvin, J. Gregory Pauloski, Logan Ward, Valerie Hayot, Murali Emani, Zhen Xie, Diangen Lin, Maulik Shukla, Ian Foster, James J. Davis, Michael E. Papka, Thomas Brettin, Prasanna Balaprakash, Gina Tourassi, John Gounley, Heidi Hanson, Thomas E Potok, Massimiliano Lupo Pasini, Kate Evans, Dan Lu, Dalton Lunga, Junqi Yin, Sajal Dash, Feiyi Wang, Mallikarjun Shankar, Isaac Lyngaas, Xiao Wang, Guojing Cong, Pei Zhang, Ming Fan, Siyan Liu, Adolfy Hoisie, Shinjae Yoo, Yihui Ren, William Tang, Kyle Felker, Alexey Svyatkovskiy, Hang Liu, Ashwin Aji, Angela Dalton, Michael Schulte, Karl Schulz, Yuntian Deng, Weili Nie, Josh Romero, Christian Dallago, Arash Vahdat, Chaowei Xiao, Thomas Gibbs, Anima Anandkumar, Rick Stevens

In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences.

Improving Generative Model-based Unfolding with Schrödinger Bridges

1 code implementation 23 Aug 2023 Sascha Diefenbacher, Guan-Horng Liu, Vinicius Mikuni, Benjamin Nachman, Weili Nie

Machine learning-based unfolding has enabled unbinned and high-dimensional differential cross section measurements.

Fast Training of Diffusion Models with Masked Transformers

1 code implementation 15 Jun 2023 Hongkai Zheng, Weili Nie, Arash Vahdat, Anima Anandkumar

For masked training, we introduce an asymmetric encoder-decoder architecture consisting of a transformer encoder that operates only on unmasked patches and a lightweight transformer decoder on full patches.

Denoising Representation Learning

Defending against Adversarial Audio via Diffusion Model

1 code implementation 2 Mar 2023 Shutong Wu, Jiongxiao Wang, Wei Ping, Weili Nie, Chaowei Xiao

In this paper, we propose an adversarial purification-based defense pipeline, AudioPure, for acoustic systems via off-the-shelf diffusion models.

I$^2$SB: Image-to-Image Schrödinger Bridge

1 code implementation 12 Feb 2023 Guan-Horng Liu, Arash Vahdat, De-An Huang, Evangelos A. Theodorou, Weili Nie, Anima Anandkumar

We propose Image-to-Image Schrödinger Bridge (I$^2$SB), a new class of conditional diffusion models that directly learn the nonlinear diffusion processes between two given distributions.

Deblurring Image Restoration +1

Multi-modal Molecule Structure-text Model for Text-based Retrieval and Editing

1 code implementation 21 Dec 2022 Shengchao Liu, Weili Nie, Chengpeng Wang, Jiarui Lu, Zhuoran Qiao, Ling Liu, Jian Tang, Chaowei Xiao, Anima Anandkumar

Here we present a multi-modal molecule structure-text model, MoleculeSTM, by jointly learning molecules' chemical structures and textual descriptions via a contrastive learning strategy.

Contrastive Learning Drug Discovery +2

DensePure: Understanding Diffusion Models towards Adversarial Robustness

no code implementations 1 Nov 2022 Chaowei Xiao, Zhongzhu Chen, Kun Jin, Jiongxiao Wang, Weili Nie, Mingyan Liu, Anima Anandkumar, Bo Li, Dawn Song

By using the highest density point in the conditional distribution as the reversed sample, we identify the robust region of a given instance under the diffusion model's reverse process.

Adversarial Robustness Denoising

Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models

2 code implementations 15 Sep 2022 Manli Shu, Weili Nie, De-An Huang, Zhiding Yu, Tom Goldstein, Anima Anandkumar, Chaowei Xiao

In evaluating cross-dataset generalization with unseen categories, TPT performs on par with the state-of-the-art approaches that use additional training data.

Image Classification Zero-shot Generalization

Retrieval-based Controllable Molecule Generation

1 code implementation 23 Aug 2022 Zichao Wang, Weili Nie, Zhuoran Qiao, Chaowei Xiao, Richard Baraniuk, Anima Anandkumar

On various tasks ranging from simple design criteria to a challenging real-world scenario for designing lead compounds that bind to the SARS-CoV-2 main protease, we demonstrate our approach extrapolates well beyond the retrieval database, and achieves better performance and wider applicability than previous methods.

Drug Discovery Retrieval

PointDP: Diffusion-driven Purification against Adversarial Attacks on 3D Point Cloud Recognition

no code implementations 21 Aug 2022 Jiachen Sun, Weili Nie, Zhiding Yu, Z. Morley Mao, Chaowei Xiao

The 3D point cloud is becoming a critical data representation in many real-world applications like autonomous driving, robotics, and medical imaging.

Autonomous Driving

Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions

1 code implementation CVPR 2022 Huaizu Jiang, Xiaojian Ma, Weili Nie, Zhiding Yu, Yuke Zhu, Song-Chun Zhu, Anima Anandkumar

A significant gap remains between today's visual pattern recognition models and human-level visual cognition, especially when it comes to few-shot learning and compositional reasoning of novel concepts.

Benchmarking Few-Shot Image Classification +5

Diffusion Models for Adversarial Purification

2 code implementations 16 May 2022 Weili Nie, Brandon Guo, Yujia Huang, Chaowei Xiao, Arash Vahdat, Anima Anandkumar

Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model.

RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning

1 code implementation ICLR 2022 Xiaojian Ma, Weili Nie, Zhiding Yu, Huaizu Jiang, Chaowei Xiao, Yuke Zhu, Song-Chun Zhu, Anima Anandkumar

This task remains challenging for current deep learning algorithms since it requires addressing three key technical problems jointly: 1) identifying object entities and their properties, 2) inferring semantic relations between pairs of entities, and 3) generalizing to novel object-relation combinations, i.e., systematic generalization.

Human-Object Interaction Detection Object +5

A Step-Wise Weighting Approach for Controllable Text Generation

no code implementations 29 Sep 2021 Zichao Wang, Weili Nie, Zhenwei Dai, Richard Baraniuk

Many existing approaches either require extensive training/fine-tuning of the LM for each single attribute under control or are slow to generate text.

Attribute Language Modelling +1

An Improved Semi-Supervised VAE for Learning Disentangled Representations

no code implementations 12 Jun 2020 Weili Nie, Zichao Wang, Ankit B. Patel, Richard G. Baraniuk

Learning interpretable and disentangled representations is a crucial yet challenging task in representation learning.


Towards a Better Understanding and Regularization of GAN Training Dynamics

1 code implementation 24 Jun 2018 Weili Nie, Ankit Patel

Generative adversarial networks (GANs) are notoriously difficult to train and the reasons underlying their (non-)convergence behaviors are still not completely understood.

A Theoretical Explanation for Perplexing Behaviors of Backpropagation-based Visualizations

1 code implementation ICML 2018 Weili Nie, Yang Zhang, Ankit Patel

Backpropagation-based visualizations have been proposed to interpret convolutional neural networks (CNNs); however, a theory is missing to justify their behaviors: guided backpropagation (GBP) and deconvolutional network (DeconvNet) generate more human-interpretable but less class-sensitive visualizations than saliency maps.

