8 code implementations • NeurIPS 2020 • Arash Vahdat, Jan Kautz
For example, on CIFAR-10, NVAE pushes the state-of-the-art from 2.98 to 2.91 bits per dimension, and it produces high-quality images on CelebA HQ.
Ranked #3 on Image Generation on FFHQ 256x256 (bits/dimension metric)
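As background for the bits/dimension metric reported above: it is the model's total negative log-likelihood in nats, divided by the number of data dimensions times ln 2. A minimal sketch (the NLL value below is illustrative, not a reported figure):

```python
import math

def bits_per_dim(nll_nats: float, num_dims: int) -> float:
    """Convert a total negative log-likelihood (in nats) to bits per dimension."""
    return nll_nats / (num_dims * math.log(2))

# A CIFAR-10 image has 3 * 32 * 32 = 3072 dimensions; a total NLL of roughly
# 6196 nats corresponds to about 2.91 bits/dim.
print(round(bits_per_dim(6196.0, 3 * 32 * 32), 2))  # 2.91
```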
1 code implementation • CVPR 2023 • Jiarui Xu, Sifei Liu, Arash Vahdat, Wonmin Byeon, Xiaolong Wang, Shalini De Mello
Our approach outperforms the previous state of the art by significant margins on both open-vocabulary panoptic and semantic segmentation tasks.
Ranked #2 on Open-World Instance Segmentation on UVO (using extra training data)
2 code implementations • 12 Oct 2022 • Xiaohui Zeng, Arash Vahdat, Francis Williams, Zan Gojcic, Or Litany, Sanja Fidler, Karsten Kreis
To advance 3D DDMs and make them useful for digital artists, we require (i) high generation quality, (ii) flexibility for manipulation and applications such as conditional synthesis and shape interpolation, and (iii) the ability to output smooth surfaces or meshes.
Ranked #1 on Point Cloud Generation on ShapeNet Airplane
5 code implementations • ICLR 2022 • Zhisheng Xiao, Karsten Kreis, Arash Vahdat
To the best of our knowledge, denoising diffusion GAN is the first model that reduces sampling cost in diffusion models to an extent that allows them to be applied to real-world applications inexpensively.
Ranked #9 on Image Generation on CelebA-HQ 256x256
2 code implementations • 2 Nov 2022 • Yogesh Balaji, Seungjun Nah, Xun Huang, Arash Vahdat, Jiaming Song, Qinsheng Zhang, Karsten Kreis, Miika Aittala, Timo Aila, Samuli Laine, Bryan Catanzaro, Tero Karras, Ming-Yu Liu
Therefore, in contrast to existing works, we propose to train an ensemble of text-to-image diffusion models specialized for different synthesis stages.
Ranked #14 on Text-to-Image Generation on MS COCO
1 code implementation • NeurIPS 2021 • Arash Vahdat, Karsten Kreis, Jan Kautz
Moving from data to latent space allows us to train more expressive generative models, apply SGMs to non-continuous data, and learn smoother SGMs in a smaller space, resulting in fewer network evaluations and faster sampling.
Ranked #3 on Image Generation on CIFAR-10 (FD metric)
2 code implementations • CVPR 2021 • Hongxu Yin, Arun Mallya, Arash Vahdat, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov
In this work, we introduce GradInversion, with which input images from a larger batch (8-48 images) can also be recovered for large networks such as ResNets (50 layers), on complex datasets such as ImageNet (1000 classes, 224x224 px).
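Why gradients leak inputs at all can be seen in closed form for a single fully-connected layer, where the weight gradient factors as an outer product with the input. The sketch below is a one-layer illustration, not the GradInversion algorithm itself (which optimizes dummy inputs to match gradients of deep networks):

```python
import numpy as np

# Toy setup: a linear layer y = W x + b with a quadratic loss on y.
rng = np.random.default_rng(0)
W, b = rng.normal(size=(4, 8)), rng.normal(size=4)
x_true = rng.normal(size=8)

delta = (W @ x_true + b) - 1.0        # dL/dy for L = 0.5 * ||y - 1||^2
grad_W = np.outer(delta, x_true)      # "leaked" gradient dL/dW = delta x^T
grad_b = delta                        # dL/db = delta
# Closed-form inversion: each row of dL/dW is the input scaled by dL/db.
i = int(np.argmax(np.abs(grad_b)))    # pick a row with a well-separated scale
x_recovered = grad_W[i] / grad_b[i]
print(np.allclose(x_recovered, x_true))  # True
```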
1 code implementation • 4 Dec 2023 • Ali Hatamizadeh, Jiaming Song, Guilin Liu, Jan Kautz, Arash Vahdat
In this paper, we study the effectiveness of ViTs in diffusion-based generative learning and propose a new model denoted as Diffusion Vision Transformers (DiffiT).
Ranked #4 on Image Generation on ImageNet 256x256
1 code implementation • 15 Jun 2023 • Hongkai Zheng, Weili Nie, Arash Vahdat, Anima Anandkumar
For masked training, we introduce an asymmetric encoder-decoder architecture consisting of a transformer encoder that operates only on unmasked patches and a lightweight transformer decoder on full patches.
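A rough NumPy sketch of that asymmetric split (shapes, the mask ratio, and the random-matrix "encoder" are illustrative stand-ins, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def split_patches(patches: np.ndarray, mask_ratio: float):
    """Randomly mask a fraction of patches; the encoder sees only the rest."""
    n = patches.shape[0]
    n_masked = int(n * mask_ratio)
    perm = rng.permutation(n)
    return perm[n_masked:], perm[:n_masked]   # visible indices, masked indices

# 196 patches of dimension 64, as for a 224x224 image with 16x16 patches.
patches = rng.normal(size=(196, 64))
visible_idx, masked_idx = split_patches(patches, mask_ratio=0.75)
# The (heavy) encoder processes only the visible 25% of patches ...
encoded = patches[visible_idx] @ rng.normal(size=(64, 64))  # encoder stand-in
# ... while a lightweight decoder operates on the full set, with a learned
# mask token (a constant here) scattered into the masked positions.
full = np.zeros((196, 64))
full[visible_idx] = encoded
full[masked_idx] = 1.0                                      # mask-token stand-in
print(len(visible_idx), full.shape)  # 49 (196, 64)
```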
2 code implementations • 16 May 2022 • Weili Nie, Brandon Guo, Yujia Huang, Chaowei Xiao, Arash Vahdat, Anima Anandkumar
Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model.
1 code implementation • 12 Feb 2023 • Guan-Horng Liu, Arash Vahdat, De-An Huang, Evangelos A. Theodorou, Weili Nie, Anima Anandkumar
We propose Image-to-Image Schrödinger Bridge (I$^2$SB), a new class of conditional diffusion models that directly learn the nonlinear diffusion processes between two given distributions.
2 code implementations • NeurIPS 2020 • Jeremy Bernstein, Arash Vahdat, Yisong Yue, Ming-Yu Liu
This paper relates parameter distance to gradient breakdown for a broad class of nonlinear compositional functions.
1 code implementation • ICLR 2022 • Tim Dockhorn, Arash Vahdat, Karsten Kreis
SGMs rely on a diffusion process that gradually perturbs the data towards a tractable distribution, while the generative model learns to denoise.
Ranked #24 on Image Generation on CIFAR-10
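The forward perturbation can be sketched with the standard variance-preserving Gaussian kernel (a generic illustration; the paper itself studies a critically-damped variant with auxiliary velocity variables, which is not shown here):

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb(x0: np.ndarray, alpha_bar: float) -> np.ndarray:
    """Variance-preserving kernel: q(x_t | x_0) = N(sqrt(a) x_0, (1 - a) I)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

x0 = rng.standard_normal(10_000)
# As alpha_bar -> 0 the data is pushed towards a standard normal, which is
# exactly the "tractable distribution" the snippet refers to.
xt = perturb(x0, alpha_bar=0.01)
print(round(float(xt.std()), 1))  # ~1.0: nearly pure noise
```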
1 code implementation • 9 Jul 2016 • Mostafa S. Ibrahim, Srikanth Muralidharan, Zhiwei Deng, Arash Vahdat, Greg Mori
In order to model both person-level and group-level dynamics, we present a 2-stage deep temporal model for the group activity recognition problem.
1 code implementation • CVPR 2016 • Mostafa S. Ibrahim, Srikanth Muralidharan, Zhiwei Deng, Arash Vahdat, Greg Mori
In group activity recognition, the temporal dynamics of the whole activity can be inferred based on the dynamics of the individual people representing the activity.
1 code implementation • 30 Sep 2022 • Zhuoran Qiao, Weili Nie, Arash Vahdat, Thomas F. Miller III, Anima Anandkumar
The binding complexes formed by proteins and small molecule ligands are ubiquitous and critical to life.
1 code implementation • CVPR 2022 • Hongxu Yin, Arash Vahdat, Jose Alvarez, Arun Mallya, Jan Kautz, Pavlo Molchanov
A-ViT achieves this by automatically reducing the number of tokens in vision transformers that are processed in the network as inference proceeds.
Ranked #34 on Efficient ViTs on ImageNet-1K (with DeiT-S)
1 code implementation • 11 Oct 2022 • Tim Dockhorn, Arash Vahdat, Karsten Kreis
Synthesis amounts to solving a differential equation (DE) defined by the learnt model.
Ranked #5 on Image Generation on AFHQV2
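"Solving a DE defined by the learnt model" can be made concrete with the simplest possible solver, fixed-step Euler (a baseline sketch; the paper's contribution is a higher-order solver, and the linear drift below is a toy stand-in for a learned one):

```python
import numpy as np

def euler_sample(x, drift, t0=0.0, t1=1.0, n_steps=100):
    """Integrate dx/dt = drift(x, t) from t0 to t1 with fixed-step Euler."""
    dt = (t1 - t0) / n_steps
    t = t0
    for _ in range(n_steps):
        x = x + drift(x, t) * dt
        t += dt
    return x

# Toy linear drift pulling the state towards the origin, standing in for the
# learned score-based drift of a generative ODE.
drift = lambda x, t: -x
x_end = euler_sample(np.full(5, 2.0), drift)
print(np.round(x_end, 2))  # ~2 * exp(-1) ≈ 0.73 per coordinate
```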
1 code implementation • ECCV 2020 • Tanmay Gupta, Arash Vahdat, Gal Chechik, Xiaodong Yang, Jan Kautz, Derek Hoiem
Given pairs of images and captions, we maximize compatibility of the attention-weighted regions and the words in the corresponding caption, compared to non-corresponding pairs of images and captions.
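The objective described above belongs to the family of symmetric contrastive losses that score corresponding image-caption pairs against non-corresponding ones. A simplified sketch (a deliberate reduction: pooled embeddings stand in for the attention-weighted regions and words, and the temperature is illustrative):

```python
import numpy as np

def contrastive_loss(img_emb: np.ndarray, txt_emb: np.ndarray, temperature=0.1):
    """InfoNCE-style loss: matching pairs (the diagonal) should out-score
    non-corresponding pairs (the off-diagonals)."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature            # all pairwise similarities
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -float(np.mean(np.diag(log_probs)))    # cross-entropy on diagonal

rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))
aligned = contrastive_loss(emb, emb)              # perfectly matching pairs
mismatched = contrastive_loss(emb, rng.normal(size=(8, 16)))
print(aligned < mismatched)  # True: alignment lowers the loss
```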
1 code implementation • NeurIPS 2021 • Weili Nie, Arash Vahdat, Anima Anandkumar
In compositional generation, our method excels at zero-shot generation of unseen attribute combinations.
1 code implementation • 18 Oct 2022 • Tim Dockhorn, Tianshi Cao, Arash Vahdat, Karsten Kreis
While modern machine learning models rely on increasingly large training datasets, data is often limited in privacy-sensitive domains.
1 code implementation • CVPR 2020 • Arash Vahdat, Arun Mallya, Ming-Yu Liu, Jan Kautz
Our framework brings the best of both worlds, and it enables us to search for architectures with both differentiable and non-differentiable criteria in one unified framework while maintaining a low search cost.
1 code implementation • ICLR 2021 • Zhisheng Xiao, Karsten Kreis, Jan Kautz, Arash Vahdat
VAEBM captures the overall mode structure of the data distribution using a state-of-the-art VAE and it relies on its EBM component to explicitly exclude non-data-like regions from the model and refine the image samples.
Ranked #1 on Image Generation on Stacked MNIST
1 code implementation • 7 May 2023 • Morteza Mardani, Jiaming Song, Jan Kautz, Arash Vahdat
To cope with this challenge, we propose a variational approach that by design seeks to approximate the true posterior distribution.
1 code implementation • CVPR 2023 • Paul Micaelli, Arash Vahdat, Hongxu Yin, Jan Kautz, Pavlo Molchanov
Our Landmark DEQ (LDEQ) achieves state-of-the-art performance on the challenging WFLW facial landmark dataset, reaching $3.92$ NME with fewer parameters and a training memory cost of $\mathcal{O}(1)$ in the number of recurrent modules.
Ranked #2 on Face Alignment on WFLW
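A deep equilibrium model's forward pass solves for a fixed point z* = f(z*, x) of a single layer instead of stacking many layers, which is why training memory can be O(1) in depth (gradients come from implicit differentiation rather than stored activations). A minimal sketch with a toy contractive layer (the architecture and landmark-specific parts of LDEQ are not shown):

```python
import numpy as np

def deq_forward(f, x, z0, tol=1e-8, max_iter=200):
    """Find a fixed point z* = f(z*, x) by simple forward iteration."""
    z = z0
    for _ in range(max_iter):
        z_next = f(z, x)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z

# A toy layer z <- tanh(A z + x); the small spectral norm of A makes it a
# contraction, so the fixed point exists and the iteration converges.
rng = np.random.default_rng(0)
A = 0.1 * rng.normal(size=(5, 5))
f = lambda z, x: np.tanh(A @ z + x)
x = rng.normal(size=5)
z_star = deq_forward(f, x, np.zeros(5))
print(np.allclose(z_star, f(z_star, x)))  # True: z* is a fixed point
```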
1 code implementation • NeurIPS 2017 • Arash Vahdat
Collecting large training datasets, annotated with high-quality labels, is costly and time-consuming.
1 code implementation • ICCV 2019 • Mehran Khodabandeh, Arash Vahdat, Mani Ranjbar, William G. Macready
To adapt to the domain shift, the model is trained on the target domain using a set of noisy object bounding boxes that are obtained by a detection model trained only in the source domain.
1 code implementation • 24 Nov 2022 • Hongkai Zheng, Weili Nie, Arash Vahdat, Kamyar Azizzadenesheli, Anima Anandkumar
Diffusion models have found widespread adoption in various areas.
1 code implementation • 1 Nov 2021 • Tianshi Cao, Alex Bie, Arash Vahdat, Sanja Fidler, Karsten Kreis
Generative models trained with privacy constraints on private data can sidestep this challenge, providing indirect access to private data instead.
3 code implementations • ICML 2020 • Arash Vahdat, Evgeny Andriyash, William G. Macready
We extend the class of posterior models that may be learned by using undirected graphical models.
no code implementations • ICML 2018 • Arash Vahdat, William G. Macready, Zhengbing Bian, Amir Khoshaman, Evgeny Andriyash
Training of discrete latent variable models remains challenging because passing gradient information through discrete units is difficult.
Ranked #53 on Image Generation on CIFAR-10 (bits/dimension metric)
no code implementations • NeurIPS 2018 • Arash Vahdat, Evgeny Andriyash, William G. Macready
Experiments on the MNIST and OMNIGLOT datasets show that these relaxations outperform previous discrete VAEs with Boltzmann priors.
no code implementations • CVPR 2016 • Zhiwei Deng, Arash Vahdat, Hexiang Hu, Greg Mori
As a concrete example, group activity recognition involves the interactions and relative spatial relations of a set of people in a scene.
Ranked #5 on Group Activity Recognition on Collective Activity
no code implementations • CVPR 2015 • Hossein Hajimirsadeghi, Wang Yan, Arash Vahdat, Greg Mori
Many visual recognition problems can be approached by counting instances.
no code implementations • 12 Feb 2015 • Mehran Khodabandeh, Arash Vahdat, Guang-Tong Zhou, Hossein Hajimirsadeghi, Mehrsan Javan Roshtkhari, Greg Mori, Stephen Se
We present a novel approach for discovering human interactions in videos.
no code implementations • 29 Sep 2018 • Evgeny Andriyash, Arash Vahdat, Bill Macready
In many applications we seek to maximize an expectation with respect to a distribution over discrete variables.
no code implementations • CVPR 2020 • Mostafa S. Ibrahim, Arash Vahdat, Mani Ranjbar, William G. Macready
Building a large image dataset with high-quality object masks for semantic segmentation is costly and time-consuming.
no code implementations • NeurIPS 2013 • Guang-Tong Zhou, Tian Lan, Arash Vahdat, Greg Mori
We present a maximum margin framework that clusters data using latent variables.
no code implementations • NeurIPS 2012 • Weilong Yang, Yang Wang, Arash Vahdat, Greg Mori
Latent SVMs (LSVMs) are a class of powerful tools that have been successfully applied to many applications in computer vision.
no code implementations • NeurIPS 2021 • Jyoti Aneja, Alexander Schwing, Jan Kautz, Arash Vahdat
To tackle this issue, we propose an energy-based prior defined by the product of a base prior distribution and a reweighting factor, designed to bring the base closer to the aggregate posterior.
Ranked #6 on Image Generation on CelebA 256x256 (FID metric)
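Sampling from a product prior p(z) ∝ p_base(z) · r(z) has a simple rejection-sampling illustration when the reweighting factor is bounded in [0, 1] (a generic sketch; the sigmoid below is an arbitrary stand-in, not the paper's learned reweighting factor):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_reweighted(base_sampler, reweight, n: int) -> np.ndarray:
    """Rejection-sample z ~ p(z) proportional to p_base(z) * r(z), 0 <= r <= 1."""
    out = []
    while len(out) < n:
        z = base_sampler()
        if rng.uniform() < reweight(z):  # accept with probability r(z)
            out.append(z)
    return np.array(out)

# N(0, 1) base prior reweighted towards positive values by a sigmoid factor.
base = lambda: float(rng.standard_normal())
r = lambda z: 1.0 / (1.0 + np.exp(-4.0 * z))
samples = sample_reweighted(base, r, 5000)
print(samples.mean() > 0)  # True: mass is shifted to the right
```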
no code implementations • 7 Jun 2021 • Sina Mohseni, Arash Vahdat, Jay Yadawa
In this paper, we propose a simple framework that leverages a shifting transformation learning setting for learning multiple shifted representations of the training set for improved OOD detection.
Ranked #9 on Anomaly Detection on Unlabeled CIFAR-10 vs CIFAR-100
no code implementations • 12 Jul 2021 • Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, Arash Vahdat
We analyze three popular network architectures: EfficientNetV1, EfficientNetV2 and ResNeSt, and achieve accuracy improvements for all models (up to $3.0\%$) when compressing larger models to the latency level of smaller models.
no code implementations • 29 Sep 2021 • Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, Arash Vahdat
In the second phase, it solves the combinatorial selection of efficient operations using a novel constrained integer linear optimization approach.
no code implementations • NeurIPS 2021 • Tianshi Cao, Alex Bie, Arash Vahdat, Sanja Fidler, Karsten Kreis
Generative models trained with privacy constraints on private data can sidestep this challenge, providing indirect access to private data instead.
no code implementations • 27 Sep 2018 • Evgeny Andriyash, Arash Vahdat, Bill Macready
In many applications we seek to optimize an expectation with respect to a distribution over discrete variables.
no code implementations • 28 Sep 2020 • Jyoti Aneja, Alex Schwing, Jan Kautz, Arash Vahdat
To tackle this issue, we propose an energy-based prior defined by the product of a base prior distribution and a reweighting factor, designed to bring the base closer to the aggregate posterior.
no code implementations • ICCV 2023 • Ye Yuan, Jiaming Song, Umar Iqbal, Arash Vahdat, Jan Kautz
Specifically, we propose a physics-based motion projection module that uses motion imitation in a physics simulator to project the denoised motion of a diffusion step to a physically-plausible motion.
no code implementations • 14 Feb 2023 • Jae Hyun Lim, Nikola B. Kovachki, Ricardo Baptista, Christopher Beckham, Kamyar Azizzadenesheli, Jean Kossaifi, Vikram Voleti, Jiaming Song, Karsten Kreis, Jan Kautz, Christopher Pal, Arash Vahdat, Anima Anandkumar
They consist of a forward process that perturbs input data with Gaussian white noise and a reverse process that learns a score function to generate samples by denoising.
no code implementations • 24 Sep 2023 • Morteza Mardani, Noah Brenowitz, Yair Cohen, Jaideep Pathak, Chieh-Yu Chen, Cheng-Chin Liu, Arash Vahdat, Karthik Kashinath, Jan Kautz, Mike Pritchard
Predictions of weather hazards require expensive km-scale simulations driven by coarser global inputs.
no code implementations • 6 Oct 2023 • Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri, Rao Kotamarthi, Venkatram Vishwanath, Arvind Ramanathan, Sam Foreman, Kyle Hippe, Troy Arcomano, Romit Maulik, Maxim Zvyagin, Alexander Brace, Bin Zhang, Cindy Orozco Bohorquez, Austin Clyde, Bharat Kale, Danilo Perez-Rivera, Heng Ma, Carla M. Mann, Michael Irvin, J. Gregory Pauloski, Logan Ward, Valerie Hayot, Murali Emani, Zhen Xie, Diangen Lin, Maulik Shukla, Ian Foster, James J. Davis, Michael E. Papka, Thomas Brettin, Prasanna Balaprakash, Gina Tourassi, John Gounley, Heidi Hanson, Thomas E Potok, Massimiliano Lupo Pasini, Kate Evans, Dan Lu, Dalton Lunga, Junqi Yin, Sajal Dash, Feiyi Wang, Mallikarjun Shankar, Isaac Lyngaas, Xiao Wang, Guojing Cong, Pei Zhang, Ming Fan, Siyan Liu, Adolfy Hoisie, Shinjae Yoo, Yihui Ren, William Tang, Kyle Felker, Alexey Svyatkovskiy, Hang Liu, Ashwin Aji, Angela Dalton, Michael Schulte, Karl Schulz, Yuntian Deng, Weili Nie, Josh Romero, Christian Dallago, Arash Vahdat, Chaowei Xiao, Thomas Gibbs, Anima Anandkumar, Rick Stevens
In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences.
no code implementations • 8 Jan 2024 • Dejia Xu, Ye Yuan, Morteza Mardani, Sifei Liu, Jiaming Song, Zhangyang Wang, Arash Vahdat
To overcome these challenges, we introduce an Amortized Generative 3D Gaussian framework (AGG) that instantly produces 3D Gaussians from a single image, eliminating the need for per-instance optimization.