Search Results for author: Arash Vahdat

Found 51 papers, 29 papers with code

AGG: Amortized Generative 3D Gaussians for Single Image to 3D

no code implementations 8 Jan 2024 Dejia Xu, Ye Yuan, Morteza Mardani, Sifei Liu, Jiaming Song, Zhangyang Wang, Arash Vahdat

To overcome these challenges, we introduce an Amortized Generative 3D Gaussian framework (AGG) that instantly produces 3D Gaussians from a single image, eliminating the need for per-instance optimization.

3D Reconstruction Image to 3D +1

DiffiT: Diffusion Vision Transformers for Image Generation

1 code implementation 4 Dec 2023 Ali Hatamizadeh, Jiaming Song, Guilin Liu, Jan Kautz, Arash Vahdat

We also introduce latent DiffiT, which consists of a transformer model with the proposed self-attention layers, for high-resolution image generation.

Denoising Image Generation

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

no code implementations 6 Oct 2023 Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri, Rao Kotamarthi, Venkatram Vishwanath, Arvind Ramanathan, Sam Foreman, Kyle Hippe, Troy Arcomano, Romit Maulik, Maxim Zvyagin, Alexander Brace, Bin Zhang, Cindy Orozco Bohorquez, Austin Clyde, Bharat Kale, Danilo Perez-Rivera, Heng Ma, Carla M. Mann, Michael Irvin, J. Gregory Pauloski, Logan Ward, Valerie Hayot, Murali Emani, Zhen Xie, Diangen Lin, Maulik Shukla, Ian Foster, James J. Davis, Michael E. Papka, Thomas Brettin, Prasanna Balaprakash, Gina Tourassi, John Gounley, Heidi Hanson, Thomas E Potok, Massimiliano Lupo Pasini, Kate Evans, Dan Lu, Dalton Lunga, Junqi Yin, Sajal Dash, Feiyi Wang, Mallikarjun Shankar, Isaac Lyngaas, Xiao Wang, Guojing Cong, Pei Zhang, Ming Fan, Siyan Liu, Adolfy Hoisie, Shinjae Yoo, Yihui Ren, William Tang, Kyle Felker, Alexey Svyatkovskiy, Hang Liu, Ashwin Aji, Angela Dalton, Michael Schulte, Karl Schulz, Yuntian Deng, Weili Nie, Josh Romero, Christian Dallago, Arash Vahdat, Chaowei Xiao, Thomas Gibbs, Anima Anandkumar, Rick Stevens

In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences.

Fast Training of Diffusion Models with Masked Transformers

1 code implementation 15 Jun 2023 Hongkai Zheng, Weili Nie, Arash Vahdat, Anima Anandkumar

For masked training, we introduce an asymmetric encoder-decoder architecture consisting of a transformer encoder that operates only on unmasked patches and a lightweight transformer decoder on full patches.

Denoising Representation Learning

A Variational Perspective on Solving Inverse Problems with Diffusion Models

1 code implementation 7 May 2023 Morteza Mardani, Jiaming Song, Jan Kautz, Arash Vahdat

To cope with this challenge, we propose a variational approach that by design seeks to approximate the true posterior distribution.

Denoising Image Restoration +1

Recurrence without Recurrence: Stable Video Landmark Detection with Deep Equilibrium Models

1 code implementation CVPR 2023 Paul Micaelli, Arash Vahdat, Hongxu Yin, Jan Kautz, Pavlo Molchanov

Our Landmark DEQ (LDEQ) achieves state-of-the-art performance on the challenging WFLW facial landmark dataset, reaching $3.92$ NME with fewer parameters and a training memory cost of $\mathcal{O}(1)$ in the number of recurrent modules.

Face Alignment

Score-based Diffusion Models in Function Space

no code implementations 14 Feb 2023 Jae Hyun Lim, Nikola B. Kovachki, Ricardo Baptista, Christopher Beckham, Kamyar Azizzadenesheli, Jean Kossaifi, Vikram Voleti, Jiaming Song, Karsten Kreis, Jan Kautz, Christopher Pal, Arash Vahdat, Anima Anandkumar

They consist of a forward process that perturbs input data with Gaussian white noise and a reverse process that learns a score function to generate samples by denoising.

Denoising

I$^2$SB: Image-to-Image Schrödinger Bridge

1 code implementation 12 Feb 2023 Guan-Horng Liu, Arash Vahdat, De-An Huang, Evangelos A. Theodorou, Weili Nie, Anima Anandkumar

We propose Image-to-Image Schrödinger Bridge (I$^2$SB), a new class of conditional diffusion models that directly learn the nonlinear diffusion processes between two given distributions.

Deblurring Image Restoration +1

PhysDiff: Physics-Guided Human Motion Diffusion Model

no code implementations ICCV 2023 Ye Yuan, Jiaming Song, Umar Iqbal, Arash Vahdat, Jan Kautz

Specifically, we propose a physics-based motion projection module that uses motion imitation in a physics simulator to project the denoised motion of a diffusion step to a physically-plausible motion.

Denoising

Differentially Private Diffusion Models

1 code implementation 18 Oct 2022 Tim Dockhorn, Tianshi Cao, Arash Vahdat, Karsten Kreis

While modern machine learning models rely on increasingly large training datasets, data is often limited in privacy-sensitive domains.

Image Generation

LION: Latent Point Diffusion Models for 3D Shape Generation

2 code implementations 12 Oct 2022 Xiaohui Zeng, Arash Vahdat, Francis Williams, Zan Gojcic, Or Litany, Sanja Fidler, Karsten Kreis

To advance 3D DDMs and make them useful for digital artists, we require (i) high generation quality, (ii) flexibility for manipulation and applications such as conditional synthesis and shape interpolation, and (iii) the ability to output smooth surfaces or meshes.

3D Shape Generation Denoising +2

GENIE: Higher-Order Denoising Diffusion Solvers

1 code implementation 11 Oct 2022 Tim Dockhorn, Arash Vahdat, Karsten Kreis

Synthesis amounts to solving a differential equation (DE) defined by the learnt model.

Denoising Image Generation

Diffusion Models for Adversarial Purification

2 code implementations 16 May 2022 Weili Nie, Brandon Guo, Yujia Huang, Chaowei Xiao, Arash Vahdat, Anima Anandkumar

Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model.

Tackling the Generative Learning Trilemma with Denoising Diffusion GANs

5 code implementations ICLR 2022 Zhisheng Xiao, Karsten Kreis, Arash Vahdat

To the best of our knowledge, denoising diffusion GAN is the first model that reduces sampling cost in diffusion models to an extent that allows them to be applied to real-world applications inexpensively.

Image Generation

Score-Based Generative Modeling with Critically-Damped Langevin Diffusion

1 code implementation ICLR 2022 Tim Dockhorn, Arash Vahdat, Karsten Kreis

SGMs rely on a diffusion process that gradually perturbs the data towards a tractable distribution, while the generative model learns to denoise.

Image Generation

AdaViT: Adaptive Tokens for Efficient Vision Transformer

1 code implementation CVPR 2022 Hongxu Yin, Arash Vahdat, Jose Alvarez, Arun Mallya, Jan Kautz, Pavlo Molchanov

A-ViT achieves this by automatically reducing the number of tokens in vision transformers that are processed in the network as inference proceeds.

Image Classification

Don’t Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence

no code implementations NeurIPS 2021 Tianshi Cao, Alex Bie, Arash Vahdat, Sanja Fidler, Karsten Kreis

Generative models trained with privacy constraints on private data can sidestep this challenge, providing indirect access to private data instead.

Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence

1 code implementation 1 Nov 2021 Tianshi Cao, Alex Bie, Arash Vahdat, Sanja Fidler, Karsten Kreis

Generative models trained with privacy constraints on private data can sidestep this challenge, providing indirect access to private data instead.

Hardware-Aware Network Transformation

no code implementations 29 Sep 2021 Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, Arash Vahdat

In the second phase, it solves the combinatorial selection of efficient operations using a novel constrained integer linear optimization approach.

Neural Architecture Search

LANA: Latency Aware Network Acceleration

no code implementations 12 Jul 2021 Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, Arash Vahdat

We analyze three popular network architectures: EfficientNetV1, EfficientNetV2 and ResNeST, and achieve accuracy improvement for all models (up to $3.0\%$) when compressing larger models to the latency level of smaller models.

Neural Architecture Search Quantization

Score-based Generative Modeling in Latent Space

1 code implementation NeurIPS 2021 Arash Vahdat, Karsten Kreis, Jan Kautz

Moving from data to latent space allows us to train more expressive generative models, apply SGMs to non-continuous data, and learn smoother SGMs in a smaller space, resulting in fewer network evaluations and faster sampling.

Ranked #2 on Image Generation on CIFAR-10 (FD metric)

Image Generation

Shifting Transformation Learning for Out-of-Distribution Detection

no code implementations 7 Jun 2021 Sina Mohseni, Arash Vahdat, Jay Yadawa

In this paper, we propose a simple framework that leverages a shifting transformation learning setting for learning multiple shifted representations of the training set for improved OOD detection.

Anomaly Detection Contrastive Learning +3

See through Gradients: Image Batch Recovery via GradInversion

2 code implementations CVPR 2021 Hongxu Yin, Arun Mallya, Arash Vahdat, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov

In this work, we introduce GradInversion, using which input images from a larger batch (8 - 48 images) can also be recovered for large networks such as ResNets (50 layers), on complex datasets such as ImageNet (1000 classes, 224x224 px).

Federated Learning Inference Attack +1

A Contrastive Learning Approach for Training Variational Autoencoder Priors

no code implementations NeurIPS 2021 Jyoti Aneja, Alexander Schwing, Jan Kautz, Arash Vahdat

To tackle this issue, we propose an energy-based prior defined by the product of a base prior distribution and a reweighting factor, designed to bring the base closer to the aggregate posterior.

Ranked #6 on Image Generation on CelebA 256x256 (FID metric)

Contrastive Learning Image Generation

VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models

1 code implementation ICLR 2021 Zhisheng Xiao, Karsten Kreis, Jan Kautz, Arash Vahdat

VAEBM captures the overall mode structure of the data distribution using a state-of-the-art VAE and it relies on its EBM component to explicitly exclude non-data-like regions from the model and refine the image samples.

Image Generation Out-of-Distribution Detection

NCP-VAE: Variational Autoencoders with Noise Contrastive Priors

no code implementations 28 Sep 2020 Jyoti Aneja, Alex Schwing, Jan Kautz, Arash Vahdat

To tackle this issue, we propose an energy-based prior defined by the product of a base prior distribution and a reweighting factor, designed to bring the base closer to the aggregate posterior.

NVAE: A Deep Hierarchical Variational Autoencoder

8 code implementations NeurIPS 2020 Arash Vahdat, Jan Kautz

For example, on CIFAR-10, NVAE pushes the state-of-the-art from 2.98 to 2.91 bits per dimension, and it produces high-quality images on CelebA HQ.

Ranked #3 on Image Generation on FFHQ 256 x 256 (bits/dimension metric)

Image Generation

Contrastive Learning for Weakly Supervised Phrase Grounding

1 code implementation ECCV 2020 Tanmay Gupta, Arash Vahdat, Gal Chechik, Xiaodong Yang, Jan Kautz, Derek Hoiem

Given pairs of images and captions, we maximize compatibility of the attention-weighted regions and the words in the corresponding caption, compared to non-corresponding pairs of images and captions.

Contrastive Learning Language Modelling +1

On the distance between two neural networks and the stability of learning

2 code implementations NeurIPS 2020 Jeremy Bernstein, Arash Vahdat, Yisong Yue, Ming-Yu Liu

This paper relates parameter distance to gradient breakdown for a broad class of nonlinear compositional functions.

LEMMA

UNAS: Differentiable Architecture Search Meets Reinforcement Learning

1 code implementation CVPR 2020 Arash Vahdat, Arun Mallya, Ming-Yu Liu, Jan Kautz

Our framework brings the best of both worlds, and it enables us to search for architectures with both differentiable and non-differentiable criteria in one unified framework while maintaining a low search cost.

Neural Architecture Search reinforcement-learning +1

A Robust Learning Approach to Domain Adaptive Object Detection

1 code implementation ICCV 2019 Mehran Khodabandeh, Arash Vahdat, Mani Ranjbar, William G. Macready

To adapt to the domain shift, the model is trained on the target domain using a set of noisy object bounding boxes that are obtained by a detection model trained only in the source domain.

Domain Adaptation Object +3

Improved Gradient-Based Optimization Over Discrete Distributions

no code implementations 29 Sep 2018 Evgeny Andriyash, Arash Vahdat, Bill Macready

In many applications we seek to maximize an expectation with respect to a distribution over discrete variables.

Variational Inference

Improved Gradient Estimators for Stochastic Discrete Variables

no code implementations 27 Sep 2018 Evgeny Andriyash, Arash Vahdat, Bill Macready

In many applications we seek to optimize an expectation with respect to a distribution over discrete variables.

Variational Inference

DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors

no code implementations NeurIPS 2018 Arash Vahdat, Evgeny Andriyash, William G. Macready

Experiments on the MNIST and OMNIGLOT datasets show that these relaxations outperform previous discrete VAEs with Boltzmann priors.

DVAE++: Discrete Variational Autoencoders with Overlapping Transformations

no code implementations ICML 2018 Arash Vahdat, William G. Macready, Zhengbing Bian, Amir Khoshaman, Evgeny Andriyash

Training of discrete latent variable models remains challenging because passing gradient information through discrete units is difficult.

Ranked #53 on Image Generation on CIFAR-10 (bits/dimension metric)

Image Generation

Toward Robustness against Label Noise in Training Deep Discriminative Neural Networks

1 code implementation NeurIPS 2017 Arash Vahdat

Collecting large training datasets, annotated with high-quality labels, is costly and time-consuming.

Hierarchical Deep Temporal Models for Group Activity Recognition

1 code implementation 9 Jul 2016 Mostafa S. Ibrahim, Srikanth Muralidharan, Zhiwei Deng, Arash Vahdat, Greg Mori

In order to model both person-level and group-level dynamics, we present a 2-stage deep temporal model for the group activity recognition problem.

Group Activity Recognition

A Hierarchical Deep Temporal Model for Group Activity Recognition

1 code implementation CVPR 2016 Moustafa Ibrahim, Srikanth Muralidharan, Zhiwei Deng, Arash Vahdat, Greg Mori

In group activity recognition, the temporal dynamics of the whole activity can be inferred based on the dynamics of the individual people representing the activity.

Group Activity Recognition

Kernel Latent SVM for Visual Recognition

no code implementations NeurIPS 2012 Weilong Yang, Yang Wang, Arash Vahdat, Greg Mori

Latent SVMs (LSVMs) are a class of powerful tools that have been successfully applied to many applications in computer vision.
