no code implementations • ICCV 2023 • Siddharth Gururani, Arun Mallya, Ting-Chun Wang, Rafael Valle, Ming-Yu Liu
The method uses a multi-stage approach, combining the controllability of facial landmarks with the high-quality synthesis power of a pretrained face generator.
no code implementations • 4 Oct 2022 • Arun Mallya, Ting-Chun Wang, Ming-Yu Liu
We present a new implicit warping framework for image animation that uses a set of source images and transfers the motion of a driving video onto them.
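The core operation can be viewed as cross-attention in which driving-frame queries attend over features from the whole set of source images at once. A minimal sketch of that idea (shapes, the shared key/value tensor, and naming are illustrative assumptions, not the paper's architecture):

```python
import torch
import torch.nn.functional as F

def implicit_warp(driving_q, source_kv):
    """One cross-attention 'warp': each driving-frame query attends over
    features pooled from ALL source images at once, so no explicit
    per-source flow field is needed. Shapes and naming are illustrative.
      driving_q: (B, Nq, D)    source_kv: (B, Ns, D)"""
    scale = driving_q.shape[-1] ** 0.5
    attn = F.softmax(driving_q @ source_kv.transpose(1, 2) / scale, dim=-1)
    return attn @ source_kv  # (B, Nq, D) warped features
```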
1 code implementation • CVPR 2022 • Hongxu Yin, Arash Vahdat, Jose Alvarez, Arun Mallya, Jan Kautz, Pavlo Molchanov
A-ViT adaptively reduces inference cost by automatically halting tokens in vision transformers, so that fewer tokens are processed as computation proceeds through the network; a minimal sketch follows below.
Ranked #34 on Efficient ViTs on ImageNet-1K (with DeiT-S)
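An ACT-style sketch of per-token halting (the separate halting head, threshold, and layer configuration here are assumptions; the paper derives halting scores differently and actually removes halted tokens to realize the speedup):

```python
import torch
import torch.nn as nn

class HaltingTokenEncoder(nn.Module):
    """ACT-style per-token halting, loosely in the spirit of A-ViT."""
    def __init__(self, dim=192, depth=12, num_heads=3, eps=0.01):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, num_heads, dim * 4, batch_first=True)
            for _ in range(depth))
        self.halt = nn.Linear(dim, 1)  # per-token halting-score head
        self.eps = eps

    def forward(self, tokens):  # tokens: (B, N, D)
        cum = torch.zeros(tokens.shape[:2], device=tokens.device)
        active = torch.ones_like(cum, dtype=torch.bool)
        for layer in self.layers:
            tokens = layer(tokens)
            h = torch.sigmoid(self.halt(tokens)).squeeze(-1)  # (B, N)
            cum = cum + h * active                # accumulate only active tokens
            active = active & (cum < 1.0 - self.eps)
            # Zeroing only illustrates halting; a real implementation drops
            # the halted tokens to actually save computation.
            tokens = tokens * active.unsqueeze(-1)
        return tokens
```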
no code implementations • 9 Dec 2021 • Xun Huang, Arun Mallya, Ting-Chun Wang, Ming-Yu Liu
Existing conditional image synthesis frameworks generate images based on user inputs in a single modality, such as text, segmentation, sketch, or style reference.
1 code implementation • CVPR 2021 • Hongxu Yin, Arun Mallya, Arash Vahdat, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov
In this work, we introduce GradInversion, which recovers input images from averaged gradients even for large batches (8-48 images), deep networks such as 50-layer ResNets, and complex datasets such as ImageNet (1000 classes, 224x224 px).
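The basic recipe behind such gradient-based recovery is to optimize dummy inputs so that their gradients match the observed ones, under an image prior. A sketch with simplifying assumptions (labels known, tiny inputs, a crude smoothness prior; the paper recovers labels from the gradients and adds batch-norm and group-consistency regularizers):

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(num_classes=10)
criterion = torch.nn.CrossEntropyLoss()

# Gradients observed from a (hypothetical) victim batch.
x_true = torch.randn(4, 3, 32, 32)
y_true = torch.randint(0, 10, (4,))
true_grads = torch.autograd.grad(
    criterion(model(x_true), y_true), model.parameters())

# Recover the batch by matching gradients of a dummy input.
x_hat = torch.randn(4, 3, 32, 32, requires_grad=True)
opt = torch.optim.Adam([x_hat], lr=0.1)
for step in range(2000):
    opt.zero_grad()
    grads = torch.autograd.grad(
        criterion(model(x_hat), y_true), model.parameters(), create_graph=True)
    loss = sum(F.mse_loss(g, t) for g, t in zip(grads, true_grads))
    loss = loss + 1e-4 * x_hat.diff(dim=-1).abs().mean()  # crude smoothness prior
    loss.backward()
    opt.step()
```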
no code implementations • ICCV 2021 • Zekun Hao, Arun Mallya, Serge Belongie, Ming-Yu Liu
We represent the world as a continuous volumetric function and train our model to render view-consistent photorealistic images for a user-controlled camera.
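Rendering a continuous volumetric function typically means compositing sampled densities and colors along camera rays. A standard NeRF-style compositing step for one ray, as a generic illustration rather than the paper's exact renderer:

```python
import torch

def composite_ray(sigmas, colors, deltas):
    """Volume-rendering compositing along one ray.
      sigmas: (S,) densities   colors: (S, 3)   deltas: (S,) sample spacing"""
    alphas = 1.0 - torch.exp(-sigmas * deltas)          # opacity per sample
    trans = torch.cumprod(1.0 - alphas + 1e-10, dim=0)  # transmittance
    trans = torch.cat([torch.ones(1), trans[:-1]])      # light reaching each sample
    weights = alphas * trans
    return (weights.unsqueeze(-1) * colors).sum(dim=0)  # composited RGB
```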
2 code implementations • CVPR 2021 • Ting-Chun Wang, Arun Mallya, Ming-Yu Liu
We propose a neural talking-head video synthesis model and demonstrate its application to video conferencing.
no code implementations • 6 Aug 2020 • Ming-Yu Liu, Xun Huang, Jiahui Yu, Ting-Chun Wang, Arun Mallya
The generative adversarial network (GAN) framework has emerged as a powerful tool for various image and video synthesis tasks, allowing the synthesis of visual content in an unconditional or input-conditional manner.
no code implementations • ECCV 2020 • Arun Mallya, Ting-Chun Wang, Karan Sapra, Ming-Yu Liu
Existing video synthesis methods fail to maintain long-term consistency because they lack knowledge of the 3D world being rendered and generate each frame based only on the past few frames.
2 code implementations • CVPR 2020 • Hongxu Yin, Pavlo Molchanov, Zhizhong Li, Jose M. Alvarez, Arun Mallya, Derek Hoiem, Niraj K. Jha, Jan Kautz
We introduce DeepInversion, a new method for synthesizing images from the image distribution used to train a deep neural network.
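The key idea is to optimize random inputs so that feature statistics inside the trained network match the batch-norm running statistics it stored during training, alongside a classification objective. A minimal sketch (loss weights, resolution, and omitted priors such as total variation are simplifications):

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

model = resnet18(weights=None).eval()  # stands in for the trained model

# Penalize mismatch between batch statistics and BN running statistics.
bn_losses = []
def bn_hook(module, inputs, output):
    x = inputs[0]
    mean = x.mean(dim=(0, 2, 3))
    var = x.var(dim=(0, 2, 3), unbiased=False)
    bn_losses.append(((mean - module.running_mean) ** 2).sum()
                     + ((var - module.running_var) ** 2).sum())

for m in model.modules():
    if isinstance(m, nn.BatchNorm2d):
        m.register_forward_hook(bn_hook)

x = torch.randn(8, 3, 224, 224, requires_grad=True)
targets = torch.randint(0, 1000, (8,))
opt = torch.optim.Adam([x], lr=0.05)
for step in range(1000):
    bn_losses.clear()
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x), targets) + 0.01 * sum(bn_losses)
    loss.backward()
    opt.step()
```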
1 code implementation • CVPR 2020 • Arash Vahdat, Arun Mallya, Ming-Yu Liu, Jan Kautz
Our framework brings together the best of both worlds: it can search for architectures using both differentiable and non-differentiable criteria in one unified framework while maintaining a low search cost.
3 code implementations • CVPR 2019 • Pavlo Molchanov, Arun Mallya, Stephen Tyree, Iuri Frosio, Jan Kautz
On ResNet-101, we achieve a 40% FLOPs reduction by removing 30% of the parameters, with a loss of 0.02% in top-1 accuracy on ImageNet.
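The importance criterion can be sketched as a first-order Taylor estimate of the loss change when a filter is removed, computed from the product of gradient and weight:

```python
import torch

def taylor_importance(model):
    """Squared (gradient * weight) summed per conv filter, a first-order
    Taylor estimate of the loss change if the filter were removed. A sketch:
    the paper averages over mini-batches and normalizes within each layer."""
    scores = {}
    for name, p in model.named_parameters():
        if p.grad is not None and p.dim() == 4:  # conv weights: (O, I, kH, kW)
            scores[name] = ((p.grad * p) ** 2).sum(dim=(1, 2, 3))
    return scores
```

Usage: run a forward/backward pass on a mini-batch, then prune the filters with the smallest scores returned by taylor_importance(model).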
no code implementations • 28 May 2019 • Zih-Siou Hung, Arun Mallya, Svetlana Lazebnik
The earlier VTransE model maps entities and predicates into a low-dimensional embedding space in which a predicate is interpreted as a translation vector between the embedded features of the subject and object bounding-box regions.
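In code, the translation-embedding idea amounts to learning one vector per predicate such that subject + predicate ≈ object in the embedding space. A sketch with illustrative dimensions:

```python
import torch
import torch.nn as nn

class TransEScorer(nn.Module):
    """Predicate-as-translation scoring: project region features into a
    low-dimensional space and ask that subject + predicate ~ object.
    Feature and embedding dimensions are illustrative assumptions."""
    def __init__(self, feat_dim=4096, emb_dim=128, num_predicates=70):
        super().__init__()
        self.proj = nn.Linear(feat_dim, emb_dim)           # region feature -> embedding
        self.pred = nn.Embedding(num_predicates, emb_dim)  # one translation per predicate

    def forward(self, subj_feat, obj_feat, predicate_ids):
        s, o = self.proj(subj_feat), self.proj(obj_feat)
        t = self.pred(predicate_ids)
        return -(s + t - o).norm(dim=-1)  # higher = more plausible triple
```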
10 code implementations • ICCV 2019 • Ming-Yu Liu, Xun Huang, Arun Mallya, Tero Karras, Timo Aila, Jaakko Lehtinen, Jan Kautz
Unsupervised image-to-image translation methods learn to map images in a given class to an analogous image in a different class, drawing on unstructured (non-registered) datasets of images.
1 code implementation • ECCV 2018 • Arun Mallya, Dillon Davis, Svetlana Lazebnik
This work presents a method for adapting a single, fixed deep neural network to multiple tasks without affecting performance on already learned tasks.
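The masking idea: freeze the backbone weights and learn a per-task real-valued mask that is binarized in the forward pass, with gradients passed straight through. A sketch for a linear layer (the threshold and mask initialization are assumptions):

```python
import torch
import torch.nn as nn

class PiggybackLinear(nn.Module):
    """Frozen pretrained weight plus a learned per-task mask, binarized by
    thresholding with a straight-through gradient."""
    def __init__(self, weight, threshold=5e-3):
        super().__init__()
        self.weight = nn.Parameter(weight, requires_grad=False)  # frozen backbone
        self.mask_real = nn.Parameter(torch.full_like(weight, 1e-2))
        self.threshold = threshold

    def forward(self, x):
        hard = (self.mask_real > self.threshold).float()
        # Straight-through: forward uses the hard mask, gradients flow to mask_real.
        mask = hard + self.mask_real - self.mask_real.detach()
        return x @ (self.weight * mask).t()
```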
4 code implementations • CVPR 2018 • Arun Mallya, Svetlana Lazebnik
This paper presents a method for adding multiple tasks to a single deep neural network while avoiding catastrophic forgetting.
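The packing idea in sketch form: after training a task on the currently free weights, keep only the largest-magnitude fraction for that task and release the rest for future tasks (the real method prunes layer by layer and briefly retrains after pruning):

```python
import torch

def packnet_keep_mask(weight, free_mask, prune_frac=0.75):
    """Rank the currently free weights by magnitude, release the smallest
    `prune_frac` of them for future tasks, and return the mask of weights
    to freeze for the task just trained.
      weight:    any weight tensor
      free_mask: bool tensor, True where the weight is not yet claimed"""
    free_scores = weight.abs()[free_mask]
    k = int(prune_frac * free_scores.numel())
    if k == 0:
        return free_mask.clone()
    cutoff = free_scores.kthvalue(k).values     # k-th smallest magnitude
    return free_mask & (weight.abs() > cutoff)  # survivors get frozen
```

Weights outside the returned mask are zeroed and stay trainable for the next task, while weights inside it are frozen so performance on earlier tasks is unaffected.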
no code implementations • ICCV 2017 • Arun Mallya, Svetlana Lazebnik
This work proposes Recurrent Neural Network (RNN) models to predict structured 'image situations' -- actions and noun entities fulfilling semantic roles related to the action.
Ranked #10 on Grounded Situation Recognition on SWiG
Tasks: Grounded Situation Recognition, Human-Object Interaction Detection, +1
1 code implementation • ICCV 2017 • Bryan A. Plummer, Arun Mallya, Christopher M. Cervantes, Julia Hockenmaier, Svetlana Lazebnik
This paper presents a framework for localization or grounding of phrases in images using a large collection of linguistic and visual cues.
no code implementations • 1 Nov 2016 • Tatiana Tommasi, Arun Mallya, Bryan Plummer, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg
This paper presents an approach for answering fill-in-the-blank multiple choice questions from the Visual Madlibs dataset.
no code implementations • 11 Aug 2016 • Tatiana Tommasi, Arun Mallya, Bryan Plummer, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg
This paper focuses on answering fill-in-the-blank style multiple choice questions from the Visual Madlibs dataset.
no code implementations • 16 Apr 2016 • Arun Mallya, Svetlana Lazebnik
This paper proposes deep convolutional network models that utilize local and global context to make human activity label predictions in still images, achieving state-of-the-art performance on two recent datasets with hundreds of labels each.
Ranked #6 on Human-Object Interaction Detection on HICO
Tasks: General Classification, Human-Object Interaction Detection, +4
no code implementations • ICCV 2015 • Arun Mallya, Svetlana Lazebnik
We learn to predict 'informative edge' probability maps using two recent methods that exploit local and global context, respectively: structured edge detection forests, and a fully convolutional network for pixelwise labeling.
no code implementations • 22 Jul 2015 • Kevin J. Shih, Arun Mallya, Saurabh Singh, Derek Hoiem
We present a simple deep learning framework that simultaneously predicts keypoint locations and their visibilities, and uses these predictions to achieve state-of-the-art performance on fine-grained classification.
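The joint formulation is essentially a two-headed prediction on a shared feature: one head regresses (x, y) locations, the other outputs per-keypoint visibility logits. A sketch with illustrative dimensions:

```python
import torch.nn as nn

class KeypointHead(nn.Module):
    """Shared feature -> keypoint (x, y) locations plus per-keypoint
    visibility logits. Dimensions are illustrative assumptions."""
    def __init__(self, feat_dim=2048, num_kp=15):
        super().__init__()
        self.loc = nn.Linear(feat_dim, num_kp * 2)  # (x, y) per keypoint
        self.vis = nn.Linear(feat_dim, num_kp)      # visibility logit per keypoint

    def forward(self, feats):  # feats: (B, feat_dim)
        locs = self.loc(feats).view(feats.shape[0], -1, 2)
        return locs, self.vis(feats)
```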
no code implementations • 19 Feb 2015 • Ming-Yu Liu, Arun Mallya, Oncel C. Tuzel, Xi Chen
Our idea is to pretrain the network on the task of replicating hand-designed feature extraction.
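A minimal sketch of pretraining-by-replication: before any labels are used, the network is trained to regress a hand-designed descriptor of its input (the `hand_features` stand-in below is hypothetical; the actual target would be something like a HOG descriptor):

```python
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv2d(1, 16, 5, padding=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(4), nn.Flatten(),
    nn.Linear(16 * 16, 36))
opt = torch.optim.SGD(net.parameters(), lr=0.01)

def hand_features(x):              # hypothetical stand-in for e.g. HOG extraction
    return x.flatten(1)[:, :36]

for step in range(100):
    x = torch.rand(32, 1, 28, 28)  # unlabeled images
    loss = nn.functional.mse_loss(net(x), hand_features(x))
    opt.zero_grad()
    loss.backward()
    opt.step()
```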