Search Results for author: Varun Jampani

Found 107 papers, 55 papers with code

The Informed Sampler: A Discriminative Approach to Bayesian Inference in Generative Computer Vision Models

1 code implementation 4 Feb 2014 Varun Jampani, Sebastian Nowozin, Matthew Loper, Peter V. Gehler

Computer vision is hard because of a large variability in lighting, shape, and texture; in addition, the image signal is non-additive due to occlusion.

Bayesian Inference

Consensus Message Passing for Layered Graphical Models

no code implementations 27 Oct 2014 Varun Jampani, S. M. Ali Eslami, Daniel Tarlow, Pushmeet Kohli, John Winn

Generative models provide a powerful framework for probabilistic reasoning.

Permutohedral Lattice CNNs

no code implementations 20 Dec 2014 Martin Kiefel, Varun Jampani, Peter V. Gehler

This paper presents a convolutional layer that is able to process sparse input features.

Position

Superpixel Convolutional Networks using Bilateral Inceptions

1 code implementation 20 Nov 2015 Raghudeep Gadde, Varun Jampani, Martin Kiefel, Daniel Kappler, Peter V. Gehler

We introduce a new 'bilateral inception' module that can be inserted in existing CNN architectures and performs bilateral filtering, at multiple feature-scales, between superpixels in an image.

Image Segmentation Segmentation +2

Efficient 2D and 3D Facade Segmentation using Auto-Context

no code implementations 21 Jun 2016 Raghudeep Gadde, Varun Jampani, Renaud Marlet, Peter V. Gehler

This paper introduces a fast and efficient segmentation technique for 2D images and 3D point clouds of building facades.

Segmentation

Semantic Video CNNs through Representation Warping

1 code implementation ICCV 2017 Raghudeep Gadde, Varun Jampani, Peter V. Gehler

A key insight of this work is that fast optical flow methods can be combined with many different CNN architectures for improved performance and end-to-end training.

Optical Flow Estimation Semantic Segmentation

Learning Inference Models for Computer Vision

no code implementations 31 Aug 2017 Varun Jampani

We propose inference techniques for both generative and discriminative vision models.

Bayesian Inference

On the Integration of Optical Flow and Action Recognition

no code implementations 22 Dec 2017 Laura Sevilla-Lara, Yiyi Liao, Fatma Guney, Varun Jampani, Andreas Geiger, Michael J. Black

Here we take a deeper look at the combination of flow and action recognition, and investigate why optical flow is helpful, what makes a flow method good for action recognition, and how we can make it better.

Action Recognition Optical Flow Estimation +1

SPLATNet: Sparse Lattice Networks for Point Cloud Processing

2 code implementations CVPR 2018 Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz

We present a network architecture for processing point clouds that directly operates on a collection of points represented as a sparse set of samples in a high-dimensional lattice.

3D Part Segmentation 3D Semantic Segmentation

Switchable Temporal Propagation Network

1 code implementation ECCV 2018 Sifei Liu, Guangyu Zhong, Shalini De Mello, Jinwei Gu, Varun Jampani, Ming-Hsuan Yang, Jan Kautz

Our approach is based on a temporal propagation network (TPN), which models the transition-related affinity between a pair of frames in a purely data-driven manner.

Video Compression

Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation

1 code implementation CVPR 2019 Anurag Ranjan, Varun Jampani, Lukas Balles, Kihwan Kim, Deqing Sun, Jonas Wulff, Michael J. Black

We address the unsupervised learning of several interconnected problems in low-level vision: single view depth prediction, camera motion estimation, optical flow, and segmentation of a video into the static scene and moving regions.

Depth Prediction Monocular Depth Estimation +3

Superpixel Sampling Networks

2 code implementations ECCV 2018 Varun Jampani, Deqing Sun, Ming-Yu Liu, Ming-Hsuan Yang, Jan Kautz

Superpixels provide an efficient low/mid-level representation of image data, which greatly reduces the number of image primitives for subsequent vision tasks.

Segmentation Superpixels

Pixel-Adaptive Convolutional Neural Networks

2 code implementations CVPR 2019 Hang Su, Varun Jampani, Deqing Sun, Orazio Gallo, Erik Learned-Miller, Jan Kautz

We also demonstrate that pixel-adaptive convolution (PAC) can be used as a drop-in replacement for convolution layers in pre-trained networks, resulting in consistent performance improvements.

SCOPS: Self-Supervised Co-Part Segmentation

1 code implementation CVPR 2019 Wei-Chih Hung, Varun Jampani, Sifei Liu, Pavlo Molchanov, Ming-Hsuan Yang, Jan Kautz

Parts provide a good intermediate representation of objects that is robust with respect to camera, pose, and appearance variations.

Object Segmentation +4

Gated-SCNN: Gated Shape CNNs for Semantic Segmentation

4 code implementations ICCV 2019 Towaki Takikawa, David Acuna, Varun Jampani, Sanja Fidler

Here, we propose a new two-stream CNN architecture for semantic segmentation that explicitly wires shape information as a separate processing branch, i.e., a shape stream, that processes information in parallel to the classical stream.

Image Segmentation Semantic Segmentation

Learning Propagation for Arbitrarily-structured Data

no code implementations ICCV 2019 Sifei Liu, Xueting Li, Varun Jampani, Shalini De Mello, Jan Kautz

We experiment with semantic segmentation networks, where we use our propagation module to jointly train on different data -- images, superpixels and point clouds.

Point Cloud Segmentation Segmentation +2

SENSE: a Shared Encoder Network for Scene-flow Estimation

1 code implementation ICCV 2019 Huaizu Jiang, Deqing Sun, Varun Jampani, Zhaoyang Lv, Erik Learned-Miller, Jan Kautz

We introduce a compact network for holistic scene flow estimation, called SENSE, which shares common encoder features among four closely-related tasks: optical flow estimation, disparity estimation from stereo, occlusion estimation, and semantic segmentation.

Disparity Estimation Occlusion Estimation +3

Self-supervised Single-view 3D Reconstruction via Semantic Consistency

1 code implementation ECCV 2020 Xueting Li, Sifei Liu, Kihwan Kim, Shalini De Mello, Varun Jampani, Ming-Hsuan Yang, Jan Kautz

To the best of our knowledge, we are the first to try and solve the single-view reconstruction problem without a category-specific template mesh or semantic keypoints.

3D Reconstruction Object +1

Two-shot Spatially-varying BRDF and Shape Estimation

1 code implementation CVPR 2020 Mark Boss, Varun Jampani, Kihwan Kim, Hendrik P. A. Lensch, Jan Kautz

Extensive experiments on both synthetic and real-world datasets show that our network trained on a synthetic dataset can generalize well to real-world images.

Vocal Bursts Valence Prediction

From Image Collections to Point Clouds with Self-supervised Shape and Pose Networks

1 code implementation CVPR 2020 K L Navaneet, Ansu Mathew, Shashank Kashyap, Wei-Chih Hung, Varun Jampani, R. Venkatesh Babu

We learn both 3D point cloud reconstruction and pose estimation networks in a self-supervised manner, making use of a differentiable point cloud renderer to train with 2D supervision.

3D Object Reconstruction From A Single Image 3D Point Cloud Reconstruction +2

Intrinsic Autoencoders for Joint Neural Rendering and Intrinsic Image Decomposition

no code implementations 29 Jun 2020 Hassan Abu Alhaija, Siva Karthik Mustikovela, Justus Thies, Varun Jampani, Matthias Nießner, Andreas Geiger, Carsten Rother

Neural rendering techniques promise efficient photo-realistic image synthesis while at the same time providing rich control over scene parameters by learning the physical image formation process.

Image-to-Image Translation Intrinsic Image Decomposition +1

Appearance Consensus Driven Self-Supervised Human Mesh Recovery

no code implementations ECCV 2020 Jogendra Nath Kundu, Mugalodi Rakesh, Varun Jampani, Rahul Mysore Venkatesh, R. Venkatesh Babu

We present a self-supervised human mesh recovery framework to infer human pose and shape from monocular images in the absence of any paired supervision.

3D Pose Estimation Human Mesh Recovery

Generative View Synthesis: From Single-view Semantics to Novel-view Images

1 code implementation NeurIPS 2020 Tewodros Habtegebrial, Varun Jampani, Orazio Gallo, Didier Stricker

We propose to push the envelope further, and introduce Generative View Synthesis (GVS), which can synthesize multiple photorealistic views of a scene given a single semantic map.

Image Generation Translation

Improving Deep Stereo Network Generalization with Geometric Priors

no code implementations 25 Aug 2020 Jialiang Wang, Varun Jampani, Deqing Sun, Charles Loop, Stan Birchfield, Jan Kautz

End-to-end deep learning methods have advanced stereo vision in recent years and obtained excellent results when the training and test data are similar.

NeRD: Neural Reflectance Decomposition from Image Collections

1 code implementation ICCV 2021 Mark Boss, Raphael Braun, Varun Jampani, Jonathan T. Barron, Ce Liu, Hendrik P. A. Lensch

This problem is inherently more challenging when the illumination is not a single light source under laboratory conditions but is instead an unconstrained environmental illumination.

Depth Prediction Image Relighting +3

Infinite Nature: Perpetual View Generation of Natural Scenes from a Single Image

1 code implementation ICCV 2021 Andrew Liu, Richard Tucker, Varun Jampani, Ameesh Makadia, Noah Snavely, Angjoo Kanazawa

We introduce the problem of perpetual view generation - long-range generation of novel views corresponding to an arbitrarily long camera trajectory given a single image.

Image Generation Perpetual View Generation +1

Leveraging affinity cycle consistency to isolate factors of variation in learned representations

no code implementations 1 Jan 2021 Kieran A Murphy, Varun Jampani, Srikumar Ramalingam, Ameesh Makadia

In this work, we operate in the setting where limited information is known about the data in the form of groupings, or set membership, and the task is to learn representations which isolate the factors of variation that are common across the groupings.

Pose Transfer Representation Learning

Learning ABCs: Approximate Bijective Correspondence for isolating factors of variation with weak supervision

1 code implementation CVPR 2022 Kieran A. Murphy, Varun Jampani, Srikumar Ramalingam, Ameesh Makadia

We propose a novel algorithm that utilizes a weak form of supervision where the data is partitioned into sets according to certain inactive (common) factors of variation which are invariant across elements of each set.

Data Augmentation Pose Transfer

Adaptive Prototype Learning and Allocation for Few-Shot Segmentation

2 code implementations CVPR 2021 Gen Li, Varun Jampani, Laura Sevilla-Lara, Deqing Sun, Jonghyun Kim, Joongkyu Kim

By integrating superpixel-guided clustering (SGC) and guided prototype allocation (GPA), we propose the Adaptive Superpixel-guided Network (ASGNet), which is a lightweight model and adapts to object scale and shape variation.

Clustering Few-Shot Semantic Segmentation +1

Decoupled Dynamic Filter Networks

1 code implementation CVPR 2021 Jingkai Zhou, Varun Jampani, Zhixiong Pi, Qiong Liu, Ming-Hsuan Yang

Inspired by recent advances in attention, DDF decouples a depth-wise dynamic filter into spatial and channel dynamic filters.

Image Classification Semantic Segmentation

AutoFlow: Learning a Better Training Set for Optical Flow

1 code implementation CVPR 2021 Deqing Sun, Daniel Vlasic, Charles Herrmann, Varun Jampani, Michael Krainin, Huiwen Chang, Ramin Zabih, William T. Freeman, Ce Liu

Synthetic datasets play a critical role in pre-training CNN models for optical flow, but they are painstaking to generate and hard to adapt to new applications.

Optical Flow Estimation

Implicit-PDF: Non-Parametric Representation of Probability Distributions on the Rotation Manifold

2 code implementations 10 Jun 2021 Kieran Murphy, Carlos Esteves, Varun Jampani, Srikumar Ramalingam, Ameesh Makadia

Single image pose estimation is a fundamental problem in many vision and robotics tasks, and existing deep learning approaches suffer by not completely modeling and handling: i) uncertainty about the predictions, and ii) symmetric objects with multiple (sometimes infinite) correct poses.

3D Pose Estimation 3D Rotation Estimation

Discovering 3D Parts from Image Collections

no code implementations ICCV 2021 Chun-Han Yao, Wei-Chih Hung, Varun Jampani, Ming-Hsuan Yang

Reasoning 3D shapes from 2D images is an essential yet challenging task, especially when only single-view images are at our disposal.

Object

SLIDE: Single Image 3D Photography with Soft Layering and Depth-aware Inpainting

no code implementations ICCV 2021 Varun Jampani, Huiwen Chang, Kyle Sargent, Abhishek Kar, Richard Tucker, Michael Krainin, Dominik Kaeser, William T. Freeman, David Salesin, Brian Curless, Ce Liu

We present SLIDE, a modular and unified system for single image 3D photography that uses a simple yet effective soft layering strategy to better preserve appearance details in novel views.

Image Matting

Approximate Bijective Correspondence for isolating factors of variation

1 code implementation 29 Sep 2021 Kieran A Murphy, Varun Jampani, Srikumar Ramalingam, Ameesh Makadia

We propose a novel algorithm that relies on a weak form of supervision where the data is partitioned into sets according to certain inactive factors of variation.

Contrastive Learning Data Augmentation +1

ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction

1 code implementation NeurIPS 2021 Gengshan Yang, Deqing Sun, Varun Jampani, Daniel Vlasic, Forrester Cole, Ce Liu, Deva Ramanan

The surface embeddings are implemented as coordinate-based MLPs that are fit to each video via consistency and contrastive reconstruction losses. Experimental results show that ViSER compares favorably against prior work on challenging videos of humans with loose clothing and unusual poses as well as animal videos from DAVIS and YTVOS.

3D Shape Reconstruction from Videos

Robust Visual Reasoning via Language Guided Neural Module Networks

no code implementations NeurIPS 2021 Arjun Akula, Varun Jampani, Soravit Changpinyo, Song-Chun Zhu

Neural module networks (NMN) are a popular approach for solving multi-modal tasks such as visual question answering (VQA) and visual referring expression recognition (REF).

Question Answering Referring Expression +2

SOMSI: Spherical Novel View Synthesis With Soft Occlusion Multi-Sphere Images

no code implementations CVPR 2022 Tewodros Habtegebrial, Christiano Gava, Marcel Rogge, Didier Stricker, Varun Jampani

We propose a novel MSI representation called Soft Occlusion MSI (SOMSI) that enables modelling high-dimensional appearance features in MSI while retaining the fast rendering times of a standard MSI.

Novel View Synthesis

Amplitude Spectrum Transformation for Open Compound Domain Adaptive Semantic Segmentation

no code implementations 9 Feb 2022 Jogendra Nath Kundu, Akshay Kulkarni, Suvaansh Bhambri, Varun Jampani, R. Venkatesh Babu

However, we find that latent features derived from the Fourier-based amplitude spectrum of deep CNN features hold a more tractable mapping with domain discrimination.

Disentanglement Domain Adaptation +1

Aligning Silhouette Topology for Self-Adaptive 3D Human Pose Recovery

no code implementations NeurIPS 2021 Mugalodi Rakesh, Jogendra Nath Kundu, Varun Jampani, R. Venkatesh Babu

Articulation-centric 2D/3D pose supervision forms the core training objective in most existing 3D human pose estimation techniques.

3D Human Pose Estimation

LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity

no code implementations 6 Apr 2022 Tejan Karmali, Abhinav Atrishi, Sai Sree Harsha, Susmit Agrawal, Varun Jampani, R. Venkatesh Babu

Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image, which are further used to learn landmarks in a semi-supervised manner.

Self-Supervised Learning

Planes vs. Chairs: Category-guided 3D shape learning without any 3D cues

no code implementations 21 Apr 2022 Zixuan Huang, Stefan Stojanov, Anh Thai, Varun Jampani, James M. Rehg

We present a novel 3D shape reconstruction method which learns to predict an implicit 3D shape representation from a single RGB image.

3D Shape Reconstruction 3D Shape Representation +1

Balancing Discriminability and Transferability for Source-Free Domain Adaptation

1 code implementation 16 Jun 2022 Jogendra Nath Kundu, Akshay Kulkarni, Suvaansh Bhambri, Deepesh Mehta, Shreyas Kulkarni, Varun Jampani, R. Venkatesh Babu

Conventional domain adaptation (DA) techniques aim to improve domain transferability by learning domain-invariant representations while concurrently preserving the task-discriminability knowledge gathered from the labeled source data.

Semantic Segmentation Source-Free Domain Adaptation

LASSIE: Learning Articulated Shapes from Sparse Image Ensemble via 3D Part Discovery

no code implementations 7 Jul 2022 Chun-Han Yao, Wei-Chih Hung, Yuanzhen Li, Michael Rubinstein, Ming-Hsuan Yang, Varun Jampani

In this work, we propose a practical problem setting to estimate 3D pose and shape of animals given only a few (10-30) in-the-wild images of a particular animal species (say, horse).

Hierarchical Semantic Regularization of Latent Spaces in StyleGANs

no code implementations 7 Aug 2022 Tejan Karmali, Rishubh Parihar, Susmit Agrawal, Harsh Rangwani, Varun Jampani, Maneesh Singh, R. Venkatesh Babu

The quality of the generated images is predicated on two assumptions: (a) the richness of the hierarchical representations learnt by the generator, and (b) the linearity and smoothness of the style spaces.

Attribute

Improving GANs for Long-Tailed Data through Group Spectral Regularization

1 code implementation 21 Aug 2022 Harsh Rangwani, Naman Jaswani, Tejan Karmali, Varun Jampani, R. Venkatesh Babu

Deep long-tailed learning aims to train useful deep networks on practical, real-world imbalanced distributions, wherein most labels of the tail classes are associated with a few samples.

Conditional Image Generation

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

10 code implementations CVPR 2023 Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, Kfir Aberman

Once the subject is embedded in the output domain of the model, the unique identifier can be used to synthesize novel photorealistic images of the subject contextualized in different scenes.

Diffusion Personalization Image Generation

CPL: Counterfactual Prompt Learning for Vision and Language Models

no code implementations 19 Oct 2022 Xuehai He, Diji Yang, Weixi Feng, Tsu-Jui Fu, Arjun Akula, Varun Jampani, Pradyumna Narayana, Sugato Basu, William Yang Wang, Xin Eric Wang

Prompt tuning is a new few-shot transfer learning technique that only tunes the learnable prompt for pre-trained vision and language models such as CLIP.

counterfactual Visual Question Answering

Subsidiary Prototype Alignment for Universal Domain Adaptation

no code implementations 28 Oct 2022 Jogendra Nath Kundu, Suvaansh Bhambri, Akshay Kulkarni, Hiran Sarkar, Varun Jampani, R. Venkatesh Babu

Universal Domain Adaptation (UniDA) deals with the problem of knowledge transfer between two datasets with domain-shift as well as category-shift.

Object Recognition Single Particle Analysis +2

Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis

1 code implementation 9 Dec 2022 Weixi Feng, Xuehai He, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, Xin Eric Wang, William Yang Wang

In this work, we improve the compositional skills of T2I models, specifically more accurate attribute binding and better image compositions.

Attribute Image Generation

Hi-LASSIE: High-Fidelity Articulated Shape and Skeleton Discovery from Sparse Image Ensemble

1 code implementation CVPR 2023 Chun-Han Yao, Wei-Chih Hung, Yuanzhen Li, Michael Rubinstein, Ming-Hsuan Yang, Varun Jampani

Automatically estimating 3D skeleton, shape, camera viewpoints, and part articulation from sparse in-the-wild image ensembles is a severely under-constrained and challenging problem.

Debiasing Vision-Language Models via Biased Prompts

1 code implementation 31 Jan 2023 Ching-Yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka

Machine learning models have been shown to inherit biases from their training datasets.

Polynomial Neural Fields for Subband Decomposition and Manipulation

1 code implementation 9 Feb 2023 Guandao Yang, Sagie Benaim, Varun Jampani, Kyle Genova, Jonathan T. Barron, Thomas Funkhouser, Bharath Hariharan, Serge Belongie

We use this framework to design Fourier PNFs, which match state-of-the-art performance in signal representation tasks that use neural fields.

LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding

no code implementations CVPR 2023 Gen Li, Varun Jampani, Deqing Sun, Laura Sevilla-Lara

A key step to acquire this skill is to identify what part of the object affords each action, which is called affordance grounding.

Object

ASIC: Aligning Sparse in-the-wild Image Collections

no code implementations ICCV 2023 Kamal Gupta, Varun Jampani, Carlos Esteves, Abhinav Shrivastava, Ameesh Makadia, Noah Snavely, Abhishek Kar

We present a self-supervised technique that directly optimizes on a sparse collection of images of a particular object/object category to obtain consistent dense correspondences across the collection.

Object

ContactArt: Learning 3D Interaction Priors for Category-level Articulated Object and Hand Poses Estimation

no code implementations 2 May 2023 Zehao Zhu, Jiashun Wang, Yuzhe Qin, Deqing Sun, Varun Jampani, Xiaolong Wang

We propose a new dataset and a novel approach to learning hand-object interaction priors for hand and articulated object pose estimation.

Hand Pose Estimation Object

LayoutGPT: Compositional Visual Planning and Generation with Large Language Models

1 code implementation NeurIPS 2023 Weixi Feng, Wanrong Zhu, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Xuehai He, Sugato Basu, Xin Eric Wang, William Yang Wang

When combined with a downstream image generation model, LayoutGPT outperforms text-to-image models/systems by 20-40% and achieves performance comparable to that of human users in designing visual layouts for numerical and spatial correctness.

Indoor Scene Synthesis Text-to-Image Generation

Background Prompting for Improved Object Depth

no code implementations 8 Jun 2023 Manel Baradad, Yuanzhen Li, Forrester Cole, Michael Rubinstein, Antonio Torralba, William T. Freeman, Varun Jampani

To infer object depth on a real image, we place the segmented object into the learned background prompt and run off-the-shelf depth networks.

Object

LU-NeRF: Scene and Pose Estimation by Synchronizing Local Unposed NeRFs

no code implementations ICCV 2023 Zezhou Cheng, Carlos Esteves, Varun Jampani, Abhishek Kar, Subhransu Maji, Ameesh Makadia

Consequently, there is growing interest in extending NeRF models to jointly optimize camera poses and scene representation, which offers an alternative to off-the-shelf SfM pipelines which have well-understood failure modes.

Pose Estimation

HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models

2 code implementations 13 Jul 2023 Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Wei Wei, Tingbo Hou, Yael Pritch, Neal Wadhwa, Michael Rubinstein, Kfir Aberman

By composing these weights into the diffusion model, coupled with fast finetuning, HyperDreamBooth can generate a person's face in various contexts and styles, with high subject details while also preserving the model's crucial knowledge of diverse styles and semantic modifications.

Diffusion Personalization Tuning Free

OmniControl: Control Any Joint at Any Time for Human Motion Generation

1 code implementation 12 Oct 2023 Yiming Xie, Varun Jampani, Lei Zhong, Deqing Sun, Huaizu Jiang

We present a novel approach named OmniControl for incorporating flexible spatial control signals into a text-conditioned human motion generation model based on the diffusion process.

ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs

1 code implementation 22 Nov 2023 Viraj Shah, Nataniel Ruiz, Forrester Cole, Erika Lu, Svetlana Lazebnik, Yuanzhen Li, Varun Jampani

Experiments on a wide range of subject and style combinations show that ZipLoRA can generate compelling results with meaningful improvements over baselines in subject and style fidelity while preserving the ability to recontextualize.

Exploring Attribute Variations in Style-based GANs using Diffusion Models

no code implementations 27 Nov 2023 Rishubh Parihar, Prasanna Balaji, Raghav Magazine, Sarthak Vora, Tejan Karmali, Varun Jampani, R. Venkatesh Babu

We capitalize on disentangled latent spaces of pretrained GANs and train a Denoising Diffusion Probabilistic Model (DDPM) to learn the latent distribution for diverse edits.

Attribute Denoising

Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence

1 code implementation 28 Nov 2023 Junyi Zhang, Charles Herrmann, Junhwa Hur, Eric Chen, Varun Jampani, Deqing Sun, Ming-Hsuan Yang

This paper identifies the importance of being geometry-aware for semantic correspondence and reveals a limitation of the features of current foundation models under simple post-processing.

Animal Pose Estimation Semantic correspondence

One-Shot Open Affordance Learning with Foundation Models

no code implementations 29 Nov 2023 Gen Li, Deqing Sun, Laura Sevilla-Lara, Varun Jampani

We introduce One-shot Open Affordance Learning (OOAL), where a model is trained with just one example per base object category, but is expected to identify novel objects and affordances.

UniGS: Unified Representation for Image Generation and Segmentation

1 code implementation 4 Dec 2023 Lu Qi, Lehan Yang, Weidong Guo, Yu Xu, Bo Du, Varun Jampani, Ming-Hsuan Yang

On the other hand, the progressive dichotomy module can efficiently decode the synthesized colormap to high-quality entity-level masks in a depth-first binary search without knowing the cluster numbers.

Image Generation Segmentation

Alchemist: Parametric Control of Material Properties with Diffusion Models

no code implementations 5 Dec 2023 Prafull Sharma, Varun Jampani, Yuanzhen Li, Xuhui Jia, Dmitry Lagun, Fredo Durand, William T. Freeman, Mark Matthews

We propose a method to control material attributes of objects like roughness, metallic, albedo, and transparency in real images.

NeRFiller: Completing Scenes via Generative 3D Inpainting

no code implementations 7 Dec 2023 Ethan Weber, Aleksander Hołyński, Varun Jampani, Saurabh Saxena, Noah Snavely, Abhishek Kar, Angjoo Kanazawa

In contrast to related works, we focus on completing scenes rather than deleting foreground objects, and our approach does not require tight 2D object masks or text.

3D Inpainting

HOI-Diff: Text-Driven Synthesis of 3D Human-Object Interactions using Diffusion Models

no code implementations 11 Dec 2023 Xiaogang Peng, Yiming Xie, Zizhao Wu, Varun Jampani, Deqing Sun, Huaizu Jiang

We also develop an affordance prediction diffusion model (APDM) to predict the contacting area between the human and object during the interactions driven by the textual prompt.

Human-Object Interaction Detection Object

DiffusionLight: Light Probes for Free by Painting a Chrome Ball

1 code implementation 14 Dec 2023 Pakkapon Phongthawee, Worameth Chinchuthakun, Nontaphat Sinsunthithet, Amit Raj, Varun Jampani, Pramook Khungurn, Supasorn Suwajanakorn

To address this problem, we leverage diffusion models trained on billions of standard images to render a chrome ball into the input image.

Lighting Estimation

ZeroShape: Regression-based Zero-shot Shape Reconstruction

no code implementations 21 Dec 2023 Zixuan Huang, Stefan Stojanov, Anh Thai, Varun Jampani, James M. Rehg

In contrast, the traditional approach to this problem is regression-based, where deterministic models are trained to directly regress the object shape.

3D Shape Reconstruction Computational Efficiency +1

Dress-Me-Up: A Dataset & Method for Self-Supervised 3D Garment Retargeting

no code implementations 6 Jan 2024 Shanthika Naik, Kunwar Singh, Astitva Srivastava, Dhawal Sirikonda, Amit Raj, Varun Jampani, Avinash Sharma

We propose a novel self-supervised framework for retargeting non-parameterized 3D garments onto 3D human avatars of arbitrary shapes and poses, enabling 3D virtual try-on (VTON).

Virtual Try-on

SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild

no code implementations 18 Jan 2024 Andreas Engelhardt, Amit Raj, Mark Boss, Yunzhi Zhang, Abhishek Kar, Yuanzhen Li, Deqing Sun, Ricardo Martin Brualla, Jonathan T. Barron, Hendrik P. A. Lensch, Varun Jampani

We present SHINOBI, an end-to-end framework for the reconstruction of shape, material, and illumination from object images captured with varying lighting, pose, and background.

Inverse Rendering Object

TripoSR: Fast 3D Object Reconstruction from a Single Image

1 code implementation 4 Mar 2024 Dmitry Tochilkin, David Pankratz, Zexiang Liu, Zixuan Huang, Adam Letts, Yangguang Li, Ding Liang, Christian Laforte, Varun Jampani, Yan-Pei Cao

This technical report introduces TripoSR, a 3D reconstruction model leveraging a transformer architecture for fast feed-forward 3D generation, producing a 3D mesh from a single image in under 0.5 seconds.

3D Generation 3D Object Reconstruction From A Single Image +2

SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion

no code implementations 18 Mar 2024 Vikram Voleti, Chun-Han Yao, Mark Boss, Adam Letts, David Pankratz, Dmitry Tochilkin, Christian Laforte, Robin Rombach, Varun Jampani

In this work, we propose SV3D, which adapts an image-to-video diffusion model for novel multi-view synthesis (NVS) and 3D generation, thereby leveraging the generalization and multi-view consistency of video models while adding explicit camera control for NVS.

3D Generation 3D Reconstruction +2

WordRobe: Text-Guided Generation of Textured 3D Garments

no code implementations 26 Mar 2024 Astitva Srivastava, Pranav Manu, Amit Raj, Varun Jampani, Avinash Sharma

We achieve this by first learning a latent representation of 3D garments using a novel coarse-to-fine training strategy and a loss for latent disentanglement, promoting better latent interpolation.

Disentanglement text-guided-generation +1

3D Congealing: 3D-Aware Image Alignment in the Wild

no code implementations 2 Apr 2024 Yunzhi Zhang, Zizhang Li, Amit Raj, Andreas Engelhardt, Yuanzhen Li, Tingbo Hou, Jiajun Wu, Varun Jampani

The framework optimizes for the canonical representation together with the pose for each input image, and a per-image coordinate map that warps 2D pixel coordinates to the 3D canonical frame to account for the shape matching.

Pose Estimation

MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation

no code implementations 4 Apr 2024 Hanzhe Hu, Zhizhuo Zhou, Varun Jampani, Shubham Tulsiani

We present MVD-Fusion: a method for single-view 3D inference via generative modeling of multi-view-consistent RGB-D images.

Denoising Depth Estimation +1

ZeST: Zero-Shot Material Transfer from a Single Image

no code implementations 9 Apr 2024 Ta-Ying Cheng, Prafull Sharma, Andrew Markham, Niki Trigoni, Varun Jampani

We propose ZeST, a method for zero-shot material transfer to an object in the input image given a material exemplar image.

Object

Probing the 3D Awareness of Visual Foundation Models

1 code implementation 12 Apr 2024 Mohamed El Banani, Amit Raj, Kevis-Kokitsi Maninis, Abhishek Kar, Yuanzhen Li, Michael Rubinstein, Deqing Sun, Leonidas Guibas, Justin Johnson, Varun Jampani

Given that such models can classify, delineate, and localize objects in 2D, we ask whether they also represent their 3D structure.

Shaping Realities: Enhancing 3D Generative AI with Fabrication Constraints

no code implementations 15 Apr 2024 Faraz Faruqi, Yingtao Tian, Vrushank Phadnis, Varun Jampani, Stefanie Mueller

This workshop paper highlights the limitations of generative AI tools in translating digital creations into the physical world and proposes new augmentations to generative AI tools for creating physically viable 3D models.
