Search Results for author: Vibhav Vineet

Found 49 papers, 23 papers with code

Navigating Hallucinations for Reasoning of Unintentional Activities

no code implementations 29 Feb 2024 Shresth Grover, Vibhav Vineet, Yogesh S Rawat

In this work we present a novel task of understanding unintentional human activities in videos.

Hallucination Navigate

DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models

no code implementations 21 Dec 2023 Brian Nlong Zhao, Yuhang Xiao, Jiashu Xu, Xinyang Jiang, Yifan Yang, Dongsheng Li, Laurent Itti, Vibhav Vineet, Yunhao Ge

We introduce a solution that allows a pretrained T2I diffusion model to learn a set of soft prompts, enabling the generation of novel images by sampling prompts from the learned distribution.

Text to 3D

PEEKABOO: Interactive Video Generation via Masked-Diffusion

1 code implementation 12 Dec 2023 Yash Jain, Anshul Nasery, Vibhav Vineet, Harkirat Behl

Recently there has been a lot of progress in text-to-video generation, with state-of-the-art models being capable of generating high quality, realistic videos.

Text-to-Video Generation Video Generation

DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets

1 code implementation NeurIPS 2023 Yash Jain, Harkirat Behl, Zsolt Kira, Vibhav Vineet

Construction of a universal detector poses a crucial question: How can we most effectively train a model on a large mixture of datasets?

Object Detection

Efficiently Robustify Pre-trained Models

no code implementations ICCV 2023 Nishant Jain, Harkirat Behl, Yogesh Singh Rawat, Vibhav Vineet

A recent trend in deep learning algorithms has been towards training large scale models, having high parameter count and trained on big dataset.

Transfer Learning

Robustness Analysis on Foundational Segmentation Models

no code implementations 15 Jun 2023 Madeline Chantry Schiappa, Sachidanand VS, Yunhao Ge, Ondrej Miksik, Yogesh S. Rawat, Vibhav Vineet

Due to the increase in computational resources and accessibility of data, an increase in large, deep learning models trained on copious amounts of data using self-supervised or semi-supervised learning have emerged.

Object Detection +1

Controllable Text-to-Image Generation with GPT-4

no code implementations 29 May 2023 Tianjun Zhang, Yi Zhang, Vibhav Vineet, Neel Joshi, Xin Wang

Control-GPT works by querying GPT-4 to write TikZ code, and the generated sketches are used as references alongside the text instructions for diffusion models (e.g., ControlNet) to generate photo-realistic images.

Instruction Following Text-to-Image Generation

PLEX: Making the Most of the Available Data for Robotic Manipulation Pretraining

no code implementations 15 Mar 2023 Garrett Thomas, Ching-An Cheng, Ricky Loynd, Felipe Vieira Frujeri, Vibhav Vineet, Mihai Jalobeanu, Andrey Kolobov

A rich representation is key to general robotic manipulation, but existing approaches to representation learning require large amounts of multimodal demonstrations.

Representation Learning

Exploring the Sim2Real Gap Using Digital Twins

no code implementations ICCV 2023 Sruthi Sudhakar, Jon Hanzelka, Josh Bobillot, Tanmay Randhavane, Neel Joshi, Vibhav Vineet

An emerging alternative is to use synthetic data, but if the synthetic data is not similar enough to the real data, the performance is typically below that of training with real data.

Instance Segmentation Object Detection +2

Benchmarking Spatial Relationships in Text-to-Image Generation

1 code implementation 20 Dec 2022 Tejas Gokhale, Hamid Palangi, Besmira Nushi, Vibhav Vineet, Eric Horvitz, Ece Kamar, Chitta Baral, Yezhou Yang

We investigate the ability of T2I models to generate correct spatial relationships among objects and present VISOR, an evaluation metric that captures how accurately the spatial relationship described in text is generated in the image.

Benchmarking Text-to-Image Generation

EM-Paste: EM-guided Cut-Paste with DALL-E Augmentation for Image-level Weakly Supervised Instance Segmentation

1 code implementation 15 Dec 2022 Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Laurent Itti, Vibhav Vineet

Finally, the third component creates a large-scale pseudo-labeled instance segmentation training dataset by compositing the foreground object masks onto the original and generated background images.

Instance Segmentation Object +4

Instance-Aware Image Completion

no code implementations 22 Oct 2022 Jinoh Cho, Minguk Kang, Vibhav Vineet, Jaesik Park

However, existing image completion methods tend to fill in the missing region with the surrounding texture instead of hallucinating a visual instance that is suitable in accordance with the context of the scene.

Image Generation Object Detection +2

Learning to Simulate Realistic LiDARs

no code implementations 22 Sep 2022 Benoit Guillard, Sai Vemprala, Jayesh K. Gupta, Ondrej Miksik, Vibhav Vineet, Pascal Fua, Ashish Kapoor

Simulating realistic sensors is a challenging part in data generation for autonomous systems, often involving carefully handcrafted sensor design, scene properties, and physics modeling.

Neural-Sim: Learning to Generate Training Data with NeRF

1 code implementation 22 Jul 2022 Yunhao Ge, Harkirat Behl, Jiashu Xu, Suriya Gunasekar, Neel Joshi, Yale Song, Xin Wang, Laurent Itti, Vibhav Vineet

However, existing approaches either require human experts to manually tune each scene property or use automatic methods that provide little to no control; this requires rendering large amounts of random data variations, which is slow and is often suboptimal for the target domain.

Object Detection

Scaling Novel Object Detection with Weakly Supervised Detection Transformers

1 code implementation 11 Jul 2022 Tyler LaBonte, Yale Song, Xin Wang, Vibhav Vineet, Neel Joshi

A critical object detection task is finetuning an existing model to detect novel objects, but the standard workflow requires bounding box annotations which are time-consuming and expensive to collect.

Multiple Instance Learning Novel Object Detection +4

Robustness Analysis of Video-Language Models Against Visual and Language Perturbations

1 code implementation 5 Jul 2022 Madeline C. Schiappa, Shruti Vyas, Hamid Palangi, Yogesh S. Rawat, Vibhav Vineet

Joint visual and language modeling on large-scale datasets has recently shown good progress in multi-modal tasks when compared to single modal learning.

Language Modelling Retrieval +2

DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection

no code implementations 20 Jun 2022 Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Neel Joshi, Laurent Itti, Vibhav Vineet

For foreground object mask generation, we use a simple textual template with object class name as input to DALL-E to generate a diverse set of foreground images.

Image Captioning Image Generation +4

Missingness Bias in Model Debugging

1 code implementation ICLR 2022 Saachi Jain, Hadi Salman, Eric Wong, Pengchuan Zhang, Vibhav Vineet, Sai Vemprala, Aleksander Madry

Missingness, or the absence of features from an input, is a concept fundamental to many model debugging tools.

Image Retrieval from Contextual Descriptions

1 code implementation ACL 2022 Benno Krojer, Vaibhav Adlakha, Vibhav Vineet, Yash Goyal, Edoardo Ponti, Siva Reddy

In particular, models are tasked with retrieving the correct image from a set of 10 minimally contrastive candidates based on a contextual description.

Image Retrieval Retrieval

Inferring Articulated Rigid Body Dynamics from RGBD Video

1 code implementation 20 Mar 2022 Eric Heiden, Ziang Liu, Vibhav Vineet, Erwin Coumans, Gaurav S. Sukhatme

Being able to reproduce physical phenomena ranging from light interaction to contact mechanics, simulators are becoming increasingly useful in more and more application domains where real-world interaction or labeled data are difficult to obtain.

Contact mechanics Inverse Rendering

Robust Contrastive Learning against Noisy Views

1 code implementation CVPR 2022 Ching-Yao Chuang, R Devon Hjelm, Xin Wang, Vibhav Vineet, Neel Joshi, Antonio Torralba, Stefanie Jegelka, Yale Song

Contrastive learning relies on an assumption that positive pairs contain related views, e.g., patches of an image or co-occurring multimodal signals of a video, that share certain underlying information about an instance.

Binary Classification Contrastive Learning

Learning to Align Sequential Actions in the Wild

no code implementations CVPR 2022 Weizhe Liu, Bugra Tekin, Huseyin Coskun, Vibhav Vineet, Pascal Fua, Marc Pollefeys

To this end, we propose an approach to enforce temporal priors on the optimal transport matrix, which leverages temporal consistency, while allowing for variations in the order of actions.

Representation Learning

3DB: A Framework for Debugging Computer Vision Models

1 code implementation 7 Jun 2021 Guillaume Leclerc, Hadi Salman, Andrew Ilyas, Sai Vemprala, Logan Engstrom, Vibhav Vineet, Kai Xiao, Pengchuan Zhang, Shibani Santurkar, Greg Yang, Ashish Kapoor, Aleksander Madry

We introduce 3DB: an extendable, unified framework for testing and debugging vision models using photorealistic simulation.

RANP: Resource Aware Neuron Pruning at Initialization for 3D CNNs

1 code implementation 9 Feb 2021 Zhiwei Xu, Thalaiyasingam Ajanthan, Vibhav Vineet, Richard Hartley

In this work, we introduce a Resource Aware Neuron Pruning (RANP) algorithm that prunes 3D CNNs at initialization to high sparsity levels.

3D Semantic Segmentation Stereo Matching +1

Prediction of Object Geometry from Acoustic Scattering Using Convolutional Neural Networks

no code implementations 21 Oct 2020 Ziqi Fan, Vibhav Vineet, Chenshen Lu, T. W. Wu, Kyla McMullen

The present work proposes a method to infer object geometry from scattering features by training convolutional neural networks.

RANP: Resource Aware Neuron Pruning at Initialization for 3D CNNs

1 code implementation 6 Oct 2020 Zhiwei Xu, Thalaiyasingam Ajanthan, Vibhav Vineet, Richard Hartley

Specifically, the core idea is to obtain an importance score for each neuron based on their sensitivity to the loss function.

3D Semantic Segmentation Video Classification

Depth Completion Using a View-constrained Deep Prior

no code implementations 21 Jan 2020 Pallabi Ghosh, Vibhav Vineet, Larry S. Davis, Abhinav Shrivastava, Sudipta Sinha, Neel Joshi

Given color images and noisy and incomplete target depth maps, we optimize a randomly-initialized CNN model to reconstruct a depth map restored by virtue of using the CNN network structure as a prior combined with a view-constrained photo-consistency loss.

Depth Completion Image Denoising

Fast acoustic scattering using convolutional neural networks

1 code implementation 30 Oct 2019 Ziqi Fan, Vibhav Vineet, Hannes Gamper, Nikunj Raghuvanshi

Diffracted scattering and occlusion are important acoustic effects in interactive auralization and noise control applications, typically requiring expensive numerical simulation.

Regression

Learning Visuomotor Policies for Aerial Navigation Using Cross-Modal Representations

2 code implementations 16 Sep 2019 Rogerio Bonatti, Ratnesh Madaan, Vibhav Vineet, Sebastian Scherer, Ashish Kapoor

We analyze the rich latent spaces learned with our proposed representations, and show that the use of our cross-modal architecture significantly improves control policy performance as compared to end-to-end learning or purely unsupervised feature extractors.

Drone navigation Imitation Learning

Live Reconstruction of Large-Scale Dynamic Outdoor Worlds

1 code implementation 15 Mar 2019 Ondrej Miksik, Vibhav Vineet

For each time step, our dynamic map maintains a relative pose of each volume with respect to the stationary background.

3D Reconstruction Pose Estimation

Photorealistic Image Synthesis for Object Instance Detection

no code implementations 9 Feb 2019 Tomas Hodan, Vibhav Vineet, Ran Gal, Emanuel Shalev, Jon Hanzelka, Treb Connell, Pedro Urbina, Sudipta N. Sinha, Brian Guenter

We present an approach to synthesize highly photorealistic images of 3D object models, which we use to train a convolutional neural network for detecting the objects in real images.

6D Pose Estimation 6D Pose Estimation using RGB +3

Playing for Data: Ground Truth from Computer Games

2 code implementations 7 Aug 2016 Stephan R. Richter, Vibhav Vineet, Stefan Roth, Vladlen Koltun

Recent progress in computer vision has been driven by high-capacity models trained on large datasets.

Semantic Segmentation

Dense Semantic Image Segmentation with Objects and Attributes

no code implementations CVPR 2014 Shuai Zheng, Ming-Ming Cheng, Jonathan Warrell, Paul Sturgess, Vibhav Vineet, Carsten Rother, Philip H. S. Torr

The concepts of objects and attributes are both important for describing images precisely, since verbal descriptions often contain both adjectives and nouns (e.g. "I see a shiny red chair").

Attribute Image Segmentation +2

A Tiered Move-making Algorithm for General Non-submodular Pairwise Energies

no code implementations 25 Mar 2014 Vibhav Vineet, Jonathan Warrell, Philip H. S. Torr

The algorithm converges to a local minimum for any general pairwise potential, and we give a theoretical analysis of the properties of the algorithm, characterizing the situations in which we can expect good performance.

Image Denoising Image Segmentation +3

Higher Order Priors for Joint Intrinsic Image, Objects, and Attributes Estimation

no code implementations NeurIPS 2013 Vibhav Vineet, Carsten Rother, Philip Torr

Many methods have been proposed to recover the intrinsic scene properties such as shape, reflectance and illumination from a single image.

ImageSpirit: Verbal Guided Image Parsing

no code implementations 16 Oct 2013 Ming-Ming Cheng, Shuai Zheng, Wen-Yan Lin, Jonathan Warrell, Vibhav Vineet, Paul Sturgess, Nigel Crook, Niloy Mitra, Philip Torr

This allows us to formulate the image parsing problem as one of jointly estimating per-pixel object and attribute labels from a set of training images.

Attribute Object
