Search Results for author: Vibhav Vineet

Found 49 papers, 23 papers with code

Navigating Hallucinations for Reasoning of Unintentional Activities

no code implementations • 29 Feb 2024 • Shresth Grover, Vibhav Vineet, Yogesh S Rawat

In this work we present a novel task of understanding unintentional human activities in videos.

DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models

no code implementations • 21 Dec 2023 • Brian Nlong Zhao, Yuhang Xiao, Jiashu Xu, Xinyang Jiang, Yifan Yang, Dongsheng Li, Laurent Itti, Vibhav Vineet, Yunhao Ge

We introduce a solution that allows a pretrained T2I diffusion model to learn a set of soft prompts, enabling the generation of novel images by sampling prompts from the learned distribution.

Text to 3D

PEEKABOO: Interactive Video Generation via Masked-Diffusion

1 code implementation • 12 Dec 2023 • Yash Jain, Anshul Nasery, Vibhav Vineet, Harkirat Behl

Recently there has been a lot of progress in text-to-video generation, with state-of-the-art models being capable of generating high quality, realistic videos.

Text-to-Video Generation Video Generation

DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets

1 code implementation • NeurIPS 2023 • Yash Jain, Harkirat Behl, Zsolt Kira, Vibhav Vineet

Construction of a universal detector poses a crucial question: How can we most effectively train a model on a large mixture of datasets?

Object Detection

Efficiently Robustify Pre-trained Models

no code implementations • ICCV 2023 • Nishant Jain, Harkirat Behl, Yogesh Singh Rawat, Vibhav Vineet

A recent trend in deep learning algorithms has been towards training large-scale models with high parameter counts on big datasets.

Transfer Learning

Beyond Generation: Harnessing Text to Image Models for Object Detection and Segmentation

1 code implementation • 12 Sep 2023 • Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Neel Joshi, Laurent Itti, Vibhav Vineet

A foreground-background segmentation algorithm is then used to generate foreground object masks.

Image Captioning Image Generation +3

Robustness Analysis on Foundational Segmentation Models

no code implementations • 15 Jun 2023 • Madeline Chantry Schiappa, Sachidanand VS, Yunhao Ge, Ondrej Miksik, Yogesh S. Rawat, Vibhav Vineet

Due to the increase in computational resources and accessibility of data, large deep learning models trained on copious amounts of data using self-supervised or semi-supervised learning have emerged.

Object Detection +1

A Large-Scale Analysis on Self-Supervised Video Representation Learning

no code implementations • 9 Jun 2023 • Akash Kumar, Ashlesha Kumar, Vibhav Vineet, Yogesh Singh Rawat

In this work, we first provide a benchmark that enables a comparison of existing approaches on the same ground.

Ranked #3 on Self-Supervised Action Recognition on UCF101

Benchmarking Representation Learning +2

Controllable Text-to-Image Generation with GPT-4

no code implementations • 29 May 2023 • Tianjun Zhang, Yi Zhang, Vibhav Vineet, Neel Joshi, Xin Wang

Control-GPT works by querying GPT-4 to write TikZ code, and the generated sketches are used as references alongside the text instructions for diffusion models (e.g., ControlNet) to generate photo-realistic images.

Instruction Following Text-to-Image Generation

PLEX: Making the Most of the Available Data for Robotic Manipulation Pretraining

no code implementations • 15 Mar 2023 • Garrett Thomas, Ching-An Cheng, Ricky Loynd, Felipe Vieira Frujeri, Vibhav Vineet, Mihai Jalobeanu, Andrey Kolobov

A rich representation is key to general robotic manipulation, but existing approaches to representation learning require large amounts of multimodal demonstrations.

Representation Learning

Exploring the Sim2Real Gap Using Digital Twins

no code implementations • ICCV 2023 • Sruthi Sudhakar, Jon Hanzelka, Josh Bobillot, Tanmay Randhavane, Neel Joshi, Vibhav Vineet

An emerging alternative is to use synthetic data, but if the synthetic data is not similar enough to the real data, the performance is typically below that of training with real data.

Instance Segmentation object-detection +2

A Large-Scale Robustness Analysis of Video Action Recognition Models

no code implementations • CVPR 2023 • Madeline Chantry Schiappa, Naman Biyani, Prudvi Kamtam, Shruti Vyas, Hamid Palangi, Vibhav Vineet, Yogesh S. Rawat

In this work, we perform a large-scale robustness analysis of these existing models for video action recognition.

Action Recognition Temporal Action Localization

Benchmarking Spatial Relationships in Text-to-Image Generation

1 code implementation • 20 Dec 2022 • Tejas Gokhale, Hamid Palangi, Besmira Nushi, Vibhav Vineet, Eric Horvitz, Ece Kamar, Chitta Baral, Yezhou Yang

We investigate the ability of T2I models to generate correct spatial relationships among objects and present VISOR, an evaluation metric that captures how accurately the spatial relationship described in text is generated in the image.

Benchmarking Text-to-Image Generation

EM-Paste: EM-guided Cut-Paste with DALL-E Augmentation for Image-level Weakly Supervised Instance Segmentation

1 code implementation • 15 Dec 2022 • Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Laurent Itti, Vibhav Vineet

Finally, the third component creates a large-scale pseudo-labeled instance segmentation training dataset by compositing the foreground object masks onto the original and generated background images.

Instance Segmentation Object +4

Instance-Aware Image Completion

no code implementations • 22 Oct 2022 • Jinoh Cho, Minguk Kang, Vibhav Vineet, Jaesik Park

However, existing image completion methods tend to fill in the missing region with the surrounding texture instead of hallucinating a visual instance that is suitable in accordance with the context of the scene.

Image Generation object-detection +2

Learning to Simulate Realistic LiDARs

no code implementations • 22 Sep 2022 • Benoit Guillard, Sai Vemprala, Jayesh K. Gupta, Ondrej Miksik, Vibhav Vineet, Pascal Fua, Ashish Kapoor

Simulating realistic sensors is a challenging part in data generation for autonomous systems, often involving carefully handcrafted sensor design, scene properties, and physics modeling.

Neural-Sim: Learning to Generate Training Data with NeRF

1 code implementation • 22 Jul 2022 • Yunhao Ge, Harkirat Behl, Jiashu Xu, Suriya Gunasekar, Neel Joshi, Yale Song, Xin Wang, Laurent Itti, Vibhav Vineet

However, existing approaches either require human experts to manually tune each scene property or use automatic methods that provide little to no control; this requires rendering large amounts of random data variations, which is slow and is often suboptimal for the target domain.

Object Detection

154

TASKOGRAPHY: Evaluating robot task planning over large 3D scene graphs

1 code implementation • 11 Jul 2022 • Christopher Agia, Krishna Murthy Jatavallabhula, Mohamed Khodeir, Ondrej Miksik, Vibhav Vineet, Mustafa Mukadam, Liam Paull, Florian Shkurti

3D scene graphs (3DSGs) are an emerging description, unifying symbolic, topological, and metric scene representations.

Benchmarking Representation Learning +1

Scaling Novel Object Detection with Weakly Supervised Detection Transformers

1 code implementation • 11 Jul 2022 • Tyler LaBonte, Yale Song, Xin Wang, Vibhav Vineet, Neel Joshi

A critical object detection task is finetuning an existing model to detect novel objects, but the standard workflow requires bounding box annotations which are time-consuming and expensive to collect.

Multiple Instance Learning Novel Object Detection +4

Robustness Analysis of Video-Language Models Against Visual and Language Perturbations

1 code implementation • 5 Jul 2022 • Madeline C. Schiappa, Shruti Vyas, Hamid Palangi, Yogesh S. Rawat, Vibhav Vineet

Joint visual and language modeling on large-scale datasets has recently shown good progress in multi-modal tasks when compared to single modal learning.

Language Modelling Retrieval +2

Large-scale Robustness Analysis of Video Action Recognition Models

1 code implementation • 4 Jul 2022 • Madeline Chantry Schiappa, Naman Biyani, Prudvi Kamtam, Shruti Vyas, Hamid Palangi, Vibhav Vineet, Yogesh Rawat

In this work, we perform a large-scale robustness analysis of these existing models for video action recognition.

Action Recognition Temporal Action Localization

DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection

no code implementations • 20 Jun 2022 • Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Neel Joshi, Laurent Itti, Vibhav Vineet

For foreground object mask generation, we use a simple textual template with object class name as input to DALL-E to generate a diverse set of foreground images.

Image Captioning Image Generation +4

Missingness Bias in Model Debugging

1 code implementation • ICLR 2022 • Saachi Jain, Hadi Salman, Eric Wong, Pengchuan Zhang, Vibhav Vineet, Sai Vemprala, Aleksander Madry

Missingness, or the absence of features from an input, is a concept fundamental to many model debugging tools.

Image Retrieval from Contextual Descriptions

1 code implementation • ACL 2022 • Benno Krojer, Vaibhav Adlakha, Vibhav Vineet, Yash Goyal, Edoardo Ponti, Siva Reddy

In particular, models are tasked with retrieving the correct image from a set of 10 minimally contrastive candidates based on a contextual description.

Ranked #1 on Image Retrieval on ImageCoDe

Image Retrieval Retrieval

Inferring Articulated Rigid Body Dynamics from RGBD Video

1 code implementation • 20 Mar 2022 • Eric Heiden, Ziang Liu, Vibhav Vineet, Erwin Coumans, Gaurav S. Sukhatme

Being able to reproduce physical phenomena ranging from light interaction to contact mechanics, simulators are becoming increasingly useful in more and more application domains where real-world interaction or labeled data are difficult to obtain.

Contact mechanics Inverse Rendering

1,147

One Network Doesn't Rule Them All: Moving Beyond Handcrafted Architectures in Self-Supervised Learning

no code implementations • 15 Mar 2022 • Sharath Girish, Debadeepta Dey, Neel Joshi, Vibhav Vineet, Shital Shah, Caio Cesar Teodoro Mendes, Abhinav Shrivastava, Yale Song

We conduct a large-scale study with over 100 variants of ResNet and MobileNet architectures and evaluate them across 11 downstream scenarios in the SSL setting.

Image Classification Self-Supervised Learning

Robust Contrastive Learning against Noisy Views

1 code implementation • CVPR 2022 • Ching-Yao Chuang, R Devon Hjelm, Xin Wang, Vibhav Vineet, Neel Joshi, Antonio Torralba, Stefanie Jegelka, Yale Song

Contrastive learning relies on an assumption that positive pairs contain related views, e.g., patches of an image or co-occurring multimodal signals of a video, that share certain underlying information about an instance.

Binary Classification Contrastive Learning

Learning to Align Sequential Actions in the Wild

no code implementations • CVPR 2022 • Weizhe Liu, Bugra Tekin, Huseyin Coskun, Vibhav Vineet, Pascal Fua, Marc Pollefeys

To this end, we propose an approach to enforce temporal priors on the optimal transport matrix, which leverages temporal consistency, while allowing for variations in the order of actions.

Representation Learning

CausalCity: Complex Simulations with Agency for Causal Discovery and Reasoning

no code implementations • 25 Jun 2021 • Daniel McDuff, Yale Song, Jiyoung Lee, Vibhav Vineet, Sai Vemprala, Nicholas Gyde, Hadi Salman, Shuang Ma, Kwanghoon Sohn, Ashish Kapoor

The ability to perform causal and counterfactual reasoning is a central property of human intelligence.

Causal Discovery counterfactual +2

3DB: A Framework for Debugging Computer Vision Models

1 code implementation • 7 Jun 2021 • Guillaume Leclerc, Hadi Salman, Andrew Ilyas, Sai Vemprala, Logan Engstrom, Vibhav Vineet, Kai Xiao, Pengchuan Zhang, Shibani Santurkar, Greg Yang, Ashish Kapoor, Aleksander Madry

We introduce 3DB: an extendable, unified framework for testing and debugging vision models using photorealistic simulation.

123

RANP: Resource Aware Neuron Pruning at Initialization for 3D CNNs

1 code implementation • 9 Feb 2021 • Zhiwei Xu, Thalaiyasingam Ajanthan, Vibhav Vineet, Richard Hartley

In this work, we introduce a Resource Aware Neuron Pruning (RANP) algorithm that prunes 3D CNNs at initialization to high sparsity levels.

3D Semantic Segmentation Stereo Matching +1

Prediction of Object Geometry from Acoustic Scattering Using Convolutional Neural Networks

no code implementations • 21 Oct 2020 • Ziqi Fan, Vibhav Vineet, Chenshen Lu, T. W. Wu, Kyla McMullen

The present work proposes a method to infer object geometry from scattering features by training convolutional neural networks.

RANP: Resource Aware Neuron Pruning at Initialization for 3D CNNs

1 code implementation • 6 Oct 2020 • Zhiwei Xu, Thalaiyasingam Ajanthan, Vibhav Vineet, Richard Hartley

Specifically, the core idea is to obtain an importance score for each neuron based on its sensitivity to the loss function.

3D Semantic Segmentation Video Classification

AutoSimulate: (Quickly) Learning Synthetic Data Generation

no code implementations • ECCV 2020 • Harkirat Singh Behl, Atılım Güneş Baydin, Ran Gal, Philip H. S. Torr, Vibhav Vineet

Simulation is increasingly being used for generating large labelled datasets in many machine learning problems.

Synthetic Data Generation

Depth Completion Using a View-constrained Deep Prior

no code implementations • 21 Jan 2020 • Pallabi Ghosh, Vibhav Vineet, Larry S. Davis, Abhinav Shrivastava, Sudipta Sinha, Neel Joshi

Given color images and noisy and incomplete target depth maps, we optimize a randomly-initialized CNN model to reconstruct a depth map restored by virtue of using the CNN network structure as a prior combined with a view-constrained photo-consistency loss.

Depth Completion Image Denoising

Fast acoustic scattering using convolutional neural networks

1 code implementation • 30 Oct 2019 • Ziqi Fan, Vibhav Vineet, Hannes Gamper, Nikunj Raghuvanshi

Diffracted scattering and occlusion are important acoustic effects in interactive auralization and noise control applications, typically requiring expensive numerical simulation.

regression

Learning Visuomotor Policies for Aerial Navigation Using Cross-Modal Representations

2 code implementations • 16 Sep 2019 • Rogerio Bonatti, Ratnesh Madaan, Vibhav Vineet, Sebastian Scherer, Ashish Kapoor

We analyze the rich latent spaces learned with our proposed representations, and show that the use of our cross-modal architecture significantly improves control policy performance as compared to end-to-end learning or purely unsupervised feature extractors.

Drone navigation Imitation Learning

191

Live Reconstruction of Large-Scale Dynamic Outdoor Worlds

1 code implementation • 15 Mar 2019 • Ondrej Miksik, Vibhav Vineet

For each time step, our dynamic map maintains a relative pose of each volume with respect to the stationary background.

3D Reconstruction Pose Estimation

Privacy-Preserving Action Recognition using Coded Aperture Videos

no code implementations • 25 Feb 2019 • Zihao W. Wang, Vibhav Vineet, Francesco Pittaluga, Sudipta Sinha, Oliver Cossairt, Sing Bing Kang

We propose a lens-free coded aperture camera system for human action recognition that is privacy-preserving.

Action Recognition Image Restoration +3

Photorealistic Image Synthesis for Object Instance Detection

no code implementations • 9 Feb 2019 • Tomas Hodan, Vibhav Vineet, Ran Gal, Emanuel Shalev, Jon Hanzelka, Treb Connell, Pedro Urbina, Sudipta N. Sinha, Brian Guenter

We present an approach to synthesize highly photorealistic images of 3D object models, which we use to train a convolutional neural network for detecting the objects in real images.

6D Pose Estimation 6D Pose Estimation using RGB +3

Playing for Data: Ground Truth from Computer Games

2 code implementations • 7 Aug 2016 • Stephan R. Richter, Vibhav Vineet, Stefan Roth, Vladlen Koltun

Recent progress in computer vision has been driven by high-capacity models trained on large datasets.

Semantic Segmentation

Feature Space Optimization for Semantic Video Segmentation

1 code implementation • CVPR 2016 • Abhijit Kundu, Vibhav Vineet, Vladlen Koltun

We present an approach to long-range spatio-temporal regularization in semantic video segmentation.

Segmentation Structured Prediction +2

Dense Monocular Depth Estimation in Complex Dynamic Scenes

no code implementations • CVPR 2016 • Rene Ranftl, Vibhav Vineet, Qifeng Chen, Vladlen Koltun

We present an approach to dense depth estimation from a single monocular camera that is moving through a dynamic scene.

Monocular Depth Estimation Motion Segmentation +1

SemanticPaint: A Framework for the Interactive Segmentation of 3D Scenes

no code implementations • 13 Oct 2015 • Stuart Golodetz, Michael Sapienza, Julien P. C. Valentin, Vibhav Vineet, Ming-Ming Cheng, Anurag Arnab, Victor A. Prisacariu, Olaf Kähler, Carl Yuheng Ren, David W. Murray, Shahram Izadi, Philip H. S. Torr

We present an open-source, real-time implementation of SemanticPaint, a system for geometric reconstruction, object-class segmentation and learning of 3D scenes.

Interactive Segmentation Segmentation

Conditional Random Fields as Recurrent Neural Networks

6 code implementations • ICCV 2015 • Shuai Zheng, Sadeep Jayasumana, Bernardino Romera-Paredes, Vibhav Vineet, Zhizhong Su, Dalong Du, Chang Huang, Philip H. S. Torr

Pixel-level labelling tasks, such as semantic segmentation, play a central role in image understanding.

Ranked #36 on Semantic Segmentation on PASCAL VOC 2012 test

Image Segmentation Real-Time Semantic Segmentation +1

1,334

Dense Semantic Image Segmentation with Objects and Attributes

no code implementations • CVPR 2014 • Shuai Zheng, Ming-Ming Cheng, Jonathan Warrell, Paul Sturgess, Vibhav Vineet, Carsten Rother, Philip H. S. Torr

The concepts of objects and attributes are both important for describing images precisely, since verbal descriptions often contain both adjectives and nouns (e.g. "I see a shiny red chair").

Attribute Image Segmentation +2

A Tiered Move-making Algorithm for General Non-submodular Pairwise Energies

no code implementations • 25 Mar 2014 • Vibhav Vineet, Jonathan Warrell, Philip H. S. Torr

The algorithm converges to a local minimum for any general pairwise potential, and we give a theoretical analysis of the properties of the algorithm, characterizing the situations in which we can expect good performance.

Image Denoising Image Segmentation +3

Higher Order Priors for Joint Intrinsic Image, Objects, and Attributes Estimation

no code implementations • NeurIPS 2013 • Vibhav Vineet, Carsten Rother, Philip Torr

Many methods have been proposed to recover the intrinsic scene properties such as shape, reflectance and illumination from a single image.

ImageSpirit: Verbal Guided Image Parsing

no code implementations • 16 Oct 2013 • Ming-Ming Cheng, Shuai Zheng, Wen-Yan Lin, Jonathan Warrell, Vibhav Vineet, Paul Sturgess, Nigel Crook, Niloy Mitra, Philip Torr

This allows us to formulate the image parsing problem as one of jointly estimating per-pixel object and attribute labels from a set of training images.

Attribute Object
