Search Results for author: Neel Joshi

Found 21 papers, 6 papers with code

Controllable Text-to-Image Generation with GPT-4

no code implementations 29 May 2023 Tianjun Zhang, Yi Zhang, Vibhav Vineet, Neel Joshi, Xin Wang

Control-GPT works by querying GPT-4 to write TikZ code, and the generated sketches are used as references alongside the text instructions for diffusion models (e.g., ControlNet) to generate photo-realistic images.

Instruction Following Text-to-Image Generation

Exploring the Sim2Real Gap Using Digital Twins

no code implementations ICCV 2023 Sruthi Sudhakar, Jon Hanzelka, Josh Bobillot, Tanmay Randhavane, Neel Joshi, Vibhav Vineet

An emerging alternative is to use synthetic data, but if the synthetic data is not similar enough to the real data, the performance is typically below that of training with real data.

Instance Segmentation, object-detection, +2

PatchBlender: A Motion Prior for Video Transformers

no code implementations 11 Nov 2022 Gabriele Prato, Yale Song, Janarthanan Rajendran, R Devon Hjelm, Neel Joshi, Sarath Chandar

We show that our method is successful at enabling vision transformers to encode the temporal component of video data.

Neural-Sim: Learning to Generate Training Data with NeRF

1 code implementation 22 Jul 2022 Yunhao Ge, Harkirat Behl, Jiashu Xu, Suriya Gunasekar, Neel Joshi, Yale Song, Xin Wang, Laurent Itti, Vibhav Vineet

However, existing approaches either require human experts to manually tune each scene property or use automatic methods that provide little to no control; this requires rendering large amounts of random data variations, which is slow and is often suboptimal for the target domain.

Object Detection

Scaling Novel Object Detection with Weakly Supervised Detection Transformers

1 code implementation 11 Jul 2022 Tyler LaBonte, Yale Song, Xin Wang, Vibhav Vineet, Neel Joshi

A critical object detection task is finetuning an existing model to detect novel objects, but the standard workflow requires bounding box annotations which are time-consuming and expensive to collect.

Multiple Instance Learning, Novel Object Detection, +4

DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection

no code implementations 20 Jun 2022 Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Neel Joshi, Laurent Itti, Vibhav Vineet

For foreground object mask generation, we use a simple textual template with object class name as input to DALL-E to generate a diverse set of foreground images.

Image Captioning, Image Generation, +4

Visual Attention Emerges from Recurrent Sparse Reconstruction

1 code implementation 23 Apr 2022 Baifeng Shi, Yale Song, Neel Joshi, Trevor Darrell, Xin Wang

We present VARS, Visual Attention from Recurrent Sparse reconstruction, a new attention formulation built on two prominent features of the human visual attention mechanism: recurrency and sparsity.

Robust Contrastive Learning against Noisy Views

1 code implementation CVPR 2022 Ching-Yao Chuang, R Devon Hjelm, Xin Wang, Vibhav Vineet, Neel Joshi, Antonio Torralba, Stefanie Jegelka, Yale Song

Contrastive learning relies on an assumption that positive pairs contain related views, e.g., patches of an image or co-occurring multimodal signals of a video, that share certain underlying information about an instance.

Binary Classification, Contrastive Learning

Depth Completion Using a View-constrained Deep Prior

no code implementations 21 Jan 2020 Pallabi Ghosh, Vibhav Vineet, Larry S. Davis, Abhinav Shrivastava, Sudipta Sinha, Neel Joshi

Given color images and noisy and incomplete target depth maps, we optimize a randomly-initialized CNN model to reconstruct a depth map restored by virtue of using the CNN network structure as a prior combined with a view-constrained photo-consistency loss.

Depth Completion, Image Denoising

A deep active learning system for species identification and counting in camera trap images

1 code implementation 22 Oct 2019 Mohammad Sadegh Norouzzadeh, Dan Morris, Sara Beery, Neel Joshi, Nebojsa Jojic, Jeff Clune

However, the accuracy of results depends on the amount, quality, and diversity of the data available to train models, and the literature has focused on projects with millions of relevant, labeled training images.

Active Learning, Decision Making, +1

Synthetic Examples Improve Generalization for Rare Classes

no code implementations 11 Apr 2019 Sara Beery, Yang Liu, Dan Morris, Jim Piavis, Ashish Kapoor, Markus Meister, Neel Joshi, Pietro Perona

The ability to detect and classify rare occurrences in images has important applications - for example, counting rare and endangered species when studying biodiversity, or detecting infrequent traffic scenarios that pose a danger to self-driving cars.

Few-Shot Learning, Self-Driving Cars

Submodular Trajectory Optimization for Aerial 3D Scanning

no code implementations ICCV 2017 Mike Roberts, Debadeepta Dey, Anh Truong, Sudipta Sinha, Shital Shah, Ashish Kapoor, Pat Hanrahan, Neel Joshi

Drones equipped with cameras are emerging as a powerful tool for large-scale aerial 3D scanning, but existing automatic flight planners do not exploit all available information about the scene, and can therefore produce inaccurate and incomplete 3D models.

Trajectory Planning

Semantic-driven Generation of Hyperlapse from $360^\circ$ Video

no code implementations 31 Mar 2017 Wei-Sheng Lai, Yujia Huang, Neel Joshi, Chris Buehler, Ming-Hsuan Yang, Sing Bing Kang

We present a system for converting a fully panoramic ($360^\circ$) video into a normal field-of-view (NFOV) hyperlapse for an optimal viewing experience.

Video Stabilization

Lens Factory: Automatic Lens Generation Using Off-the-shelf Components

no code implementations 30 Jun 2015 Libin Sun, Brian Guenter, Neel Joshi, Patrick Therien, James Hays

Unfortunately, custom lens design is costly (thousands to tens of thousands of dollars), time consuming (10-12 weeks typical lead time), and requires specialized optics design expertise.

Blind Image Quality Assessment using Semi-supervised Rectifier Networks

no code implementations CVPR 2014 Huixuan Tang, Neel Joshi, Ashish Kapoor

The biggest hurdles to these efforts are: 1) the difficulty of generalizing across diverse types of distortions and 2) collecting the enormity of human scored training data that is needed to learn the measure.

Blind Image Quality Assessment, Image Quality Estimation
