Search Results for author: Neel Joshi

Found 21 papers, 6 papers with code

HoloAssist: an Egocentric Human Interaction Dataset for Interactive AI Assistants in the Real World

no code implementations • ICCV 2023 • Xin Wang, Taein Kwon, Mahdi Rad, Bowen Pan, Ishani Chakraborty, Sean Andrist, Dan Bohus, Ashley Feniello, Bugra Tekin, Felipe Vieira Frujeri, Neel Joshi, Marc Pollefeys

Building an interactive AI assistant that can perceive, reason, and collaborate with humans in the real world has been a long-standing pursuit in the AI community.

Mistake Detection Mixed Reality +1

Beyond Generation: Harnessing Text to Image Models for Object Detection and Segmentation

1 code implementation • 12 Sep 2023 • Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Neel Joshi, Laurent Itti, Vibhav Vineet

A foreground-background segmentation algorithm is then used to generate foreground object masks.

Image Captioning Image Generation +3

Controllable Text-to-Image Generation with GPT-4

no code implementations • 29 May 2023 • Tianjun Zhang, Yi Zhang, Vibhav Vineet, Neel Joshi, Xin Wang

Control-GPT works by querying GPT-4 to write TikZ code, and the generated sketches are used as references alongside the text instructions for diffusion models (e.g., ControlNet) to generate photo-realistic images.

Instruction Following Text-to-Image Generation

Exploring the Sim2Real Gap Using Digital Twins

no code implementations • ICCV 2023 • Sruthi Sudhakar, Jon Hanzelka, Josh Bobillot, Tanmay Randhavane, Neel Joshi, Vibhav Vineet

An emerging alternative is to use synthetic data, but if the synthetic data is not similar enough to the real data, the performance is typically below that of training with real data.

Instance Segmentation Object Detection +2

PatchBlender: A Motion Prior for Video Transformers

no code implementations • 11 Nov 2022 • Gabriele Prato, Yale Song, Janarthanan Rajendran, R Devon Hjelm, Neel Joshi, Sarath Chandar

We show that our method is successful at enabling vision transformers to encode the temporal component of video data.

Neural-Sim: Learning to Generate Training Data with NeRF

1 code implementation • 22 Jul 2022 • Yunhao Ge, Harkirat Behl, Jiashu Xu, Suriya Gunasekar, Neel Joshi, Yale Song, Xin Wang, Laurent Itti, Vibhav Vineet

However, existing approaches either require human experts to manually tune each scene property or use automatic methods that provide little to no control; this requires rendering large amounts of random data variations, which is slow and is often suboptimal for the target domain.

Object Detection

154

Scaling Novel Object Detection with Weakly Supervised Detection Transformers

1 code implementation • 11 Jul 2022 • Tyler LaBonte, Yale Song, Xin Wang, Vibhav Vineet, Neel Joshi

A critical object detection task is finetuning an existing model to detect novel objects, but the standard workflow requires bounding box annotations which are time-consuming and expensive to collect.

Multiple Instance Learning Novel Object Detection +4

DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection

no code implementations • 20 Jun 2022 • Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Neel Joshi, Laurent Itti, Vibhav Vineet

For foreground object mask generation, we use a simple textual template with object class name as input to DALL-E to generate a diverse set of foreground images.

Image Captioning Image Generation +4

Visual Attention Emerges from Recurrent Sparse Reconstruction

1 code implementation • 23 Apr 2022 • Baifeng Shi, Yale Song, Neel Joshi, Trevor Darrell, Xin Wang

We present VARS, Visual Attention from Recurrent Sparse reconstruction, a new attention formulation built on two prominent features of the human visual attention mechanism: recurrency and sparsity.

One Network Doesn't Rule Them All: Moving Beyond Handcrafted Architectures in Self-Supervised Learning

no code implementations • 15 Mar 2022 • Sharath Girish, Debadeepta Dey, Neel Joshi, Vibhav Vineet, Shital Shah, Caio Cesar Teodoro Mendes, Abhinav Shrivastava, Yale Song

We conduct a large-scale study with over 100 variants of ResNet and MobileNet architectures and evaluate them across 11 downstream scenarios in the SSL setting.

Image Classification Self-Supervised Learning

Robust Contrastive Learning against Noisy Views

1 code implementation • CVPR 2022 • Ching-Yao Chuang, R Devon Hjelm, Xin Wang, Vibhav Vineet, Neel Joshi, Antonio Torralba, Stefanie Jegelka, Yale Song

Contrastive learning relies on an assumption that positive pairs contain related views, e.g., patches of an image or co-occurring multimodal signals of a video, that share certain underlying information about an instance.

Binary Classification Contrastive Learning

Depth Completion Using a View-constrained Deep Prior

no code implementations • 21 Jan 2020 • Pallabi Ghosh, Vibhav Vineet, Larry S. Davis, Abhinav Shrivastava, Sudipta Sinha, Neel Joshi

Given color images and noisy and incomplete target depth maps, we optimize a randomly-initialized CNN model to reconstruct a depth map restored by virtue of using the CNN network structure as a prior combined with a view-constrained photo-consistency loss.

Depth Completion Image Denoising

A deep active learning system for species identification and counting in camera trap images

1 code implementation • 22 Oct 2019 • Mohammad Sadegh Norouzzadeh, Dan Morris, Sara Beery, Neel Joshi, Nebojsa Jojic, Jeff Clune

However, the accuracy of results depends on the amount, quality, and diversity of the data available to train models, and the literature has focused on projects with millions of relevant, labeled training images.

Active Learning Decision Making +1

667

Synthetic Examples Improve Generalization for Rare Classes

no code implementations • 11 Apr 2019 • Sara Beery, Yang Liu, Dan Morris, Jim Piavis, Ashish Kapoor, Markus Meister, Neel Joshi, Pietro Perona

The ability to detect and classify rare occurrences in images has important applications, for example, counting rare and endangered species when studying biodiversity, or detecting infrequent traffic scenarios that pose a danger to self-driving cars.

Few-Shot Learning Self-Driving Cars

Learn-to-Score: Efficient 3D Scene Exploration by Predicting View Utility

no code implementations • ECCV 2018 • Benjamin Hepp, Debadeepta Dey, Sudipta N. Sinha, Ashish Kapoor, Neel Joshi, Otmar Hilliges

We propose to learn a better utility function that predicts the usefulness of future viewpoints.

Personalized Cinemagraphs using Semantic Understanding and Collaborative Learning

no code implementations • ICCV 2017 • Tae-Hyun Oh, Kyungdon Joo, Neel Joshi, Baoyuan Wang, In So Kweon, Sing Bing Kang

Cinemagraphs are a compelling way to convey dynamic aspects of a scene.

Object Recognition Semantic Segmentation

Highly curved image sensors: a practical approach for improved optical performance

no code implementations • 20 Jun 2017 • Brian Guenter, Neel Joshi, Richard Stoakley, Andrew Keefe, Kevin Geary, Ryan Freeman, Jake Hundley, Pamela Patterson, David Hammon, Guillermo Herrera, Elena Sherman, Andrew Nowak, Randall Schubert, Peter Brewer, Louis Yang, Russell Mott, Geoff McKnight

In this work we demonstrate that commercial silicon CMOS image sensors can be thinned and formed into accurate, highly curved optical surfaces with undiminished functionality.

Friction

Submodular Trajectory Optimization for Aerial 3D Scanning

no code implementations • ICCV 2017 • Mike Roberts, Debadeepta Dey, Anh Truong, Sudipta Sinha, Shital Shah, Ashish Kapoor, Pat Hanrahan, Neel Joshi

Drones equipped with cameras are emerging as a powerful tool for large-scale aerial 3D scanning, but existing automatic flight planners do not exploit all available information about the scene, and can therefore produce inaccurate and incomplete 3D models.

Trajectory Planning

Semantic-driven Generation of Hyperlapse from $360^\circ$ Video

no code implementations • 31 Mar 2017 • Wei-Sheng Lai, Yujia Huang, Neel Joshi, Chris Buehler, Ming-Hsuan Yang, Sing Bing Kang

We present a system for converting a fully panoramic ($360^\circ$) video into a normal field-of-view (NFOV) hyperlapse for an optimal viewing experience.

Video Stabilization

Lens Factory: Automatic Lens Generation Using Off-the-shelf Components

no code implementations • 30 Jun 2015 • Libin Sun, Brian Guenter, Neel Joshi, Patrick Therien, James Hays

Unfortunately, custom lens design is costly (thousands to tens of thousands of dollars), time-consuming (10-12 weeks typical lead time), and requires specialized optics design expertise.

Blind Image Quality Assessment using Semi-supervised Rectifier Networks

no code implementations • CVPR 2014 • Huixuan Tang, Neel Joshi, Ashish Kapoor

The biggest hurdles to these efforts are: 1) the difficulty of generalizing across diverse types of distortions and 2) collecting the enormity of human scored training data that is needed to learn the measure.

Blind Image Quality Assessment Image Quality Estimation
