Search Results for author: Abhinav Gupta

Found 175 papers, 72 papers with code

Beyond Games: Bringing Exploration to Robots in Real-world

no code implementations • ICLR 2019 • Deepak Pathak, Dhiraj Gandhi, Abhinav Gupta

But most importantly, we are able to implement an exploration policy on a robot which learns to interact with objects completely from scratch just using data collected via the differentiable exploration module.

Efficient Exploration

Paper
Add Code

G-HOP: Generative Hand-Object Prior for Interaction Reconstruction and Grasp Synthesis

no code implementations • 18 Apr 2024 • Yufei Ye, Abhinav Gupta, Kris Kitani, Shubham Tulsiani

We propose G-HOP, a denoising diffusion based generative prior for hand-object interactions that allows modeling both the 3D object and a human hand, conditioned on the object category.

Denoising Object

Paper
Add Code

Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling

1 code implementation • 15 Feb 2024 • Raunaq Bhirangi, Chenyu Wang, Venkatesh Pattabiraman, Carmel Majidi, Abhinav Gupta, Tess Hellebrekers, Lerrel Pinto

Reasoning from sequences of raw sensory data is a ubiquitous problem across fields ranging from medical devices to robotics.

Paper
Code

Towards Generalizable Zero-Shot Manipulation via Translating Human Interaction Plans

no code implementations • 1 Dec 2023 • Homanga Bharadhwaj, Abhinav Gupta, Vikash Kumar, Shubham Tulsiani

We pursue the goal of developing robots that can interact zero-shot with generic unseen objects via a diverse repertoire of manipulation skills and show how passive human videos can serve as a rich source of data for learning such generalist robots.

Robot Manipulation Translation

Paper
Add Code

Exploitation-Guided Exploration for Semantic Embodied Navigation

no code implementations • 6 Nov 2023 • Justin Wasserman, Girish Chowdhary, Abhinav Gupta, Unnat Jain

In the recent progress in embodied navigation and sim-to-robot transfer, modular policies have emerged as a de facto framework.

Benchmarking

Paper
Add Code

An Unbiased Look at Datasets for Visuo-Motor Pre-Training

no code implementations • 13 Oct 2023 • Sudeep Dasari, Mohan Kumar Srirama, Unnat Jain, Abhinav Gupta

Visual representation learning hold great promise for robotics, but is severely hampered by the scarcity and homogeneity of robotics datasets.

Representation Learning

Paper
Add Code

Diffusion-Guided Reconstruction of Everyday Hand-Object Interaction Clips

no code implementations • ICCV 2023 • Yufei Ye, Poorvi Hebbar, Abhinav Gupta, Shubham Tulsiani

We tackle the task of reconstructing hand-object interactions from short video clips.

Object

Paper
Add Code

RoboAgent: Generalization and Efficiency in Robot Manipulation via Semantic Augmentations and Action Chunking

no code implementations • 5 Sep 2023 • Homanga Bharadhwaj, Jay Vakil, Mohit Sharma, Abhinav Gupta, Shubham Tulsiani, Vikash Kumar

The grand aim of having a single robot that can manipulate arbitrary objects in diverse settings is at odds with the paucity of robotics datasets.

Chunking Robot Manipulation

Paper
Add Code

Evaluating Continual Learning on a Home Robot

no code implementations • 4 Jun 2023 • Sam Powers, Abhinav Gupta, Chris Paxton

Robots in home environments need to be able to learn new skills continuously as data becomes available, becoming ever more capable over time while using as little real-world data as possible.

Continual Learning

Paper
Add Code

Train Offline, Test Online: A Real Robot Learning Benchmark

1 code implementation • 1 Jun 2023 • Gaoyue Zhou, Victoria Dean, Mohan Kumar Srirama, Aravind Rajeswaran, Jyothish Pari, Kyle Hatch, Aryan Jain, Tianhe Yu, Pieter Abbeel, Lerrel Pinto, Chelsea Finn, Abhinav Gupta

Three challenges limit the progress of robot learning research: robots are expensive (few labs can participate), everyone uses different robots (findings do not generalize across labs), and we lack internet-scale robotics data.

Paper
Code

Visual Affordance Prediction for Guiding Robot Exploration

no code implementations • 28 May 2023 • Homanga Bharadhwaj, Abhinav Gupta, Shubham Tulsiani

Motivated by the intuitive understanding humans have about the space of possible interactions, and the ease with which they can generalize this understanding to previously unseen scenes, we develop an approach for learning visual affordances for guiding robot exploration.

Paper
Add Code

ArK: Augmented Reality with Knowledge Interactive Emergent Ability

no code implementations • 1 May 2023 • Qiuyuan Huang, Jae Sung Park, Abhinav Gupta, Paul Bennett, Ran Gong, Subhojit Som, Baolin Peng, Owais Khan Mohammed, Chris Pal, Yejin Choi, Jianfeng Gao

In this study, we develop an infinite agent that learns to transfer knowledge memory from general foundation models (e. g. GPT4, DALLE) to novel domains or scenarios for scene understanding and generation in the physical or virtual world.

Mixed Reality Scene Generation +1

Paper
Add Code

Affordance Diffusion: Synthesizing Hand-Object Interactions

no code implementations • CVPR 2023 • Yufei Ye, Xueting Li, Abhinav Gupta, Shalini De Mello, Stan Birchfield, Jiaming Song, Shubham Tulsiani, Sifei Liu

In contrast, in this work we focus on synthesizing complex interactions (ie, an articulated hand) with a given object.

Descriptive Image Generation +1

Paper
Add Code

Zero-Shot Robot Manipulation from Passive Human Videos

no code implementations • 3 Feb 2023 • Homanga Bharadhwaj, Abhinav Gupta, Shubham Tulsiani, Vikash Kumar

Can we learn robot manipulation for everyday tasks, only by watching videos of humans doing arbitrary tasks in different unstructured settings?

Robot Manipulation

Paper
Add Code

Generalized Neural Closure Models with Interpretability

1 code implementation • 15 Jan 2023 • Abhinav Gupta, Pierre F. J. Lermusiaux

Improving the predictive capability and computational cost of dynamical models is often at the heart of augmenting computational physics with machine learning (ML).

Paper
Code

Self-Activating Neural Ensembles for Continual Reinforcement Learning

1 code implementation • 31 Dec 2022 • Sam Powers, Eliot Xing, Abhinav Gupta

The ability for an agent to continuously learn new skills without catastrophically forgetting existing knowledge is of critical importance for the development of generally intelligent agents.

reinforcement-learning Reinforcement Learning (RL)

Paper
Code

Last-Mile Embodied Visual Navigation

1 code implementation • 21 Nov 2022 • Justin Wasserman, Karmesh Yadav, Girish Chowdhary, Abhinav Gupta, Unnat Jain

Realistic long-horizon tasks like image-goal navigation involve exploratory and exploitative phases.

Visual Navigation

Paper
Code

Bayesian Learning of Coupled Biogeochemical-Physical Models

no code implementations • 12 Nov 2022 • Abhinav Gupta, Pierre F. J. Lermusiaux

We develop a Bayesian model learning methodology that allows interpolation in the space of candidate models and discovery of new models from noisy, sparse, and indirect observations, all while estimating state fields and parameter values, as well as the joint PDFs of all learned quantities.

Paper
Add Code

All the Feels: A dexterous hand with large-area tactile sensing

no code implementations • 27 Oct 2022 • Raunaq Bhirangi, Abigail DeFranco, Jacob Adkins, Carmel Majidi, Abhinav Gupta, Tess Hellebrekers, Vikash Kumar

High cost and lack of reliability has precluded the widespread adoption of dexterous hands in robotics.

Paper
Add Code

Real World Offline Reinforcement Learning with Realistic Data Source

no code implementations • 12 Oct 2022 • Gaoyue Zhou, Liyiming Ke, Siddhartha Srinivasa, Abhinav Gupta, Aravind Rajeswaran, Vikash Kumar

Offline reinforcement learning (ORL) holds great promise for robot learning due to its ability to learn from arbitrary pre-generated experience.

Imitation Learning reinforcement-learning +1

Paper
Add Code

Learning State-Aware Visual Representations from Audible Interactions

1 code implementation • 27 Sep 2022 • Himangi Mittal, Pedro Morgado, Unnat Jain, Abhinav Gupta

However, learning representations from videos can be challenging.

Ranked #1 on Long Term Action Anticipation on Ego4D (ED@20 Noun metric)

Action Anticipation Action Recognition +3

Paper
Code

Learning Dexterous Manipulation from Exemplar Object Trajectories and Pre-Grasps

1 code implementation • 22 Sep 2022 • Sudeep Dasari, Abhinav Gupta, Vikash Kumar

This paper seeks to escape these constraints, by developing a Pre-Grasp informed Dexterous Manipulation (PGDM) framework that generates diverse dexterous manipulation behaviors, without any task-specific reasoning or hyper-parameter tuning.

Efficient Exploration

Paper
Code

Human-to-Robot Imitation in the Wild

no code implementations • 19 Jul 2022 • Shikhar Bahl, Abhinav Gupta, Deepak Pathak

We approach the problem of learning by watching humans in the wild.

Paper
Add Code

Can Foundation Models Perform Zero-Shot Task Specification For Robot Manipulation?

no code implementations • 23 Apr 2022 • Yuchen Cui, Scott Niekum, Abhinav Gupta, Vikash Kumar, Aravind Rajeswaran

Task specification is at the core of programming autonomous robots.

Robot Manipulation Scene Understanding

Paper
Add Code

What's in your hands? 3D Reconstruction of Generic Objects in Hands

1 code implementation • CVPR 2022 • Yufei Ye, Abhinav Gupta, Shubham Tulsiani

Our work aims to reconstruct hand-held objects given a single RGB image.

3D Pose Estimation 3D Reconstruction +2

108

Paper
Code

Pre-train, Self-train, Distill: A simple recipe for Supersizing 3D Reconstruction

no code implementations • CVPR 2022 • Kalyan Vasudev Alwala, Abhinav Gupta, Shubham Tulsiani

Our final 3D reconstruction model is also capable of zero-shot inference on images from unseen object categories and we empirically show that increasing the number of training categories improves the reconstruction quality.

3D Reconstruction Single-View 3D Reconstruction

Paper
Add Code

R3M: A Universal Visual Representation for Robot Manipulation

1 code implementation • 23 Mar 2022 • Suraj Nair, Aravind Rajeswaran, Vikash Kumar, Chelsea Finn, Abhinav Gupta

We study how visual representations pre-trained on diverse human video data can enable data-efficient learning of downstream robotic manipulation tasks.

Contrastive Learning Robot Manipulation

245

Paper
Code

The Challenges of Continuous Self-Supervised Learning

no code implementations • 23 Mar 2022 • Senthil Purushwalkam, Pedro Morgado, Abhinav Gupta

As a result, SSL holds the promise to learn representations from data in-the-wild, i. e., without the need for finite and static datasets.

Representation Learning Self-Supervised Learning

Paper
Add Code

The Unsurprising Effectiveness of Pre-Trained Vision Models for Control

no code implementations • 7 Mar 2022 • Simone Parisi, Aravind Rajeswaran, Senthil Purushwalkam, Abhinav Gupta

In this context, we revisit and study the role of pre-trained visual representations for control, and in particular representations trained on large-scale computer vision datasets.

Paper
Add Code

Interesting Object, Curious Agent: Learning Task-Agnostic Exploration

1 code implementation • NeurIPS 2021 • Simone Parisi, Victoria Dean, Deepak Pathak, Abhinav Gupta

In this setup, the agent first learns to explore across many environments without any extrinsic goal in a task-agnostic manner.

Object

Paper
Code

A Differentiable Recipe for Learning Visual Non-Prehensile Planar Manipulation

1 code implementation • 9 Nov 2021 • Bernardo Aceituno, Alberto Rodriguez, Shubham Tulsiani, Abhinav Gupta, Mustafa Mukadam

Specifying tasks with videos is a powerful technique towards acquiring novel and general robot skills.

Contact mechanics

Paper
Code

ReSkin: versatile, replaceable, lasting tactile skins

1 code implementation • 29 Oct 2021 • Raunaq Bhirangi, Tess Hellebrekers, Carmel Majidi, Abhinav Gupta

Soft sensors have continued growing interest in robotics, due to their ability to enable both passive conformal contact from the material properties and active contact data from the sensor properties.

BIG-bench Machine Learning Self-Supervised Learning

Paper
Code

Dynamic population-based meta-learning for multi-agent communication with natural language

no code implementations • NeurIPS 2021 • Abhinav Gupta, Marc Lanctot, Angeliki Lazaridou

In this work, our goal is to train agents that can coordinate with seen, unseen as well as human partners in a multi-agent communication environment involving natural language.

Attribute Meta-Learning +1

Paper
Add Code

CORA: Benchmarks, Baselines, and Metrics as a Platform for Continual Reinforcement Learning Agents

2 code implementations • 19 Oct 2021 • Sam Powers, Eliot Xing, Eric Kolve, Roozbeh Mottaghi, Abhinav Gupta

In this work, we present CORA, a platform for Continual Reinforcement Learning Agents that provides benchmarks, baselines, and metrics in a single code package.

NetHack reinforcement-learning +1

448

Paper
Code

No RL, No Simulation: Learning to Navigate without Navigating

1 code implementation • NeurIPS 2021 • Meera Hahn, Devendra Chaplot, Shubham Tulsiani, Mustafa Mukadam, James M. Rehg, Abhinav Gupta

Most prior methods for learning navigation policies require access to simulation environments, as they need online policy interaction and rely on ground-truth maps for rewards.

Navigate Reinforcement Learning (RL)

Paper
Code

Learning Multi-Objective Curricula for Robotic Policy Learning

1 code implementation • 6 Oct 2021 • Jikun Kang, Miao Liu, Abhinav Gupta, Chris Pal, Xue Liu, Jie Fu

Various automatic curriculum learning (ACL) methods have been proposed to improve the sample efficiency and final performance of deep reinforcement learning (DRL).

Reinforcement Learning (RL)

Paper
Code

The Functional Correspondence Problem

no code implementations • ICCV 2021 • Zihang Lai, Senthil Purushwalkam, Abhinav Gupta

For example, what are the correspondences between a bottle and shoe for the task of pounding or the task of pouring.

Paper
Add Code

Wanderlust: Online Continual Object Detection in the Real World

1 code implementation • ICCV 2021 • Jianren Wang, Xin Wang, Yue Shang-Guan, Abhinav Gupta

To bridge the gap, we present a new online continual object detection benchmark with an egocentric video dataset, Objects Around Krishna (OAK).

Continual Learning Object +2

Paper
Code

Hierarchical Neural Dynamic Policies

no code implementations • 12 Jul 2021 • Shikhar Bahl, Abhinav Gupta, Deepak Pathak

We tackle the problem of generalization to unseen configurations for dynamic tasks in the real world while learning from high-dimensional image input.

Paper
Add Code

Digital-Twin-Based Improvements to Diagnosis, Prognosis, Strategy Assessment, and Discrepancy Checking in a Nearly Autonomous Management and Control System

no code implementations • 23 May 2021 • Linyu Lin, Paridhi Athe, Pascal Rouxelin, Maria Avramova, Abhinav Gupta, Robert Youngblood, Nam Dinh

The Nearly Autonomous Management and Control System (NAMAC) is a comprehensive control system that assists plant operations by furnishing control recommendations to operators in a broad class of situations.

Attribute BIG-bench Machine Learning +2

Paper
Add Code

PixelTransformer: Sample Conditioned Signal Generation

no code implementations • 29 Mar 2021 • Shubham Tulsiani, Abhinav Gupta

We propose a generative model that can infer a distribution for the underlying spatial signal conditioned on sparse samples e. g. plausible images given a few observed pixels.

Paper
Add Code

Learn-to-Race: A Multimodal Control Environment for Autonomous Racing

1 code implementation • ICCV 2021 • James Herman, Jonathan Francis, Siddha Ganju, Bingqing Chen, Anirudh Koul, Abhinav Gupta, Alexey Skabelkin, Ivan Zhukov, Max Kumskoy, Eric Nyberg

Existing research on autonomous driving primarily focuses on urban driving, which is insufficient for characterising the complex driving behaviour underlying high-speed racing.

Autonomous Driving Trajectory Prediction

131

Paper
Code

Meta Learning for Multi-agent Communication

no code implementations • ICLR Workshop Learning_to_Learn 2021 • Abhinav Gupta, Angeliki Lazaridou, Marc Lanctot

Recent works have shown remarkable progress in training artificial agents to understand natural language but are focused on using large amounts of raw data involving huge compute requirements.

Meta-Learning Meta Reinforcement Learning

Paper
Add Code

Shelf-Supervised Mesh Prediction in the Wild

1 code implementation • CVPR 2021 • Yufei Ye, Shubham Tulsiani, Abhinav Gupta

We first infer a volumetric representation in a canonical frame, along with the camera pose.

Paper
Code

droidlet: modular, heterogenous, multi-modal agents

1 code implementation • 25 Jan 2021 • Anurag Pratik, Soumith Chintala, Kavya Srinet, Dhiraj Gandhi, Rebecca Qian, Yuxuan Sun, Ryan Drew, Sara Elkafrawy, Anoushka Tiwari, Tucker Hart, Mary Williamson, Abhinav Gupta, Arthur Szlam

In recent years, there have been significant advances in building end-to-end Machine Learning (ML) systems that learn at scale.

829

Paper
Code

Where2Act: From Pixels to Actions for Articulated 3D Objects

1 code implementation • ICCV 2021 • Kaichun Mo, Leonidas Guibas, Mustafa Mukadam, Abhinav Gupta, Shubham Tulsiani

One of the fundamental goals of visual perception is to allow agents to meaningfully interact with their environment.

Paper
Code

Audio-Visual Floorplan Reconstruction

1 code implementation • ICCV 2021 • Senthil Purushwalkam, Sebastian Vicenc Amengual Gari, Vamsi Krishna Ithapu, Carl Schissler, Philip Robinson, Abhinav Gupta, Kristen Grauman

Given only a few glimpses of an environment, how much can we infer about its entire floorplan?

Paper
Code

Neural Closure Models for Dynamical Systems

1 code implementation • 27 Dec 2020 • Abhinav Gupta, Pierre F. J. Lermusiaux

The new "neural closure models" augment low-fidelity models with neural delay differential equations (nDDEs), motivated by the Mori-Zwanzig formulation and the inherent delays in complex dynamical systems.

Paper
Code

KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA

no code implementations • CVPR 2021 • Kenneth Marino, Xinlei Chen, Devi Parikh, Abhinav Gupta, Marcus Rohrbach

One of the most challenging question types in VQA is when answering the question requires outside knowledge not present in the image.

Ranked #6 on Visual Question Answering (VQA) on A-OKVQA

Visual Question Answering (VQA)

Paper
Add Code

A 55-line code for large-scale parallel topology optimization in 2D and 3D

1 code implementation • 15 Dec 2020 • Abhinav Gupta, Rajib Chowdhury, Anupam Chakrabarti, Timon Rabczuk

This paper presents a 55-line code written in python for 2D and 3D topology optimization (TO) based on the open-source finite element computing software (FEniCS), equipped with various finite element tools and solvers.

Mathematical Software Computational Engineering, Finance, and Science Optimization and Control

Paper
Code

Neural Dynamic Policies for End-to-End Sensorimotor Learning

no code implementations • NeurIPS 2020 • Shikhar Bahl, Mustafa Mukadam, Abhinav Gupta, Deepak Pathak

We show that NDPs outperform the prior state-of-the-art in terms of either efficiency or performance across several robotic control tasks for both imitation and reinforcement learning setups.

Ranked #4 on Meta-Learning on MT50

Imitation Learning reinforcement-learning +1

Paper
Add Code

Same Object, Different Grasps: Data and Semantic Knowledge for Task-Oriented Grasping

1 code implementation • 12 Nov 2020 • Adithyavairavan Murali, Weiyu Liu, Kenneth Marino, Sonia Chernova, Abhinav Gupta

This is largely due to the scale of the datasets both in terms of the number of objects and tasks studied.

Robotic Grasping

Paper
Code

Transformers for One-Shot Visual Imitation

no code implementations • 11 Nov 2020 • Sudeep Dasari, Abhinav Gupta

Humans are able to seamlessly visually imitate others, by inferring their intentions and using past experience to achieve the same end goal.

Imitation Learning

Paper
Add Code

Ask Your Humans: Using Human Instructions to Improve Generalization in Reinforcement Learning

1 code implementation • ICLR 2021 • Valerie Chen, Abhinav Gupta, Kenneth Marino

We also find that incorporating natural language allows the model to generalize to unseen tasks in a zero-shot setting and to learn quickly from a few demonstrations.

Multi-Task Learning reinforcement-learning +1

Paper
Code

Development and Assessment of a Nearly Autonomous Management and Control System for Advanced Reactors

no code implementations • 29 Aug 2020 • Linyu Lin, Paridhi Athe, Pascal Rouxelin, Robert Youngblood, Abhinav Gupta, Jeffrey Lane, Maria Avramova, Nam Dinh

This paper develops a Nearly Autonomous Management and Control (NAMAC) system for advanced reactors.

Management

Paper
Add Code

Visual Imitation Made Easy

no code implementations • 11 Aug 2020 • Sarah Young, Dhiraj Gandhi, Shubham Tulsiani, Abhinav Gupta, Pieter Abbeel, Lerrel Pinto

We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.

Imitation Learning

Paper
Add Code

Demystifying Contrastive Self-Supervised Learning: Invariances, Augmentations and Dataset Biases

no code implementations • NeurIPS 2020 • Senthil Purushwalkam, Abhinav Gupta

Second, we demonstrate that these approaches obtain further gains from access to a clean object-centric training dataset like Imagenet.

Classification General Classification +8

Paper
Add Code

Implicit Mesh Reconstruction from Unannotated Image Collections

no code implementations • 16 Jul 2020 • Shubham Tulsiani, Nilesh Kulkarni, Abhinav Gupta

We present an approach to infer the 3D shape, texture, and camera pose for an object from a single RGB image, using only category-level image collections with foreground masks as supervision.

Paper
Add Code

Aligning Videos in Space and Time

no code implementations • ECCV 2020 • Senthil Purushwalkam, Tian Ye, Saurabh Gupta, Abhinav Gupta

During training, given a pair of videos, we compute cycles that connect patches in a given frame in the first video by matching through frames in the second video.

Paper
Add Code

See, Hear, Explore: Curiosity via Audio-Visual Association

1 code implementation • NeurIPS 2020 • Victoria Dean, Shubham Tulsiani, Abhinav Gupta

Exploration is one of the core challenges in reinforcement learning.

Efficient Exploration

Paper
Code

Swoosh! Rattle! Thump! -- Actions that Sound

no code implementations • 3 Jul 2020 • Dhiraj Gandhi, Abhinav Gupta, Lerrel Pinto

In this work, we perform the first large-scale study of the interactions between sound and robotic action.

Paper
Add Code

Object Goal Navigation using Goal-Oriented Semantic Exploration

2 code implementations • NeurIPS 2020 • Devendra Singh Chaplot, Dhiraj Gandhi, Abhinav Gupta, Ruslan Salakhutdinov

We propose a modular system called, `Goal-Oriented Semantic Exploration' which builds an episodic semantic map and uses it to explore the environment efficiently based on the goal object category.

Ranked #4 on Robot Navigation on Habitat 2020 Object Nav test-std

Object Robot Navigation

292

Paper
Code

Compositionality and Capacity in Emergent Languages

no code implementations • WS 2020 • Abhinav Gupta, Cinjon Resnick, Jakob Foerster, Andrew Dai, Kyunghyun Cho

Our hypothesis is that there should be a specific range of model capacity and channel bandwidth that induces compositional structure in the resulting language and consequently encourages systematic generalization.

Open-Ended Question Answering Systematic Generalization

Paper
Add Code

Empirically Verifying Hypotheses Using Reinforcement Learning

no code implementations • 29 Jun 2020 • Kenneth Marino, Rob Fergus, Arthur Szlam, Abhinav Gupta

This paper formulates hypothesis verification as an RL problem.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Learning Robot Skills with Temporal Variational Inference

no code implementations • ICML 2020 • Tanmay Shankar, Abhinav Gupta

In this paper, we address the discovery of robotic options from demonstrations in an unsupervised manner.

Variational Inference

Paper
Add Code

Semantic Curiosity for Active Visual Learning

no code implementations • ECCV 2020 • Devendra Singh Chaplot, Helen Jiang, Saurabh Gupta, Abhinav Gupta

Instead, we explore a self-supervised approach for training our exploration policy by introducing a notion of semantic curiosity.

Object object-detection +1

Paper
Add Code

Neural Topological SLAM for Visual Navigation

no code implementations • CVPR 2020 • Devendra Singh Chaplot, Ruslan Salakhutdinov, Abhinav Gupta, Saurabh Gupta

This paper studies the problem of image-goal navigation which involves navigating to the location indicated by a goal image in a novel previously unseen environment.

Visual Navigation

Paper
Add Code

Learning to Explore using Active Neural SLAM

2 code implementations • ICLR 2020 • Devendra Singh Chaplot, Dhiraj Gandhi, Saurabh Gupta, Abhinav Gupta, Ruslan Salakhutdinov

The use of learning provides flexibility with respect to input modalities (in the SLAM module), leverages structural regularities of the world (in global policies), and provides robustness to errors in state estimation (in local policies).

PointGoal Navigation

712

Paper
Code

Articulation-aware Canonical Surface Mapping

1 code implementation • CVPR 2020 • Nilesh Kulkarni, Abhinav Gupta, David F. Fouhey, Shubham Tulsiani

We tackle the tasks of: 1) predicting a Canonical Surface Mapping (CSM) that indicates the mapping from 2D pixels to corresponding points on a canonical template shape, and 2) inferring the articulation and pose of the template corresponding to the input image.

Paper
Code

Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects

3 code implementations • CVPR 2020 • Kiana Ehsani, Shubham Tulsiani, Saurabh Gupta, Ali Farhadi, Abhinav Gupta

Our quantitative and qualitative results show that (a) we can predict meaningful forces from videos whose effects lead to accurate imitation of the motions observed, (b) by jointly optimizing for contact point and force prediction, we can improve the performance on both tasks in comparison to independent training, and (c) we can learn a representation from this model that generalizes to novel objects using few shot examples.

Human-Object Interaction Detection

Paper
Code

Beyond the Camera: Neural Networks in World Coordinates

no code implementations • 12 Mar 2020 • Gunnar A. Sigurdsson, Abhinav Gupta, Cordelia Schmid, Karteek Alahari

Eye movement and strategic placement of the visual field onto the retina, gives animals increased resolution of the scene and suppresses distracting information.

Action Recognition Video Stabilization +1

Paper
Add Code

Intrinsic Motivation for Encouraging Synergistic Behavior

no code implementations • ICLR 2020 • Rohan Chitnis, Shubham Tulsiani, Saurabh Gupta, Abhinav Gupta

Our key idea is that a good guiding principle for intrinsic motivation in synergistic tasks is to take actions which affect the world in ways that would not be achieved if the agents were acting on their own.

Paper
Add Code

On the interaction between supervision and self-play in emergent communication

1 code implementation • ICLR 2020 • Ryan Lowe, Abhinav Gupta, Jakob Foerster, Douwe Kiela, Joelle Pineau

A promising approach for teaching artificial agents to use natural language involves using human-in-the-loop training.

Paper
Code

Structural Inductive Biases in Emergent Communication

no code implementations • 4 Feb 2020 • Agnieszka Słowik, Abhinav Gupta, William L. Hamilton, Mateja Jamnik, Sean B. Holden, Christopher Pal

In order to communicate, humans flatten a complex representation of ideas and their attributes into a single word or a sentence.

Representation Learning Sentence

Paper
Add Code

Towards Graph Representation Learning in Emergent Communication

no code implementations • 24 Jan 2020 • Agnieszka Słowik, Abhinav Gupta, William L. Hamilton, Mateja Jamnik, Sean B. Holden

Recent findings in neuroscience suggest that the human brain represents information in a geometric structure (for instance, through conceptual spaces).

Graph Representation Learning Sentence

Paper
Add Code

Discovering Motor Programs by Recomposing Demonstrations

no code implementations • ICLR 2020 • Tanmay Shankar, Shubham Tulsiani, Lerrel Pinto, Abhinav Gupta

In this paper, we present an approach to learn recomposable motor primitives across large-scale and diverse manipulation demonstrations.

Hierarchical Reinforcement Learning

Paper
Add Code

ClusterFit: Improving Generalization of Visual Representations

1 code implementation • CVPR 2020 • Xueting Yan, Ishan Misra, Abhinav Gupta, Deepti Ghadiyaram, Dhruv Mahajan

Pre-training convolutional neural networks with weakly-supervised and self-supervised strategies is becoming increasingly popular for several computer vision tasks.

Ranked #53 on Image Classification on iNaturalist 2018

Action Classification Clustering +2

3,229

Paper
Code

Third-Person Visual Imitation Learning via Decoupled Hierarchical Controller

1 code implementation • NeurIPS 2019 • Pratyusha Sharma, Deepak Pathak, Abhinav Gupta

We study a generalized setup for learning from demonstration to build an agent that can manipulate novel objects in unseen scenarios by looking at only a single video of human demonstration from a third-person perspective.

Imitation Learning

Paper
Code

Seeded self-play for language learning

no code implementations • WS 2019 • Abhinav Gupta, Ryan Lowe, Jakob Foerster, Douwe Kiela, Joelle Pineau

Once the meta-learning agent is able to quickly adapt to each population of agents, it can be deployed in new populations, including populations speaking human language.

Imitation Learning Meta-Learning

Paper
Add Code

Capacity, Bandwidth, and Compositionality in Emergent Language Learning

1 code implementation • 24 Oct 2019 • Cinjon Resnick, Abhinav Gupta, Jakob Foerster, Andrew M. Dai, Kyunghyun Cho

In this paper, we investigate the learning biases that affect the efficacy and compositionality of emergent languages.

Open-Ended Question Answering Systematic Generalization

Paper
Code

Object-centric Forward Modeling for Model Predictive Control

1 code implementation • 8 Oct 2019 • Yufei Ye, Dhiraj Gandhi, Abhinav Gupta, Shubham Tulsiani

We present an approach to learn an object-centric forward model, and show that this allows us to plan for sequences of actions to achieve distant desired goals.

Model Predictive Control Object

Paper
Code

Efficient Bimanual Manipulation Using Learned Task Schemas

no code implementations • 30 Sep 2019 • Rohan Chitnis, Shubham Tulsiani, Saurabh Gupta, Abhinav Gupta

Our insight is that for many tasks, the learning process can be decomposed into learning a state-independent task schema (a sequence of skills to execute) and a policy to choose the parameterizations of the skills in a state-dependent manner.

Paper
Add Code

Agent as Scientist: Learning to Verify Hypotheses

no code implementations • 25 Sep 2019 • Kenneth Marino, Rob Fergus, Arthur Szlam, Abhinav Gupta

In order to train the agents, we exploit the underlying structure in the majority of hypotheses -- they can be formulated as triplets (pre-condition, action sequence, post-condition).

Paper
Add Code

Swoosh! Rattle! Thump! - Actions that Sound

no code implementations • 25 Sep 2019 • Dhiraj Gandhi, Abhinav Gupta, Lerrel Pinto

In this work, we perform the first large-scale study of the interactions between sound and robotic action.

Object

Paper
Add Code

Dynamics-aware Embeddings

2 code implementations • ICLR 2020 • William Whitney, Rajat Agarwal, Kyunghyun Cho, Abhinav Gupta

In this paper we consider self-supervised representation learning to improve sample efficiency in reinforcement learning (RL).

Continuous Control reinforcement-learning +2

Paper
Code

Compositional Video Prediction

2 code implementations • ICCV 2019 • Yufei Ye, Maneesh Singh, Abhinav Gupta, Shubham Tulsiani

We present an approach for pixel-level future prediction given an input image of a scene.

Future prediction Video Prediction

Paper
Code

Environment Probing Interaction Policies

1 code implementation • ICLR 2019 • Wenxuan Zhou, Lerrel Pinto, Abhinav Gupta

A key challenge in reinforcement learning (RL) is environment generalization: a policy trained to solve a task in one environment often fails to solve the same task in a slightly different test environment.

Reinforcement Learning (RL)

Paper
Code

Canonical Surface Mapping via Geometric Cycle Consistency

1 code implementation • ICCV 2019 • Nilesh Kulkarni, Abhinav Gupta, Shubham Tulsiani

We explore the task of Canonical Surface Mapping (CSM).

185

Paper
Code

PyRobot: An Open-source Robotics Framework for Research and Benchmarking

2 code implementations • 19 Jun 2019 • Adithyavairavan Murali, Tao Chen, Kalyan Vasudev Alwala, Dhiraj Gandhi, Lerrel Pinto, Saurabh Gupta, Abhinav Gupta

This paper introduces PyRobot, an open-source robotics framework for research and benchmarking.

Benchmarking Robotic Grasping +2

2,203

Paper
Code

Self-Supervised Exploration via Disagreement

2 code implementations • 10 Jun 2019 • Deepak Pathak, Dhiraj Gandhi, Abhinav Gupta

In this paper, we propose a formulation for exploration inspired by the work in active learning literature.

Ranked #1 on Unsupervised Reinforcement Learning on URLB (states, 10^5 frames)

Active Learning Efficient Exploration +2

318

Paper
Code

3D-RelNet: Joint Object and Relational Network for 3D Prediction

no code implementations • ICCV 2019 • Nilesh Kulkarni, Ishan Misra, Shubham Tulsiani, Abhinav Gupta

We propose an approach to predict the 3D shape and pose for the objects present in a scene.

Object

Paper
Add Code

Task-Driven Modular Networks for Zero-Shot Compositional Learning

1 code implementation • ICCV 2019 • Senthil Purushwalkam, Maximilian Nickel, Abhinav Gupta, Marc'Aurelio Ranzato

When extending the evaluation to the generalized setting which accounts also for pairs seen during training, we discover that naive baseline methods perform similarly or better than current approaches.

Attribute Novel Concepts +1

Paper
Code

Scaling and Benchmarking Self-Supervised Visual Representation Learning

2 code implementations • ICCV 2019 • Priya Goyal, Dhruv Mahajan, Abhinav Gupta, Ishan Misra

Self-supervised learning aims to learn representations from the data itself without explicit manual supervision.

Benchmarking object-detection +5

589

Paper
Code

Hierarchical RL Using an Ensemble of Proprioceptive Periodic Policies

no code implementations • ICLR 2019 • Kenneth Marino, Abhinav Gupta, Rob Fergus, Arthur Szlam

The high-level policy is trained using a sparse, task-dependent reward, and operates by choosing which of the low-level policies to run at any given time.

Paper
Add Code

Bounce and Learn: Modeling Scene Dynamics with Real-World Bounces

no code implementations • ICLR 2019 • Senthil Purushwalkam, Abhinav Gupta, Danny M. Kaufman, Bryan Russell

To achieve our results, we introduce the Bounce Dataset comprising 5K RGB-D videos of bouncing trajectories of a foam ball to probe surfaces of varying shapes and materials in everyday scenes including homes and offices.

Paper
Add Code

Learning Exploration Policies for Navigation

2 code implementations • ICLR 2019 • Tao Chen, Saurabh Gupta, Abhinav Gupta

Numerous past works have tackled the problem of task-driven navigation.

Efficient Exploration General Reinforcement Learning +2

Paper
Code

Beyond Grids: Learning Graph Representations for Visual Recognition

no code implementations • NeurIPS 2018 • Yin Li, Abhinav Gupta

Our method further learns to propagate information across all vertices on the graph, and is able to project the learned graph representation back into 2D grids.

Instance Segmentation object-detection +3

Paper
Add Code

Hardware Conditioned Policies for Multi-Robot Transfer Learning

1 code implementation • NeurIPS 2018 • Tao Chen, Adithyavairavan Murali, Abhinav Gupta

In tasks where knowing the agent dynamics is important for success, we learn an embedding for robot hardware and show that policies conditioned on the encoding of hardware tend to generalize and transfer well.

Industrial Robots Transfer Reinforcement Learning

Paper
Code

Multiple Interactions Made Easy (MIME): Large Scale Demonstrations Data for Imitation

no code implementations • 16 Oct 2018 • Pratyusha Sharma, Lekha Mohan, Lerrel Pinto, Abhinav Gupta

In order to make progress and capture the space of manipulation, we would need to collect a large-scale dataset of diverse tasks such as pouring, opening bottles, stacking objects etc.

Trajectory Prediction

Paper
Add Code

Visual Semantic Navigation using Scene Priors

1 code implementation • ICLR 2019 • Wei Yang, Xiaolong Wang, Ali Farhadi, Abhinav Gupta, Roozbeh Mottaghi

Do we use the semantic/functional priors we have built over years to efficiently search and navigate?

Navigate

Paper
Code

BOLD5000: A public fMRI dataset of 5000 images

3 code implementations • 5 Sep 2018 • Nadine Chang, John A. Pyles, Abhinav Gupta, Michael J. Tarr, Elissa M. Aminoff

Vision science, particularly machine vision, has been revolutionized by introducing large-scale image datasets and statistical learning approaches.

Scene Understanding

Paper
Code

Compositional Learning for Human Object Interaction

no code implementations • ECCV 2018 • Keizo Kato, Yin Li, Abhinav Gupta

The world of human-object interactions is rich.

Human-Object Interaction Detection Object +1

Paper
Add Code

Interpretable Intuitive Physics Model

no code implementations • ECCV 2018 • Tian Ye, Xiaolong Wang, James Davidson, Abhinav Gupta

In order to demonstrate that our system models these underlying physical properties, we train our model on collisions of different shapes (cube, cone, cylinder, spheres etc.)

Friction

Paper
Add Code

Robot Learning in Homes: Improving Generalization and Reducing Dataset Bias

no code implementations • NeurIPS 2018 • Abhinav Gupta, Adithyavairavan Murali, Dhiraj Gandhi, Lerrel Pinto

The models trained with our home dataset showed a marked improvement of 43. 7% over a baseline model trained with data collected in lab.

Robotic Grasping

Paper
Add Code

Videos as Space-Time Region Graphs

no code implementations • ECCV 2018 • Xiaolong Wang, Abhinav Gupta

These nodes are connected by two types of relations: (i) similarity relations capturing the long range dependencies between correlated objects and (ii) spatial-temporal relations capturing the interactions between nearby objects.

Ranked #34 on Action Classification on Charades (using extra training data)

Action Classification Action Recognition

Paper
Add Code

Learning to Grasp Without Seeing

no code implementations • 10 May 2018 • Adithyavairavan Murali, Yin Li, Dhiraj Gandhi, Abhinav Gupta

We believe this is the first attempt at learning to grasp with only tactile sensing and without any prior object knowledge.

Object Localization

Paper
Add Code

Charades-Ego: A Large-Scale Dataset of Paired Third and First Person Videos

no code implementations • 25 Apr 2018 • Gunnar A. Sigurdsson, Abhinav Gupta, Cordelia Schmid, Ali Farhadi, Karteek Alahari

In this paper we describe the egocentric aspect of the dataset and present annotations for Charades-Ego with 68, 536 activity instances in 68. 8 hours of first and third-person video, making it one of the largest and most diverse egocentric datasets available.

General Classification Video Classification +1

Paper
Add Code

Actor and Observer: Joint Modeling of First and Third-Person Videos

1 code implementation • CVPR 2018 • Gunnar A. Sigurdsson, Abhinav Gupta, Cordelia Schmid, Ali Farhadi, Karteek Alahari

Several theories in cognitive neuroscience suggest that when people interact with the world, or simulate interactions, they do so from a first-person egocentric perspective, and seamlessly transfer knowledge between third-person (observer) and first-person (actor).

Action Recognition Temporal Action Localization

Paper
Code

Binge Watching: Scaling Affordance Learning from Sitcoms

no code implementations • CVPR 2017 • Xiaolong Wang, Rohit Girdhar, Abhinav Gupta

In this paper, we tackle the challenge of creating one of the biggest dataset for learning affordances.

Paper
Add Code

Iterative Visual Reasoning Beyond Convolutions

no code implementations • CVPR 2018 • Xinlei Chen, Li-Jia Li, Li Fei-Fei, Abhinav Gupta

The framework consists of two core modules: a local module that uses spatial memory to store previous beliefs with parallel updates; and a global graph-reasoning module.

Visual Reasoning

Paper
Add Code

Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs

3 code implementations • CVPR 2018 • Xiaolong Wang, Yufei Ye, Abhinav Gupta

Given a learned knowledge graph (KG), our approach takes as input semantic embeddings for each node (representing visual category).

Knowledge Graphs Zero-Shot Learning

912

Paper
Code

AI2-THOR: An Interactive 3D Environment for Visual AI

2 code implementations • 14 Dec 2017 • Eric Kolve, Roozbeh Mottaghi, Winson Han, Eli VanderBilt, Luca Weihs, Alvaro Herrasti, Matt Deitke, Kiana Ehsani, Daniel Gordon, Yuke Zhu, Aniruddha Kembhavi, Abhinav Gupta, Ali Farhadi

We introduce The House Of inteRactions (THOR), a framework for visual AI research, available at http://ai2thor. allenai. org.

Imitation Learning Navigate +7

1,018

Paper
Code

Learning by Asking Questions

no code implementations • CVPR 2018 • Ishan Misra, Ross Girshick, Rob Fergus, Martial Hebert, Abhinav Gupta, Laurens van der Maaten

We also show that our model asks questions that generalize to state-of-the-art VQA models and to novel test time distributions.

Question Answering Visual Question Answering

Paper
Add Code

Sentiment Classification using Images and Label Embeddings

no code implementations • 3 Dec 2017 • Laura Graesser, Abhinav Gupta, Lakshay Sharma, Evelina Bakhturina

In this project we analysed how much semantic information images carry, and how much value image data can add to sentiment analysis of the text associated with the images.

Classification General Classification +2

Paper
Add Code

Visual Features for Context-Aware Speech Recognition

no code implementations • 1 Dec 2017 • Abhinav Gupta, Yajie Miao, Leonardo Neves, Florian Metze

We are working on a corpus of "how-to" videos from the web, and the idea is that an object that can be seen ("car"), or a scene that is being detected ("kitchen") can be used to condition both models on the "context" of the recording, thereby reducing perplexity and improving transcription.

Language Modelling speech-recognition +1

Paper
Add Code

Non-local Neural Networks

31 code implementations • CVPR 2018 • Xiaolong Wang, Ross Girshick, Abhinav Gupta, Kaiming He

Both convolutional and recurrent operations are building blocks that process one local neighborhood at a time.

Ranked #8 on Action Classification on Toyota Smarthome dataset (using extra training data)

Action Classification Action Recognition +5

26,137

Paper
Code

Learning 6-DOF Grasping Interaction via Deep Geometry-aware 3D Representations

1 code implementation • 24 Aug 2017 • Xinchen Yan, Jasmine Hsu, Mohi Khansari, Yunfei Bai, Arkanath Pathak, Abhinav Gupta, James Davidson, Honglak Lee

Our contributions are fourfold: (1) To best of our knowledge, we are presenting for the first time a method to learn a 6-DOF grasping net from RGBD input; (2) We build a grasping dataset from demonstrations in virtual reality with rich sensory and interaction annotations.

3D Geometry Prediction 3D Shape Modeling +1

Paper
Code

What Actions are Needed for Understanding Human Actions in Videos?

1 code implementation • ICCV 2017 • Gunnar A. Sigurdsson, Olga Russakovsky, Abhinav Gupta

We present the many kinds of information that will be needed to achieve substantial gains in activity understanding: objects, verbs, intent, and sequential reasoning.

Benchmarking

Paper
Code

Transitive Invariance for Self-supervised Visual Representation Learning

no code implementations • ICCV 2017 • Xiaolong Wang, Kaiming He, Abhinav Gupta

The objects are connected by two types of edges which correspond to two types of invariance: "different instances but a similar viewpoint and category" and "different viewpoints of the same instance".

Multi-Task Learning object-detection +4

Paper
Add Code

CASSL: Curriculum Accelerated Self-Supervised Learning

no code implementations • 4 Aug 2017 • Adithyavairavan Murali, Lerrel Pinto, Dhiraj Gandhi, Abhinav Gupta

Recent self-supervised learning approaches focus on using a few thousand data points to learn policies for high-level, low-dimensional action spaces.

Self-Supervised Learning

Paper
Add Code

Combining Keystroke Dynamics and Face Recognition for User Verification

no code implementations • 2 Aug 2017 • Abhinav Gupta, Agrim Khanna, Anmol Jagetia, Devansh Sharma, Sanchit Alekh, Vaibhav Choudhary

Keystroke Dynamics is a novel Biometric Technique; it is not only unobtrusive, but also transparent and inexpensive.

Face Recognition

Paper
Add Code

Temporal Dynamic Graph LSTM for Action-driven Video Object Detection

no code implementations • ICCV 2017 • Yuan Yuan, Xiaodan Liang, Xiaolong Wang, Dit-yan Yeung, Abhinav Gupta

A common issue, however, is that objects of interest that are not involved in human actions are often absent in global action descriptions known as "missing label".

Ranked #3 on Weakly Supervised Object Detection on Charades

Object object-detection +3

Paper
Add Code

Revisiting Unreasonable Effectiveness of Data in Deep Learning Era

2 code implementations • ICCV 2017 • Chen Sun, Abhinav Shrivastava, Saurabh Singh, Abhinav Gupta

What will happen if we increase the dataset size by 10x or 100x?

Ranked #2 on Semantic Segmentation on PASCAL VOC 2007

Image Classification object-detection +4

3,048

Paper
Code

From Red Wine to Red Tomato: Composition With Context

no code implementations • CVPR 2017 • Ishan Misra, Abhinav Gupta, Martial Hebert

In this paper, we present a simple method that respects contextuality in order to compose classifiers of known visual concepts.

Paper
Add Code

Visual Semantic Planning using Deep Successor Representations

no code implementations • ICCV 2017 • Yuke Zhu, Daniel Gordon, Eric Kolve, Dieter Fox, Li Fei-Fei, Abhinav Gupta, Roozbeh Mottaghi, Ali Farhadi

A crucial capability of real-world intelligent agents is their ability to plan a sequence of actions to achieve their goals in the visual world.

Imitation Learning Reinforcement Learning (RL)

Paper
Add Code

WebVision Challenge: Visual Learning and Understanding With Web Data

no code implementations • 16 May 2017 • Wen Li, Li-Min Wang, Wei Li, Eirikur Agustsson, Jesse Berent, Abhinav Gupta, Rahul Sukthankar, Luc van Gool

The 2017 WebVision challenge consists of two tracks, the image classification task on WebVision test set, and the transfer learning task on PASCAL VOC 2012 dataset.

Benchmarking Image Classification +1

Paper
Add Code

The Pose Knows: Video Forecasting by Generating Pose Futures

1 code implementation • ICCV 2017 • Jacob Walker, Kenneth Marino, Abhinav Gupta, Martial Hebert

First we explicitly model the high level structure of active objects in the scene---humans---and use a VAE to model the possible future movements of humans in the pose space.

Ranked #2 on Human Pose Forecasting on Human3.6M (CMD metric)

Human Pose Forecasting Video Prediction

Paper
Code

Learning to Fly by Crashing

1 code implementation • 19 Apr 2017 • Dhiraj Gandhi, Lerrel Pinto, Abhinav Gupta

An alternative is to use simulation.

Navigate

Paper
Code

Spatial Memory for Context Reasoning in Object Detection

36 code implementations • ICCV 2017 • Xinlei Chen, Abhinav Gupta

On the other hand, modeling object-object relationships requires {\bf spatial} reasoning -- not only do we need a memory to store the spatial layout, but also a effective reasoning module to extract spatial patterns.

Object Object Detection

3,642

Paper
Code

What's in a Question: Using Visual Questions as a Form of Supervision

1 code implementation • CVPR 2017 • Siddha Ganju, Olga Russakovsky, Abhinav Gupta

For instance, the question "what is the breed of the dog?"

Data Augmentation Visual Question Answering

Paper
Code

A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection

4 code implementations • CVPR 2017 • Xiaolong Wang, Abhinav Shrivastava, Abhinav Gupta

We propose to learn an adversarial network that generates examples with occlusions and deformations.

Ranked #20 on Object Detection on PASCAL VOC 2007 (using extra training data)

Object object-detection +1

480

Paper
Code

ActionVLAD: Learning spatio-temporal aggregation for action classification

no code implementations • CVPR 2017 • Rohit Girdhar, Deva Ramanan, Abhinav Gupta, Josef Sivic, Bryan Russell

In this work, we introduce a new video representation for action classification that aggregates local convolutional features across the entire spatio-temporal extent of the video.

Ranked #8 on Long-video Activity Recognition on Breakfast

Action Classification Classification +3

Paper
Add Code

Robust Adversarial Reinforcement Learning

6 code implementations • ICML 2017 • Lerrel Pinto, James Davidson, Rahul Sukthankar, Abhinav Gupta

Deep neural networks coupled with fast simulation and improved computation have led to recent successes in the field of reinforcement learning (RL).

Friction reinforcement-learning +1

Paper
Code

PixelNet: Representation of the pixels, by the pixels, and for the pixels

1 code implementation • 21 Feb 2017 • Aayush Bansal, Xinlei Chen, Bryan Russell, Abhinav Gupta, Deva Ramanan

We explore design principles for general pixel-level prediction problems, from low-level edge detection to mid-level surface normal estimation to high-level semantic segmentation.

Edge Detection Segmentation +2

Paper
Code

An Implementation of Faster RCNN with Study for Region Sampling

48 code implementations • 7 Feb 2017 • Xinlei Chen, Abhinav Gupta

We adapted the join-training scheme of Faster RCNN framework from Caffe to TensorFlow as a baseline implementation for object detection.

General Classification Object Detection

3,642

Paper
Code

Learning From Noisy Large-Scale Datasets With Minimal Supervision

no code implementations • CVPR 2017 • Andreas Veit, Neil Alldrin, Gal Chechik, Ivan Krasin, Abhinav Gupta, Serge Belongie

For the small clean set of annotations we use a quarter of the validation set with ~40k images.

Paper
Add Code

Beyond Skip Connections: Top-Down Modulation for Object Detection

1 code implementation • 20 Dec 2016 • Abhinav Shrivastava, Rahul Sukthankar, Jitendra Malik, Abhinav Gupta

But most of these fine details are lost in the early convolutional layers.

Ranked #203 on Object Detection on COCO test-dev

Object object-detection +1

111

Paper
Code

From Images to 3D Shape Attributes

no code implementations • 20 Dec 2016 • David F. Fouhey, Abhinav Gupta, Andrew Zisserman

Our first objective is to infer these 3D shape attributes from a single image.

Paper
Add Code

Asynchronous Temporal Fields for Action Recognition

2 code implementations • CVPR 2017 • Gunnar A. Sigurdsson, Santosh Divvala, Ali Farhadi, Abhinav Gupta

Actions are more than just movements and trajectories: we cook to eat and we hold a cup to drink from it.

Ranked #16 on Action Detection on Charades

Action Classification Action Recognition +3

201

Paper
Code

The More You Know: Using Knowledge Graphs for Image Classification

no code implementations • CVPR 2017 • Kenneth Marino, Ruslan Salakhutdinov, Abhinav Gupta

One characteristic that sets humans apart from modern learning-based computer vision algorithms is the ability to acquire knowledge about the world and use that knowledge to reason about the visual world.

Classification General Classification +3

Paper
Add Code

Supervision via Competition: Robot Adversaries for Learning Tasks

1 code implementation • 5 Oct 2016 • Lerrel Pinto, James Davidson, Abhinav Gupta

Due to large number of experiences required for training, most of these approaches use a self-supervised paradigm: using sensors to measure success/failure.

Paper
Code

Learning to Push by Grasping: Using multiple tasks for effective learning

no code implementations • 28 Sep 2016 • Lerrel Pinto, Abhinav Gupta

The argument of the difficulty in scalability to multiple tasks is well founded, since training these tasks often require hundreds or thousands of examples.

Multi-Task Learning

Paper
Add Code

PixelNet: Towards a General Pixel-level Architecture

no code implementations • 21 Sep 2016 • Aayush Bansal, Xinlei Chen, Bryan Russell, Abhinav Gupta, Deva Ramanan

We explore architectures for general pixel-level prediction problems, from low-level edge detection to mid-level surface normal estimation to high-level semantic segmentation.

Edge Detection Semantic Segmentation +1

Paper
Add Code

Pose from Action: Unsupervised Learning of Pose Features based on Motion

no code implementations • 18 Sep 2016 • Senthil Purushwalkam, Abhinav Gupta

We propose an unsupervised method to learn pose features from videos that exploits a signal which is complementary to appearance and can be used as supervision: motion.

Action Recognition In Videos Optical Flow Estimation +2

Paper
Add Code

Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning

2 code implementations • 16 Sep 2016 • Yuke Zhu, Roozbeh Mottaghi, Eric Kolve, Joseph J. Lim, Abhinav Gupta, Li Fei-Fei, Ali Farhadi

To address the second issue, we propose AI2-THOR framework, which provides an environment with high-quality 3D scenes and physics engine.

3D Reconstruction Feature Engineering +3

Paper
Code

Much Ado About Time: Exhaustive Annotation of Temporal Data

no code implementations • 25 Jul 2016 • Gunnar A. Sigurdsson, Olga Russakovsky, Ali Farhadi, Ivan Laptev, Abhinav Gupta

We conclude that the optimal strategy is to ask as many questions as possible in a HIT (up to 52 binary questions after watching a 30-second video clip in our experiments).

Paper
Add Code

An Uncertain Future: Forecasting from Static Images using Variational Autoencoders

no code implementations • 25 Jun 2016 • Jacob Walker, Carl Doersch, Abhinav Gupta, Martial Hebert

We show that our method is able to successfully predict events in a wide variety of scenes and can produce multiple different predictions when the future is ambiguous.

Paper
Add Code

3D Shape Attributes

no code implementations • CVPR 2016 • David F. Fouhey, Abhinav Gupta, Andrew Zisserman

In this paper we investigate 3D attributes as a means to understand the shape of an object in a single image.

Object

Paper
Add Code

Learning Visual Storylines with Skipping Recurrent Neural Networks

1 code implementation • 14 Apr 2016 • Gunnar A. Sigurdsson, Xinlei Chen, Abhinav Gupta

What does a typical visit to Paris look like?

Paper
Code

Training Region-based Object Detectors with Online Hard Example Mining

5 code implementations • CVPR 2016 • Abhinav Shrivastava, Abhinav Gupta, Ross Girshick

Our motivation is the same as it has always been -- detection datasets contain an overwhelming number of easy examples and a small number of hard examples.

Ranked #6 on Face Verification on Trillion Pairs Dataset

object-detection Object Detection

421

Paper
Code

Cross-stitch Networks for Multi-task Learning

1 code implementation • CVPR 2016 • Ishan Misra, Abhinav Shrivastava, Abhinav Gupta, Martial Hebert

In this paper, we propose a principled approach to learn shared representations in ConvNets using multi-task learning.

Ranked #106 on Semantic Segmentation on NYU Depth v2

Multi-Task Learning Semantic Segmentation

121

Paper
Code

Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding

no code implementations • 6 Apr 2016 • Gunnar A. Sigurdsson, Gül Varol, Xiaolong Wang, Ali Farhadi, Ivan Laptev, Abhinav Gupta

Each video is annotated by multiple free-text descriptions, action labels, action intervals and classes of interacted objects.

Action Recognition Temporal Action Localization

Paper
Add Code

The Curious Robot: Learning Visual Representations via Physical Interactions

no code implementations • 5 Apr 2016 • Lerrel Pinto, Dhiraj Gandhi, Yuanfeng Han, Yong-Lae Park, Abhinav Gupta

We argue that biological agents use physical interactions with the world to learn visual representations unlike current vision systems which just use passive observations (images and videos downloaded from web).

Image Classification Representation Learning +1

Paper
Add Code

Marr Revisited: 2D-3D Alignment via Surface Normal Prediction

no code implementations • CVPR 2016 • Aayush Bansal, Bryan Russell, Abhinav Gupta

We introduce an approach that leverages surface normal predictions, along with appearance cues, to retrieve 3D models for objects depicted in 2D still images from a large CAD object library.

Object Pose Prediction +1

Paper
Add Code

Learning a Predictable and Generative Vector Representation for Objects

2 code implementations • 29 Mar 2016 • Rohit Girdhar, David F. Fouhey, Mikel Rodriguez, Abhinav Gupta

The network consists of two components: (a) an autoencoder that ensures the representation is generative; and (b) a convolutional network that ensures the representation is predictable.

Retrieval

Paper
Code

Generative Image Modeling using Style and Structure Adversarial Networks

no code implementations • 17 Mar 2016 • Xiaolong Wang, Abhinav Gupta

Current generative frameworks use end-to-end learning and generate images by sampling from uniform noise distribution.

Generative Adversarial Network Image Generation

Paper
Add Code

"What happens if..." Learning to Predict the Effect of Forces in Images

no code implementations • 17 Mar 2016 • Roozbeh Mottaghi, Mohammad Rastegari, Abhinav Gupta, Ali Farhadi

To build a dataset of forces in scenes, we reconstructed all images in SUN RGB-D dataset in a physics simulator to estimate the physical movements of objects caused by external forces applied to them.

Paper
Add Code

Actions ~ Transformations

1 code implementation • CVPR 2016 • Xiaolong Wang, Ali Farhadi, Abhinav Gupta

In this paper, we propose a novel representation for actions by modeling an action as a transformation which changes the state of the environment before the action happens (precondition) to the state after the action (effect).

Action Recognition Temporal Action Localization

Paper
Code

Single Image 3D Without a Single 3D Image

no code implementations • ICCV 2015 • David F. Fouhey, Wajahat Hussain, Abhinav Gupta, Martial Hebert

Do we really need 3D labels in order to learn how to predict 3D?

Scene Understanding

Paper
Add Code

Supersizing Self-supervision: Learning to Grasp from 50K Tries and 700 Robot Hours

no code implementations • 23 Sep 2015 • Lerrel Pinto, Abhinav Gupta

Our experiments clearly show the benefit of using large-scale datasets (and multi-stage training) for the task of grasping.

Binary Classification

Paper
Add Code

Sense Discovery via Co-Clustering on Images and Text

no code implementations • CVPR 2015 • Xinlei Chen, Alan Ritter, Abhinav Gupta, Tom Mitchell

We present a co-clustering framework that can be used to discover multiple semantic and visual senses of a given Noun Phrase (NP).

Clustering

Paper
Add Code

Unsupervised Visual Representation Learning by Context Prediction

3 code implementations • ICCV 2015 • Carl Doersch, Abhinav Gupta, Alexei A. Efros

This work explores the use of spatial context as a source of free and plentiful supervisory signal for training a rich visual representation.

Representation Learning

3,081

Paper
Code

Webly Supervised Learning of Convolutional Networks

no code implementations • ICCV 2015 • Xinlei Chen, Abhinav Gupta

Specifically inspired by curriculum learning, we present a two-step approach for CNN training.

Image Retrieval

Paper
Add Code

In Defense of the Direct Perception of Affordances

no code implementations • 5 May 2015 • David F. Fouhey, Xiaolong Wang, Abhinav Gupta

The field of functional recognition or affordance estimation from images has seen a revival in recent years.

Paper
Add Code

Unsupervised Learning of Visual Representations using Videos

no code implementations • ICCV 2015 • Xiaolong Wang, Abhinav Gupta

Is strong supervision necessary for learning a good visual representation?

Surface Normal Estimation Visual Tracking

Paper
Add Code

Dense Optical Flow Prediction from a Static Image

no code implementations • ICCV 2015 • Jacob Walker, Abhinav Gupta, Martial Hebert

Because our CNN model makes no assumptions about the underlying scene, it can predict future optical flow on a diverse set of scenarios.

motion prediction Optical Flow Estimation

Paper
Add Code

Mid-level Elements for Object Detection

no code implementations • 27 Apr 2015 • Aayush Bansal, Abhinav Shrivastava, Carl Doersch, Abhinav Gupta

Building on the success of recent discriminative mid-level elements, we propose a surprisingly simple approach for object detection which performs comparable to the current state-of-the-art approaches on PASCAL VOC comp-3 detection challenge (no external data).

Object object-detection +1

Paper
Add Code

Transferring Rich Feature Hierarchies for Robust Visual Tracking

no code implementations • 19 Jan 2015 • Naiyan Wang, Siyi Li, Abhinav Gupta, Dit-yan Yeung

To fit the characteristics of object tracking, we first pre-train the CNN to recognize what is an object, and then propose to generate a probability map instead of producing a simple class label.

Image Classification Object +4

Paper
Add Code

Designing Deep Networks for Surface Normal Estimation

no code implementations • CVPR 2015 • Xiaolong Wang, David F. Fouhey, Abhinav Gupta

We show by incorporating several constraints (man-made, manhattan world) and meaningful intermediate representations (room layout, edge labels) in the architecture leads to state of the art performance on surface normal estimation.

Scene Understanding Surface Normal Estimation

Paper
Add Code

Enriching Visual Knowledge Bases via Object Discovery and Segmentation

no code implementations • CVPR 2014 • Xinlei Chen, Abhinav Shrivastava, Abhinav Gupta

In this paper, we propose to enrich these knowledge bases by automatically discovering objects and their segmentations from noisy Internet images.

Object Discovery Segmentation

Paper
Add Code

Patch to the Future: Unsupervised Visual Prediction

no code implementations • CVPR 2014 • Jacob Walker, Abhinav Gupta, Martial Hebert

In this paper we present a conceptually simple but surprisingly powerful method for visual prediction which combines the effectiveness of mid-level visual elements with temporal modeling.

Hallucination

Paper
Add Code

Mid-level Visual Element Discovery as Discriminative Mode Seeking

no code implementations • NeurIPS 2013 • Carl Doersch, Abhinav Gupta, Alexei A. Efros

We also propose the Purity-Coverage plot as a principled way of experimentally analyzing and evaluating different visual discovery approaches, and compare our method against prior work on the Paris Street View dataset.

Scene Classification

Paper
Add Code

Representing Videos Using Mid-level Discriminative Patches

no code implementations • CVPR 2013 • Arpit Jain, Abhinav Gupta, Mikel Rodriguez, Larry S. Davis

representation for videos based on mid-level discriminative spatio-temporal patches.

Action Classification General Classification

Paper
Add Code

Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces

no code implementations • NeurIPS 2010 • Abhinav Gupta, Martial Hebert, Takeo Kanade, David M. Blei

There has been a recent push in extraction of 3D spatial layout of scenes.

Structured Prediction

Paper
Add Code

A ``Shape Aware'' Model for semi-supervised Learning of Objects and its Context

no code implementations • NeurIPS 2008 • Abhinav Gupta, Jianbo Shi, Larry S. Davis

Using an analogous reasoning, we present an approach that combines bag-of-words and spatial models to perform semantic and syntactic analysis for recognition of an object based on its internal appearance and its context.

Object Object Recognition

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.