no code implementations • ICLR 2019 • Deepak Pathak, Dhiraj Gandhi, Abhinav Gupta
But most importantly, we are able to implement an exploration policy on a robot which learns to interact with objects completely from scratch just using data collected via the differentiable exploration module.
no code implementations • 11 Sep 2023 • Yufei Ye, Poorvi Hebbar, Abhinav Gupta, Shubham Tulsiani
We tackle the task of reconstructing hand-object interactions from short video clips.
no code implementations • 5 Sep 2023 • Homanga Bharadhwaj, Jay Vakil, Mohit Sharma, Abhinav Gupta, Shubham Tulsiani, Vikash Kumar
The grand aim of having a single robot that can manipulate arbitrary objects in diverse settings is at odds with the paucity of robotics datasets.
no code implementations • 4 Jun 2023 • Sam Powers, Abhinav Gupta, Chris Paxton
Robots in home environments need to be able to learn new skills continuously as data becomes available, becoming ever more capable over time while using as little real-world data as possible.
1 code implementation • 1 Jun 2023 • Gaoyue Zhou, Victoria Dean, Mohan Kumar Srirama, Aravind Rajeswaran, Jyothish Pari, Kyle Hatch, Aryan Jain, Tianhe Yu, Pieter Abbeel, Lerrel Pinto, Chelsea Finn, Abhinav Gupta
Three challenges limit the progress of robot learning research: robots are expensive (few labs can participate), everyone uses different robots (findings do not generalize across labs), and we lack internet-scale robotics data.
no code implementations • 28 May 2023 • Homanga Bharadhwaj, Abhinav Gupta, Shubham Tulsiani
Motivated by the intuitive understanding humans have about the space of possible interactions, and the ease with which they can generalize this understanding to previously unseen scenes, we develop an approach for learning visual affordances for guiding robot exploration.
no code implementations • 1 May 2023 • Qiuyuan Huang, Jae Sung Park, Abhinav Gupta, Paul Bennett, Ran Gong, Subhojit Som, Baolin Peng, Owais Khan Mohammed, Chris Pal, Yejin Choi, Jianfeng Gao
In this study, we develop an infinite agent that learns to transfer knowledge memory from general foundation models (e. g. GPT4, DALLE) to novel domains or scenarios for scene understanding and generation in the physical or virtual world.
no code implementations • CVPR 2023 • Yufei Ye, Xueting Li, Abhinav Gupta, Shalini De Mello, Stan Birchfield, Jiaming Song, Shubham Tulsiani, Sifei Liu
In contrast, in this work we focus on synthesizing complex interactions (ie, an articulated hand) with a given object.
no code implementations • 3 Feb 2023 • Homanga Bharadhwaj, Abhinav Gupta, Shubham Tulsiani, Vikash Kumar
Can we learn robot manipulation for everyday tasks, only by watching videos of humans doing arbitrary tasks in different unstructured settings?
1 code implementation • 15 Jan 2023 • Abhinav Gupta, Pierre F. J. Lermusiaux
Improving the predictive capability and computational cost of dynamical models is often at the heart of augmenting computational physics with machine learning (ML).
1 code implementation • 31 Dec 2022 • Sam Powers, Eliot Xing, Abhinav Gupta
The ability for an agent to continuously learn new skills without catastrophically forgetting existing knowledge is of critical importance for the development of generally intelligent agents.
1 code implementation • 21 Nov 2022 • Justin Wasserman, Karmesh Yadav, Girish Chowdhary, Abhinav Gupta, Unnat Jain
Realistic long-horizon tasks like image-goal navigation involve exploratory and exploitative phases.
no code implementations • 12 Nov 2022 • Abhinav Gupta, Pierre F. J. Lermusiaux
We develop a Bayesian model learning methodology that allows interpolation in the space of candidate models and discovery of new models from noisy, sparse, and indirect observations, all while estimating state fields and parameter values, as well as the joint PDFs of all learned quantities.
no code implementations • 27 Oct 2022 • Raunaq Bhirangi, Abigail DeFranco, Jacob Adkins, Carmel Majidi, Abhinav Gupta, Tess Hellebrekers, Vikash Kumar
High cost and lack of reliability has precluded the widespread adoption of dexterous hands in robotics.
no code implementations • 12 Oct 2022 • Gaoyue Zhou, Liyiming Ke, Siddhartha Srinivasa, Abhinav Gupta, Aravind Rajeswaran, Vikash Kumar
Offline reinforcement learning (ORL) holds great promise for robot learning due to its ability to learn from arbitrary pre-generated experience.
1 code implementation • 27 Sep 2022 • Himangi Mittal, Pedro Morgado, Unnat Jain, Abhinav Gupta
However, learning representations from videos can be challenging.
Ranked #1 on
Long Term Action Anticipation
on Ego4D
(ED@20 Noun metric)
1 code implementation • 22 Sep 2022 • Sudeep Dasari, Abhinav Gupta, Vikash Kumar
This paper seeks to escape these constraints, by developing a Pre-Grasp informed Dexterous Manipulation (PGDM) framework that generates diverse dexterous manipulation behaviors, without any task-specific reasoning or hyper-parameter tuning.
no code implementations • 19 Jul 2022 • Shikhar Bahl, Abhinav Gupta, Deepak Pathak
We approach the problem of learning by watching humans in the wild.
no code implementations • 23 Apr 2022 • Yuchen Cui, Scott Niekum, Abhinav Gupta, Vikash Kumar, Aravind Rajeswaran
Task specification is at the core of programming autonomous robots.
1 code implementation • CVPR 2022 • Yufei Ye, Abhinav Gupta, Shubham Tulsiani
Our work aims to reconstruct hand-held objects given a single RGB image.
no code implementations • CVPR 2022 • Kalyan Vasudev Alwala, Abhinav Gupta, Shubham Tulsiani
Our final 3D reconstruction model is also capable of zero-shot inference on images from unseen object categories and we empirically show that increasing the number of training categories improves the reconstruction quality.
no code implementations • 23 Mar 2022 • Senthil Purushwalkam, Pedro Morgado, Abhinav Gupta
As a result, SSL holds the promise to learn representations from data in-the-wild, i. e., without the need for finite and static datasets.
1 code implementation • 23 Mar 2022 • Suraj Nair, Aravind Rajeswaran, Vikash Kumar, Chelsea Finn, Abhinav Gupta
We study how visual representations pre-trained on diverse human video data can enable data-efficient learning of downstream robotic manipulation tasks.
no code implementations • 7 Mar 2022 • Simone Parisi, Aravind Rajeswaran, Senthil Purushwalkam, Abhinav Gupta
In this context, we revisit and study the role of pre-trained visual representations for control, and in particular representations trained on large-scale computer vision datasets.
1 code implementation • NeurIPS 2021 • Simone Parisi, Victoria Dean, Deepak Pathak, Abhinav Gupta
In this setup, the agent first learns to explore across many environments without any extrinsic goal in a task-agnostic manner.
1 code implementation • 9 Nov 2021 • Bernardo Aceituno, Alberto Rodriguez, Shubham Tulsiani, Abhinav Gupta, Mustafa Mukadam
Specifying tasks with videos is a powerful technique towards acquiring novel and general robot skills.
1 code implementation • 29 Oct 2021 • Raunaq Bhirangi, Tess Hellebrekers, Carmel Majidi, Abhinav Gupta
Soft sensors have continued growing interest in robotics, due to their ability to enable both passive conformal contact from the material properties and active contact data from the sensor properties.
no code implementations • NeurIPS 2021 • Abhinav Gupta, Marc Lanctot, Angeliki Lazaridou
In this work, our goal is to train agents that can coordinate with seen, unseen as well as human partners in a multi-agent communication environment involving natural language.
2 code implementations • 19 Oct 2021 • Sam Powers, Eliot Xing, Eric Kolve, Roozbeh Mottaghi, Abhinav Gupta
In this work, we present CORA, a platform for Continual Reinforcement Learning Agents that provides benchmarks, baselines, and metrics in a single code package.
1 code implementation • NeurIPS 2021 • Meera Hahn, Devendra Chaplot, Shubham Tulsiani, Mustafa Mukadam, James M. Rehg, Abhinav Gupta
Most prior methods for learning navigation policies require access to simulation environments, as they need online policy interaction and rely on ground-truth maps for rewards.
1 code implementation • 6 Oct 2021 • Jikun Kang, Miao Liu, Abhinav Gupta, Chris Pal, Xue Liu, Jie Fu
Various automatic curriculum learning (ACL) methods have been proposed to improve the sample efficiency and final performance of deep reinforcement learning (DRL).
no code implementations • ICCV 2021 • Zihang Lai, Senthil Purushwalkam, Abhinav Gupta
For example, what are the correspondences between a bottle and shoe for the task of pounding or the task of pouring.
1 code implementation • ICCV 2021 • Jianren Wang, Xin Wang, Yue Shang-Guan, Abhinav Gupta
To bridge the gap, we present a new online continual object detection benchmark with an egocentric video dataset, Objects Around Krishna (OAK).
no code implementations • 12 Jul 2021 • Shikhar Bahl, Abhinav Gupta, Deepak Pathak
We tackle the problem of generalization to unseen configurations for dynamic tasks in the real world while learning from high-dimensional image input.
no code implementations • 23 May 2021 • Linyu Lin, Paridhi Athe, Pascal Rouxelin, Maria Avramova, Abhinav Gupta, Robert Youngblood, Nam Dinh
The Nearly Autonomous Management and Control System (NAMAC) is a comprehensive control system that assists plant operations by furnishing control recommendations to operators in a broad class of situations.
no code implementations • 29 Mar 2021 • Shubham Tulsiani, Abhinav Gupta
We propose a generative model that can infer a distribution for the underlying spatial signal conditioned on sparse samples e. g. plausible images given a few observed pixels.
1 code implementation • ICCV 2021 • James Herman, Jonathan Francis, Siddha Ganju, Bingqing Chen, Anirudh Koul, Abhinav Gupta, Alexey Skabelkin, Ivan Zhukov, Max Kumskoy, Eric Nyberg
Existing research on autonomous driving primarily focuses on urban driving, which is insufficient for characterising the complex driving behaviour underlying high-speed racing.
no code implementations • ICLR Workshop Learning_to_Learn 2021 • Abhinav Gupta, Angeliki Lazaridou, Marc Lanctot
Recent works have shown remarkable progress in training artificial agents to understand natural language but are focused on using large amounts of raw data involving huge compute requirements.
1 code implementation • CVPR 2021 • Yufei Ye, Shubham Tulsiani, Abhinav Gupta
We first infer a volumetric representation in a canonical frame, along with the camera pose.
1 code implementation • 25 Jan 2021 • Anurag Pratik, Soumith Chintala, Kavya Srinet, Dhiraj Gandhi, Rebecca Qian, Yuxuan Sun, Ryan Drew, Sara Elkafrawy, Anoushka Tiwari, Tucker Hart, Mary Williamson, Abhinav Gupta, Arthur Szlam
In recent years, there have been significant advances in building end-to-end Machine Learning (ML) systems that learn at scale.
1 code implementation • ICCV 2021 • Kaichun Mo, Leonidas Guibas, Mustafa Mukadam, Abhinav Gupta, Shubham Tulsiani
One of the fundamental goals of visual perception is to allow agents to meaningfully interact with their environment.
1 code implementation • ICCV 2021 • Senthil Purushwalkam, Sebastian Vicenc Amengual Gari, Vamsi Krishna Ithapu, Carl Schissler, Philip Robinson, Abhinav Gupta, Kristen Grauman
Given only a few glimpses of an environment, how much can we infer about its entire floorplan?
1 code implementation • 27 Dec 2020 • Abhinav Gupta, Pierre F. J. Lermusiaux
The new "neural closure models" augment low-fidelity models with neural delay differential equations (nDDEs), motivated by the Mori-Zwanzig formulation and the inherent delays in complex dynamical systems.
no code implementations • CVPR 2021 • Kenneth Marino, Xinlei Chen, Devi Parikh, Abhinav Gupta, Marcus Rohrbach
One of the most challenging question types in VQA is when answering the question requires outside knowledge not present in the image.
Ranked #4 on
Visual Question Answering (VQA)
on A-OKVQA
1 code implementation • 15 Dec 2020 • Abhinav Gupta, Rajib Chowdhury, Anupam Chakrabarti, Timon Rabczuk
This paper presents a 55-line code written in python for 2D and 3D topology optimization (TO) based on the open-source finite element computing software (FEniCS), equipped with various finite element tools and solvers.
Mathematical Software Computational Engineering, Finance, and Science Optimization and Control
no code implementations • NeurIPS 2020 • Shikhar Bahl, Mustafa Mukadam, Abhinav Gupta, Deepak Pathak
We show that NDPs outperform the prior state-of-the-art in terms of either efficiency or performance across several robotic control tasks for both imitation and reinforcement learning setups.
Ranked #4 on
Meta-Learning
on MT50
1 code implementation • 12 Nov 2020 • Adithyavairavan Murali, Weiyu Liu, Kenneth Marino, Sonia Chernova, Abhinav Gupta
This is largely due to the scale of the datasets both in terms of the number of objects and tasks studied.
no code implementations • 11 Nov 2020 • Sudeep Dasari, Abhinav Gupta
Humans are able to seamlessly visually imitate others, by inferring their intentions and using past experience to achieve the same end goal.
1 code implementation • ICLR 2021 • Valerie Chen, Abhinav Gupta, Kenneth Marino
We also find that incorporating natural language allows the model to generalize to unseen tasks in a zero-shot setting and to learn quickly from a few demonstrations.
no code implementations • 29 Aug 2020 • Linyu Lin, Paridhi Athe, Pascal Rouxelin, Robert Youngblood, Abhinav Gupta, Jeffrey Lane, Maria Avramova, Nam Dinh
This paper develops a Nearly Autonomous Management and Control (NAMAC) system for advanced reactors.
no code implementations • 11 Aug 2020 • Sarah Young, Dhiraj Gandhi, Shubham Tulsiani, Abhinav Gupta, Pieter Abbeel, Lerrel Pinto
We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.
no code implementations • NeurIPS 2020 • Senthil Purushwalkam, Abhinav Gupta
Second, we demonstrate that these approaches obtain further gains from access to a clean object-centric training dataset like Imagenet.
no code implementations • 16 Jul 2020 • Shubham Tulsiani, Nilesh Kulkarni, Abhinav Gupta
We present an approach to infer the 3D shape, texture, and camera pose for an object from a single RGB image, using only category-level image collections with foreground masks as supervision.
no code implementations • ECCV 2020 • Senthil Purushwalkam, Tian Ye, Saurabh Gupta, Abhinav Gupta
During training, given a pair of videos, we compute cycles that connect patches in a given frame in the first video by matching through frames in the second video.
1 code implementation • NeurIPS 2020 • Victoria Dean, Shubham Tulsiani, Abhinav Gupta
Exploration is one of the core challenges in reinforcement learning.
no code implementations • 3 Jul 2020 • Dhiraj Gandhi, Abhinav Gupta, Lerrel Pinto
In this work, we perform the first large-scale study of the interactions between sound and robotic action.
2 code implementations • NeurIPS 2020 • Devendra Singh Chaplot, Dhiraj Gandhi, Abhinav Gupta, Ruslan Salakhutdinov
We propose a modular system called, `Goal-Oriented Semantic Exploration' which builds an episodic semantic map and uses it to explore the environment efficiently based on the goal object category.
Ranked #4 on
Robot Navigation
on Habitat 2020 Object Nav test-std
no code implementations • WS 2020 • Abhinav Gupta, Cinjon Resnick, Jakob Foerster, Andrew Dai, Kyunghyun Cho
Our hypothesis is that there should be a specific range of model capacity and channel bandwidth that induces compositional structure in the resulting language and consequently encourages systematic generalization.
no code implementations • 29 Jun 2020 • Kenneth Marino, Rob Fergus, Arthur Szlam, Abhinav Gupta
This paper formulates hypothesis verification as an RL problem.
no code implementations • ICML 2020 • Tanmay Shankar, Abhinav Gupta
In this paper, we address the discovery of robotic options from demonstrations in an unsupervised manner.
no code implementations • ECCV 2020 • Devendra Singh Chaplot, Helen Jiang, Saurabh Gupta, Abhinav Gupta
Instead, we explore a self-supervised approach for training our exploration policy by introducing a notion of semantic curiosity.
no code implementations • CVPR 2020 • Devendra Singh Chaplot, Ruslan Salakhutdinov, Abhinav Gupta, Saurabh Gupta
This paper studies the problem of image-goal navigation which involves navigating to the location indicated by a goal image in a novel previously unseen environment.
2 code implementations • ICLR 2020 • Devendra Singh Chaplot, Dhiraj Gandhi, Saurabh Gupta, Abhinav Gupta, Ruslan Salakhutdinov
The use of learning provides flexibility with respect to input modalities (in the SLAM module), leverages structural regularities of the world (in global policies), and provides robustness to errors in state estimation (in local policies).
1 code implementation • CVPR 2020 • Nilesh Kulkarni, Abhinav Gupta, David F. Fouhey, Shubham Tulsiani
We tackle the tasks of: 1) predicting a Canonical Surface Mapping (CSM) that indicates the mapping from 2D pixels to corresponding points on a canonical template shape, and 2) inferring the articulation and pose of the template corresponding to the input image.
2 code implementations • CVPR 2020 • Kiana Ehsani, Shubham Tulsiani, Saurabh Gupta, Ali Farhadi, Abhinav Gupta
Our quantitative and qualitative results show that (a) we can predict meaningful forces from videos whose effects lead to accurate imitation of the motions observed, (b) by jointly optimizing for contact point and force prediction, we can improve the performance on both tasks in comparison to independent training, and (c) we can learn a representation from this model that generalizes to novel objects using few shot examples.
no code implementations • 12 Mar 2020 • Gunnar A. Sigurdsson, Abhinav Gupta, Cordelia Schmid, Karteek Alahari
Eye movement and strategic placement of the visual field onto the retina, gives animals increased resolution of the scene and suppresses distracting information.
no code implementations • ICLR 2020 • Rohan Chitnis, Shubham Tulsiani, Saurabh Gupta, Abhinav Gupta
Our key idea is that a good guiding principle for intrinsic motivation in synergistic tasks is to take actions which affect the world in ways that would not be achieved if the agents were acting on their own.
1 code implementation • ICLR 2020 • Ryan Lowe, Abhinav Gupta, Jakob Foerster, Douwe Kiela, Joelle Pineau
A promising approach for teaching artificial agents to use natural language involves using human-in-the-loop training.
no code implementations • 4 Feb 2020 • Agnieszka Słowik, Abhinav Gupta, William L. Hamilton, Mateja Jamnik, Sean B. Holden, Christopher Pal
In order to communicate, humans flatten a complex representation of ideas and their attributes into a single word or a sentence.
no code implementations • 24 Jan 2020 • Agnieszka Słowik, Abhinav Gupta, William L. Hamilton, Mateja Jamnik, Sean B. Holden
Recent findings in neuroscience suggest that the human brain represents information in a geometric structure (for instance, through conceptual spaces).
no code implementations • ICLR 2020 • Tanmay Shankar, Shubham Tulsiani, Lerrel Pinto, Abhinav Gupta
In this paper, we present an approach to learn recomposable motor primitives across large-scale and diverse manipulation demonstrations.
1 code implementation • CVPR 2020 • Xueting Yan, Ishan Misra, Abhinav Gupta, Deepti Ghadiyaram, Dhruv Mahajan
Pre-training convolutional neural networks with weakly-supervised and self-supervised strategies is becoming increasingly popular for several computer vision tasks.
Ranked #52 on
Image Classification
on iNaturalist 2018
1 code implementation • NeurIPS 2019 • Pratyusha Sharma, Deepak Pathak, Abhinav Gupta
We study a generalized setup for learning from demonstration to build an agent that can manipulate novel objects in unseen scenarios by looking at only a single video of human demonstration from a third-person perspective.
no code implementations • WS 2019 • Abhinav Gupta, Ryan Lowe, Jakob Foerster, Douwe Kiela, Joelle Pineau
Once the meta-learning agent is able to quickly adapt to each population of agents, it can be deployed in new populations, including populations speaking human language.
1 code implementation • 24 Oct 2019 • Cinjon Resnick, Abhinav Gupta, Jakob Foerster, Andrew M. Dai, Kyunghyun Cho
In this paper, we investigate the learning biases that affect the efficacy and compositionality of emergent languages.
1 code implementation • 8 Oct 2019 • Yufei Ye, Dhiraj Gandhi, Abhinav Gupta, Shubham Tulsiani
We present an approach to learn an object-centric forward model, and show that this allows us to plan for sequences of actions to achieve distant desired goals.
no code implementations • 30 Sep 2019 • Rohan Chitnis, Shubham Tulsiani, Saurabh Gupta, Abhinav Gupta
Our insight is that for many tasks, the learning process can be decomposed into learning a state-independent task schema (a sequence of skills to execute) and a policy to choose the parameterizations of the skills in a state-dependent manner.
no code implementations • 25 Sep 2019 • Kenneth Marino, Rob Fergus, Arthur Szlam, Abhinav Gupta
In order to train the agents, we exploit the underlying structure in the majority of hypotheses -- they can be formulated as triplets (pre-condition, action sequence, post-condition).
no code implementations • 25 Sep 2019 • Dhiraj Gandhi, Abhinav Gupta, Lerrel Pinto
In this work, we perform the first large-scale study of the interactions between sound and robotic action.
2 code implementations • ICLR 2020 • William Whitney, Rajat Agarwal, Kyunghyun Cho, Abhinav Gupta
In this paper we consider self-supervised representation learning to improve sample efficiency in reinforcement learning (RL).
2 code implementations • ICCV 2019 • Yufei Ye, Maneesh Singh, Abhinav Gupta, Shubham Tulsiani
We present an approach for pixel-level future prediction given an input image of a scene.
1 code implementation • ICLR 2019 • Wenxuan Zhou, Lerrel Pinto, Abhinav Gupta
A key challenge in reinforcement learning (RL) is environment generalization: a policy trained to solve a task in one environment often fails to solve the same task in a slightly different test environment.
1 code implementation • ICCV 2019 • Nilesh Kulkarni, Abhinav Gupta, Shubham Tulsiani
We explore the task of Canonical Surface Mapping (CSM).
2 code implementations • 19 Jun 2019 • Adithyavairavan Murali, Tao Chen, Kalyan Vasudev Alwala, Dhiraj Gandhi, Lerrel Pinto, Saurabh Gupta, Abhinav Gupta
This paper introduces PyRobot, an open-source robotics framework for research and benchmarking.
1 code implementation • 10 Jun 2019 • Deepak Pathak, Dhiraj Gandhi, Abhinav Gupta
In this paper, we propose a formulation for exploration inspired by the work in active learning literature.
no code implementations • ICCV 2019 • Nilesh Kulkarni, Ishan Misra, Shubham Tulsiani, Abhinav Gupta
We propose an approach to predict the 3D shape and pose for the objects present in a scene.
1 code implementation • ICCV 2019 • Senthil Purushwalkam, Maximilian Nickel, Abhinav Gupta, Marc'Aurelio Ranzato
When extending the evaluation to the generalized setting which accounts also for pairs seen during training, we discover that naive baseline methods perform similarly or better than current approaches.
2 code implementations • ICCV 2019 • Priya Goyal, Dhruv Mahajan, Abhinav Gupta, Ishan Misra
Self-supervised learning aims to learn representations from the data itself without explicit manual supervision.
no code implementations • ICLR 2019 • Kenneth Marino, Abhinav Gupta, Rob Fergus, Arthur Szlam
The high-level policy is trained using a sparse, task-dependent reward, and operates by choosing which of the low-level policies to run at any given time.
no code implementations • ICLR 2019 • Senthil Purushwalkam, Abhinav Gupta, Danny M. Kaufman, Bryan Russell
To achieve our results, we introduce the Bounce Dataset comprising 5K RGB-D videos of bouncing trajectories of a foam ball to probe surfaces of varying shapes and materials in everyday scenes including homes and offices.
2 code implementations • ICLR 2019 • Tao Chen, Saurabh Gupta, Abhinav Gupta
Numerous past works have tackled the problem of task-driven navigation.
no code implementations • NeurIPS 2018 • Yin Li, Abhinav Gupta
Our method further learns to propagate information across all vertices on the graph, and is able to project the learned graph representation back into 2D grids.
1 code implementation • NeurIPS 2018 • Tao Chen, Adithyavairavan Murali, Abhinav Gupta
In tasks where knowing the agent dynamics is important for success, we learn an embedding for robot hardware and show that policies conditioned on the encoding of hardware tend to generalize and transfer well.
no code implementations • 16 Oct 2018 • Pratyusha Sharma, Lekha Mohan, Lerrel Pinto, Abhinav Gupta
In order to make progress and capture the space of manipulation, we would need to collect a large-scale dataset of diverse tasks such as pouring, opening bottles, stacking objects etc.
1 code implementation • ICLR 2019 • Wei Yang, Xiaolong Wang, Ali Farhadi, Abhinav Gupta, Roozbeh Mottaghi
Do we use the semantic/functional priors we have built over years to efficiently search and navigate?
3 code implementations • 5 Sep 2018 • Nadine Chang, John A. Pyles, Abhinav Gupta, Michael J. Tarr, Elissa M. Aminoff
Vision science, particularly machine vision, has been revolutionized by introducing large-scale image datasets and statistical learning approaches.
no code implementations • ECCV 2018 • Keizo Kato, Yin Li, Abhinav Gupta
The world of human-object interactions is rich.
no code implementations • ECCV 2018 • Tian Ye, Xiaolong Wang, James Davidson, Abhinav Gupta
In order to demonstrate that our system models these underlying physical properties, we train our model on collisions of different shapes (cube, cone, cylinder, spheres etc.)
no code implementations • NeurIPS 2018 • Abhinav Gupta, Adithyavairavan Murali, Dhiraj Gandhi, Lerrel Pinto
The models trained with our home dataset showed a marked improvement of 43. 7% over a baseline model trained with data collected in lab.
no code implementations • ECCV 2018 • Xiaolong Wang, Abhinav Gupta
These nodes are connected by two types of relations: (i) similarity relations capturing the long range dependencies between correlated objects and (ii) spatial-temporal relations capturing the interactions between nearby objects.
Ranked #31 on
Action Classification
on Charades
(using extra training data)
no code implementations • 10 May 2018 • Adithyavairavan Murali, Yin Li, Dhiraj Gandhi, Abhinav Gupta
We believe this is the first attempt at learning to grasp with only tactile sensing and without any prior object knowledge.
no code implementations • 25 Apr 2018 • Gunnar A. Sigurdsson, Abhinav Gupta, Cordelia Schmid, Ali Farhadi, Karteek Alahari
In this paper we describe the egocentric aspect of the dataset and present annotations for Charades-Ego with 68, 536 activity instances in 68. 8 hours of first and third-person video, making it one of the largest and most diverse egocentric datasets available.
1 code implementation • CVPR 2018 • Gunnar A. Sigurdsson, Abhinav Gupta, Cordelia Schmid, Ali Farhadi, Karteek Alahari
Several theories in cognitive neuroscience suggest that when people interact with the world, or simulate interactions, they do so from a first-person egocentric perspective, and seamlessly transfer knowledge between third-person (observer) and first-person (actor).
no code implementations • CVPR 2017 • Xiaolong Wang, Rohit Girdhar, Abhinav Gupta
In this paper, we tackle the challenge of creating one of the biggest dataset for learning affordances.
no code implementations • CVPR 2018 • Xinlei Chen, Li-Jia Li, Li Fei-Fei, Abhinav Gupta
The framework consists of two core modules: a local module that uses spatial memory to store previous beliefs with parallel updates; and a global graph-reasoning module.
3 code implementations • CVPR 2018 • Xiaolong Wang, Yufei Ye, Abhinav Gupta
Given a learned knowledge graph (KG), our approach takes as input semantic embeddings for each node (representing visual category).
2 code implementations • 14 Dec 2017 • Eric Kolve, Roozbeh Mottaghi, Winson Han, Eli VanderBilt, Luca Weihs, Alvaro Herrasti, Matt Deitke, Kiana Ehsani, Daniel Gordon, Yuke Zhu, Aniruddha Kembhavi, Abhinav Gupta, Ali Farhadi
We introduce The House Of inteRactions (THOR), a framework for visual AI research, available at http://ai2thor. allenai. org.
no code implementations • CVPR 2018 • Ishan Misra, Ross Girshick, Rob Fergus, Martial Hebert, Abhinav Gupta, Laurens van der Maaten
We also show that our model asks questions that generalize to state-of-the-art VQA models and to novel test time distributions.
no code implementations • 3 Dec 2017 • Laura Graesser, Abhinav Gupta, Lakshay Sharma, Evelina Bakhturina
In this project we analysed how much semantic information images carry, and how much value image data can add to sentiment analysis of the text associated with the images.
no code implementations • 1 Dec 2017 • Abhinav Gupta, Yajie Miao, Leonardo Neves, Florian Metze
We are working on a corpus of "how-to" videos from the web, and the idea is that an object that can be seen ("car"), or a scene that is being detected ("kitchen") can be used to condition both models on the "context" of the recording, thereby reducing perplexity and improving transcription.
27 code implementations • CVPR 2018 • Xiaolong Wang, Ross Girshick, Abhinav Gupta, Kaiming He
Both convolutional and recurrent operations are building blocks that process one local neighborhood at a time.
Ranked #8 on
Action Classification
on Toyota Smarthome dataset
(using extra training data)
1 code implementation • 24 Aug 2017 • Xinchen Yan, Jasmine Hsu, Mohi Khansari, Yunfei Bai, Arkanath Pathak, Abhinav Gupta, James Davidson, Honglak Lee
Our contributions are fourfold: (1) To best of our knowledge, we are presenting for the first time a method to learn a 6-DOF grasping net from RGBD input; (2) We build a grasping dataset from demonstrations in virtual reality with rich sensory and interaction annotations.
1 code implementation • ICCV 2017 • Gunnar A. Sigurdsson, Olga Russakovsky, Abhinav Gupta
We present the many kinds of information that will be needed to achieve substantial gains in activity understanding: objects, verbs, intent, and sequential reasoning.
no code implementations • ICCV 2017 • Xiaolong Wang, Kaiming He, Abhinav Gupta
The objects are connected by two types of edges which correspond to two types of invariance: "different instances but a similar viewpoint and category" and "different viewpoints of the same instance".
no code implementations • 4 Aug 2017 • Adithyavairavan Murali, Lerrel Pinto, Dhiraj Gandhi, Abhinav Gupta
Recent self-supervised learning approaches focus on using a few thousand data points to learn policies for high-level, low-dimensional action spaces.
no code implementations • 2 Aug 2017 • Abhinav Gupta, Agrim Khanna, Anmol Jagetia, Devansh Sharma, Sanchit Alekh, Vaibhav Choudhary
Keystroke Dynamics is a novel Biometric Technique; it is not only unobtrusive, but also transparent and inexpensive.
no code implementations • ICCV 2017 • Yuan Yuan, Xiaodan Liang, Xiaolong Wang, Dit-yan Yeung, Abhinav Gupta
A common issue, however, is that objects of interest that are not involved in human actions are often absent in global action descriptions known as "missing label".
Ranked #3 on
Weakly Supervised Object Detection
on Charades
2 code implementations • ICCV 2017 • Chen Sun, Abhinav Shrivastava, Saurabh Singh, Abhinav Gupta
What will happen if we increase the dataset size by 10x or 100x?
Ranked #2 on
Semantic Segmentation
on PASCAL VOC 2007
no code implementations • CVPR 2017 • Ishan Misra, Abhinav Gupta, Martial Hebert
In this paper, we present a simple method that respects contextuality in order to compose classifiers of known visual concepts.
no code implementations • ICCV 2017 • Yuke Zhu, Daniel Gordon, Eric Kolve, Dieter Fox, Li Fei-Fei, Abhinav Gupta, Roozbeh Mottaghi, Ali Farhadi
A crucial capability of real-world intelligent agents is their ability to plan a sequence of actions to achieve their goals in the visual world.
no code implementations • 16 May 2017 • Wen Li, Li-Min Wang, Wei Li, Eirikur Agustsson, Jesse Berent, Abhinav Gupta, Rahul Sukthankar, Luc van Gool
The 2017 WebVision challenge consists of two tracks, the image classification task on WebVision test set, and the transfer learning task on PASCAL VOC 2012 dataset.
1 code implementation • ICCV 2017 • Jacob Walker, Kenneth Marino, Abhinav Gupta, Martial Hebert
First we explicitly model the high level structure of active objects in the scene---humans---and use a VAE to model the possible future movements of humans in the pose space.
Ranked #2 on
Human Pose Forecasting
on Human3.6M
(CMD metric)
1 code implementation • 19 Apr 2017 • Dhiraj Gandhi, Lerrel Pinto, Abhinav Gupta
An alternative is to use simulation.
36 code implementations • ICCV 2017 • Xinlei Chen, Abhinav Gupta
On the other hand, modeling object-object relationships requires {\bf spatial} reasoning -- not only do we need a memory to store the spatial layout, but also a effective reasoning module to extract spatial patterns.
1 code implementation • CVPR 2017 • Siddha Ganju, Olga Russakovsky, Abhinav Gupta
For instance, the question "what is the breed of the dog?"
4 code implementations • CVPR 2017 • Xiaolong Wang, Abhinav Shrivastava, Abhinav Gupta
We propose to learn an adversarial network that generates examples with occlusions and deformations.
Ranked #19 on
Object Detection
on PASCAL VOC 2007
(using extra training data)
no code implementations • CVPR 2017 • Rohit Girdhar, Deva Ramanan, Abhinav Gupta, Josef Sivic, Bryan Russell
In this work, we introduce a new video representation for action classification that aggregates local convolutional features across the entire spatio-temporal extent of the video.
6 code implementations • ICML 2017 • Lerrel Pinto, James Davidson, Rahul Sukthankar, Abhinav Gupta
Deep neural networks coupled with fast simulation and improved computation have led to recent successes in the field of reinforcement learning (RL).
1 code implementation • 21 Feb 2017 • Aayush Bansal, Xinlei Chen, Bryan Russell, Abhinav Gupta, Deva Ramanan
We explore design principles for general pixel-level prediction problems, from low-level edge detection to mid-level surface normal estimation to high-level semantic segmentation.
47 code implementations • 7 Feb 2017 • Xinlei Chen, Abhinav Gupta
We adapted the join-training scheme of Faster RCNN framework from Caffe to TensorFlow as a baseline implementation for object detection.
no code implementations • CVPR 2017 • Andreas Veit, Neil Alldrin, Gal Chechik, Ivan Krasin, Abhinav Gupta, Serge Belongie
For the small clean set of annotations we use a quarter of the validation set with ~40k images.
no code implementations • 20 Dec 2016 • David F. Fouhey, Abhinav Gupta, Andrew Zisserman
Our first objective is to infer these 3D shape attributes from a single image.
1 code implementation • 20 Dec 2016 • Abhinav Shrivastava, Rahul Sukthankar, Jitendra Malik, Abhinav Gupta
But most of these fine details are lost in the early convolutional layers.
Ranked #218 on
Object Detection
on COCO test-dev
2 code implementations • CVPR 2017 • Gunnar A. Sigurdsson, Santosh Divvala, Ali Farhadi, Abhinav Gupta
Actions are more than just movements and trajectories: we cook to eat and we hold a cup to drink from it.
Ranked #15 on
Action Detection
on Charades
no code implementations • CVPR 2017 • Kenneth Marino, Ruslan Salakhutdinov, Abhinav Gupta
One characteristic that sets humans apart from modern learning-based computer vision algorithms is the ability to acquire knowledge about the world and use that knowledge to reason about the visual world.
1 code implementation • 5 Oct 2016 • Lerrel Pinto, James Davidson, Abhinav Gupta
Due to large number of experiences required for training, most of these approaches use a self-supervised paradigm: using sensors to measure success/failure.
no code implementations • 28 Sep 2016 • Lerrel Pinto, Abhinav Gupta
The argument of the difficulty in scalability to multiple tasks is well founded, since training these tasks often require hundreds or thousands of examples.
no code implementations • 21 Sep 2016 • Aayush Bansal, Xinlei Chen, Bryan Russell, Abhinav Gupta, Deva Ramanan
We explore architectures for general pixel-level prediction problems, from low-level edge detection to mid-level surface normal estimation to high-level semantic segmentation.
no code implementations • 18 Sep 2016 • Senthil Purushwalkam, Abhinav Gupta
We propose an unsupervised method to learn pose features from videos that exploits a signal which is complementary to appearance and can be used as supervision: motion.
2 code implementations • 16 Sep 2016 • Yuke Zhu, Roozbeh Mottaghi, Eric Kolve, Joseph J. Lim, Abhinav Gupta, Li Fei-Fei, Ali Farhadi
To address the second issue, we propose AI2-THOR framework, which provides an environment with high-quality 3D scenes and physics engine.
no code implementations • 25 Jul 2016 • Gunnar A. Sigurdsson, Olga Russakovsky, Ali Farhadi, Ivan Laptev, Abhinav Gupta
We conclude that the optimal strategy is to ask as many questions as possible in a HIT (up to 52 binary questions after watching a 30-second video clip in our experiments).
no code implementations • 25 Jun 2016 • Jacob Walker, Carl Doersch, Abhinav Gupta, Martial Hebert
We show that our method is able to successfully predict events in a wide variety of scenes and can produce multiple different predictions when the future is ambiguous.
no code implementations • CVPR 2016 • David F. Fouhey, Abhinav Gupta, Andrew Zisserman
In this paper we investigate 3D attributes as a means to understand the shape of an object in a single image.
1 code implementation • 14 Apr 2016 • Gunnar A. Sigurdsson, Xinlei Chen, Abhinav Gupta
What does a typical visit to Paris look like?
5 code implementations • CVPR 2016 • Abhinav Shrivastava, Abhinav Gupta, Ross Girshick
Our motivation is the same as it has always been -- detection datasets contain an overwhelming number of easy examples and a small number of hard examples.
Ranked #6 on
Face Verification
on Trillion Pairs Dataset
1 code implementation • CVPR 2016 • Ishan Misra, Abhinav Shrivastava, Abhinav Gupta, Martial Hebert
In this paper, we propose a principled approach to learn shared representations in ConvNets using multi-task learning.
Ranked #97 on
Semantic Segmentation
on NYU Depth v2
no code implementations • 6 Apr 2016 • Gunnar A. Sigurdsson, Gül Varol, Xiaolong Wang, Ali Farhadi, Ivan Laptev, Abhinav Gupta
Each video is annotated by multiple free-text descriptions, action labels, action intervals and classes of interacted objects.
no code implementations • 5 Apr 2016 • Lerrel Pinto, Dhiraj Gandhi, Yuanfeng Han, Yong-Lae Park, Abhinav Gupta
We argue that biological agents use physical interactions with the world to learn visual representations unlike current vision systems which just use passive observations (images and videos downloaded from web).
no code implementations • CVPR 2016 • Aayush Bansal, Bryan Russell, Abhinav Gupta
We introduce an approach that leverages surface normal predictions, along with appearance cues, to retrieve 3D models for objects depicted in 2D still images from a large CAD object library.
2 code implementations • 29 Mar 2016 • Rohit Girdhar, David F. Fouhey, Mikel Rodriguez, Abhinav Gupta
The network consists of two components: (a) an autoencoder that ensures the representation is generative; and (b) a convolutional network that ensures the representation is predictable.
no code implementations • 17 Mar 2016 • Roozbeh Mottaghi, Mohammad Rastegari, Abhinav Gupta, Ali Farhadi
To build a dataset of forces in scenes, we reconstructed all images in SUN RGB-D dataset in a physics simulator to estimate the physical movements of objects caused by external forces applied to them.
no code implementations • 17 Mar 2016 • Xiaolong Wang, Abhinav Gupta
Current generative frameworks use end-to-end learning and generate images by sampling from uniform noise distribution.
1 code implementation • CVPR 2016 • Xiaolong Wang, Ali Farhadi, Abhinav Gupta
In this paper, we propose a novel representation for actions by modeling an action as a transformation which changes the state of the environment before the action happens (precondition) to the state after the action (effect).
no code implementations • ICCV 2015 • David F. Fouhey, Wajahat Hussain, Abhinav Gupta, Martial Hebert
Do we really need 3D labels in order to learn how to predict 3D?
no code implementations • 23 Sep 2015 • Lerrel Pinto, Abhinav Gupta
Our experiments clearly show the benefit of using large-scale datasets (and multi-stage training) for the task of grasping.
no code implementations • CVPR 2015 • Xinlei Chen, Alan Ritter, Abhinav Gupta, Tom Mitchell
We present a co-clustering framework that can be used to discover multiple semantic and visual senses of a given Noun Phrase (NP).
3 code implementations • ICCV 2015 • Carl Doersch, Abhinav Gupta, Alexei A. Efros
This work explores the use of spatial context as a source of free and plentiful supervisory signal for training a rich visual representation.
no code implementations • ICCV 2015 • Xinlei Chen, Abhinav Gupta
Specifically inspired by curriculum learning, we present a two-step approach for CNN training.
no code implementations • 5 May 2015 • David F. Fouhey, Xiaolong Wang, Abhinav Gupta
The field of functional recognition or affordance estimation from images has seen a revival in recent years.
no code implementations • ICCV 2015 • Xiaolong Wang, Abhinav Gupta
Is strong supervision necessary for learning a good visual representation?
no code implementations • ICCV 2015 • Jacob Walker, Abhinav Gupta, Martial Hebert
Because our CNN model makes no assumptions about the underlying scene, it can predict future optical flow on a diverse set of scenarios.
no code implementations • 27 Apr 2015 • Aayush Bansal, Abhinav Shrivastava, Carl Doersch, Abhinav Gupta
Building on the success of recent discriminative mid-level elements, we propose a surprisingly simple approach for object detection which performs comparable to the current state-of-the-art approaches on PASCAL VOC comp-3 detection challenge (no external data).
no code implementations • 19 Jan 2015 • Naiyan Wang, Siyi Li, Abhinav Gupta, Dit-yan Yeung
To fit the characteristics of object tracking, we first pre-train the CNN to recognize what is an object, and then propose to generate a probability map instead of producing a simple class label.
no code implementations • CVPR 2015 • Xiaolong Wang, David F. Fouhey, Abhinav Gupta
We show by incorporating several constraints (man-made, manhattan world) and meaningful intermediate representations (room layout, edge labels) in the architecture leads to state of the art performance on surface normal estimation.
no code implementations • CVPR 2014 • Xinlei Chen, Abhinav Shrivastava, Abhinav Gupta
In this paper, we propose to enrich these knowledge bases by automatically discovering objects and their segmentations from noisy Internet images.
no code implementations • CVPR 2014 • Jacob Walker, Abhinav Gupta, Martial Hebert
In this paper we present a conceptually simple but surprisingly powerful method for visual prediction which combines the effectiveness of mid-level visual elements with temporal modeling.
no code implementations • NeurIPS 2013 • Carl Doersch, Abhinav Gupta, Alexei A. Efros
We also propose the Purity-Coverage plot as a principled way of experimentally analyzing and evaluating different visual discovery approaches, and compare our method against prior work on the Paris Street View dataset.
no code implementations • CVPR 2013 • Arpit Jain, Abhinav Gupta, Mikel Rodriguez, Larry S. Davis
representation for videos based on mid-level discriminative spatio-temporal patches.
no code implementations • NeurIPS 2010 • Abhinav Gupta, Martial Hebert, Takeo Kanade, David M. Blei
There has been a recent push in extraction of 3D spatial layout of scenes.
no code implementations • NeurIPS 2008 • Abhinav Gupta, Jianbo Shi, Larry S. Davis
Using an analogous reasoning, we present an approach that combines bag-of-words and spatial models to perform semantic and syntactic analysis for recognition of an object based on its internal appearance and its context.