Search Results for author: Abhishek Das

Found 39 papers, 18 papers with code

Habitat-Web: Learning Embodied Object-Search Strategies from Human Demonstrations at Scale

no code implementations 7 Apr 2022 Ram Ramrakhya, Eric Undersander, Dhruv Batra, Abhishek Das

We present a large-scale study of imitating human demonstrations on tasks that require a virtual robot to search for objects in new environments -- (1) ObjectGoal Navigation (e.g. 'find & go to a chair') and (2) Pick&Place (e.g. 'find mug, pick mug, find counter, place mug on counter').

Imitation Learning

How Do Graph Networks Generalize to Large and Diverse Molecular Systems?

no code implementations 6 Apr 2022 Johannes Gasteiger, Muhammed Shuaibi, Anuroop Sriram, Stephan Günnemann, Zachary Ulissi, C. Lawrence Zitnick, Abhishek Das

Based on this analysis, we identify a smaller dataset that correlates well with the full OC20 dataset, and propose the GemNet-OC model, which outperforms the previous state-of-the-art on OC20 by 16%, while reducing training time by a factor of 10.

Towards Training Billion Parameter Graph Neural Networks for Atomic Simulations

no code implementations ICLR 2022 Anuroop Sriram, Abhishek Das, Brandon M. Wood, Siddharth Goyal, C. Lawrence Zitnick

Recent progress in Graph Neural Networks (GNNs) for modeling atomic simulations has the potential to revolutionize catalyst discovery, which is a key step in making progress towards the energy breakthroughs needed to combat climate change.

NarrationBot and InfoBot: A Hybrid System for Automated Video Description

no code implementations 7 Nov 2021 Shasta Ihorn, Yue-Ting Siu, Aditya Bodi, Lothar Narins, Jose M. Castanon, Yash Kant, Abhishek Das, Ilmi Yoon, Pooyan Fazli

To overcome the increasing gaps in video accessibility, we developed a hybrid system of two tools to 1) automatically generate descriptions for videos and 2) provide answers or additional descriptions in response to user queries on a video.

Video Description

Rotation Invariant Graph Neural Networks using Spin Convolutions

no code implementations 17 Jun 2021 Muhammed Shuaibi, Adeesh Kolluru, Abhishek Das, Aditya Grover, Anuroop Sriram, Zachary Ulissi, C. Lawrence Zitnick

We introduce a novel approach to modeling angular information between sets of neighboring atoms in a graph neural network.


ForceNet: A Graph Neural Network for Large-Scale Quantum Calculations

no code implementations 2 Mar 2021 Weihua Hu, Muhammed Shuaibi, Abhishek Das, Siddharth Goyal, Anuroop Sriram, Jure Leskovec, Devi Parikh, C. Lawrence Zitnick

By not imposing explicit physical constraints, we can flexibly design expressive models while maintaining their computational efficiency.

Data Augmentation

Multi-Image Steganography Using Deep Neural Networks

1 code implementation 2 Jan 2021 Abhishek Das, Japsimar Singh Wahi, Mansi Anand, Yugant Rana

Steganography is the science of hiding a secret message within an ordinary public message.

Image Steganography

ForceNet: A Graph Neural Network for Large-Scale Quantum Chemistry Simulation

no code implementations 1 Jan 2021 Weihua Hu, Muhammed Shuaibi, Abhishek Das, Siddharth Goyal, Anuroop Sriram, Jure Leskovec, Devi Parikh, Larry Zitnick

We use ForceNet to perform quantum chemistry simulations, where ForceNet is able to achieve a 4x higher success rate than existing ML models.

Auxiliary Tasks and Exploration Enable ObjectGoal Navigation

no code implementations ICCV 2021 Joel Ye, Dhruv Batra, Abhishek Das, Erik Wijmans

We instead re-enable a generic learned agent by adding auxiliary learning tasks and an exploration reward.

Auxiliary Learning

Detecting Hate Speech in Multi-modal Memes

1 code implementation 29 Dec 2020 Abhishek Das, Japsimar Singh Wahi, SiYao Li

A crucial characteristic of the challenge is that it includes "benign confounders" to counter the possibility of models exploiting unimodal priors.

Hate Speech Detection, Image Captioning, +4

Smart Refrigerator using Internet of Things and Android

1 code implementation 18 Dec 2020 Abhishek Das, Vivek Dhuri, Ranjushree Pal

The kitchen is regarded as the central unit of traditional as well as modern homes.

Human-Computer Interaction

The Open Catalyst 2020 (OC20) Dataset and Community Challenges

2 code implementations 20 Oct 2020 Lowik Chanussot, Abhishek Das, Siddharth Goyal, Thibaut Lavril, Muhammed Shuaibi, Morgane Riviere, Kevin Tran, Javier Heras-Domingo, Caleb Ho, Weihua Hu, Aini Palizhati, Anuroop Sriram, Brandon Wood, Junwoong Yoon, Devi Parikh, C. Lawrence Zitnick, Zachary Ulissi

Catalyst discovery and optimization is key to solving many societal and energy challenges including solar fuels synthesis, long-term energy storage, and renewable fertilizer production.

An Introduction to Electrocatalyst Design using Machine Learning for Renewable Energy Storage

no code implementations 14 Oct 2020 C. Lawrence Zitnick, Lowik Chanussot, Abhishek Das, Siddharth Goyal, Javier Heras-Domingo, Caleb Ho, Weihua Hu, Thibaut Lavril, Aini Palizhati, Morgane Riviere, Muhammed Shuaibi, Anuroop Sriram, Kevin Tran, Brandon Wood, Junwoong Yoon, Devi Parikh, Zachary Ulissi

As we increase our reliance on renewable energy sources such as wind and solar, which produce intermittent power, storage is needed to transfer power from times of peak generation to peak demand.

Auxiliary Tasks Speed Up Learning PointGoal Navigation

1 code implementation 9 Jul 2020 Joel Ye, Dhruv Batra, Erik Wijmans, Abhishek Das

PointGoal Navigation is an embodied task that requires agents to navigate to a specified point in an unseen environment.

PointGoal Navigation

Feel The Music: Automatically Generating A Dance For An Input Song

1 code implementation 21 Jun 2020 Purva Tendulkar, Abhishek Das, Aniruddha Kembhavi, Devi Parikh

We encode intuitive, flexible heuristics for what a 'good' dance is: the structure of the dance should align with the structure of the music.

Probing Emergent Semantics in Predictive Agents via Question Answering

no code implementations ICML 2020 Abhishek Das, Federico Carnevale, Hamza Merzic, Laura Rimell, Rosalia Schneider, Josh Abramson, Alden Hung, Arun Ahuja, Stephen Clark, Gregory Wayne, Felix Hill

Recent work has shown how predictive modeling can endow agents with rich knowledge of their surroundings, improving their ability to act in complex environments.

Question Answering

Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline

1 code implementation ECCV 2020 Vishvak Murahari, Dhruv Batra, Devi Parikh, Abhishek Das

Next, we find that additional finetuning using "dense" annotations in VisDial leads to even higher NDCG -- more than 10% over our base model -- but hurts MRR -- more than 17% below our base model!

Language Modelling, Representation Learning, +1

DS-VIC: Unsupervised Discovery of Decision States for Transfer in RL

no code implementations 25 Sep 2019 Nirbhay Modhe, Prithvijit Chattopadhyay, Mohit Sharma, Abhishek Das, Devi Parikh, Dhruv Batra, Ramakrishna Vedantam

We learn to identify decision states, namely the parsimonious set of states where decisions meaningfully affect the future states an agent can reach in an environment.

Improving Generative Visual Dialog by Answering Diverse Questions

1 code implementation IJCNLP 2019 Vishvak Murahari, Prithvijit Chattopadhyay, Dhruv Batra, Devi Parikh, Abhishek Das

Prior work on training generative Visual Dialog models with reinforcement learning (Das et al.) has explored a Qbot-Abot image-guessing game and shown that this 'self-talk' approach can lead to improved performance at the downstream dialog-conditioned image-guessing task.

Representation Learning, Visual Dialog

Embodied Question Answering in Photorealistic Environments with Point Cloud Perception

no code implementations CVPR 2019 Erik Wijmans, Samyak Datta, Oleksandr Maksymets, Abhishek Das, Georgia Gkioxari, Stefan Lee, Irfan Essa, Devi Parikh, Dhruv Batra

To help bridge the gap between internet vision-style problems and the goal of vision for embodied perception, we instantiate a large-scale navigation task -- Embodied Question Answering [1] -- in photo-realistic environments (Matterport 3D).

Embodied Question Answering, Question Answering

Response to "Visual Dialogue without Vision or Dialogue" (Massiceti et al., 2018)

no code implementations 16 Jan 2019 Abhishek Das, Devi Parikh, Dhruv Batra

In a recent workshop paper, Massiceti et al. presented a baseline model and subsequent critique of Visual Dialog (Das et al., CVPR 2017) that raises what we believe to be unfounded concerns about the dataset and evaluation.

Visual Dialog

TarMAC: Targeted Multi-Agent Communication

no code implementations ICLR 2019 Abhishek Das, Théophile Gervet, Joshua Romoff, Dhruv Batra, Devi Parikh, Michael Rabbat, Joelle Pineau

We propose a targeted communication architecture for multi-agent reinforcement learning, where agents learn both what messages to send and whom to address them to while performing cooperative tasks in partially-observable environments.

Multi-agent Reinforcement Learning

Neural Modular Control for Embodied Question Answering

2 code implementations 26 Oct 2018 Abhishek Das, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra

We use imitation learning to warm-start policies at each level of the hierarchy, dramatically increasing sample efficiency, followed by reinforcement learning.

Embodied Question Answering, Imitation Learning, +2

Connecting Language and Vision to Actions

no code implementations ACL 2018 Peter Anderson, Abhishek Das, Qi Wu

A long-term goal of AI research is to build intelligent agents that can see the rich visual environment around us, communicate this understanding in natural language to humans and other agents, and act in a physical or embodied environment.

Image Captioning, Language Modelling, +4

Embodied Question Answering

4 code implementations CVPR 2018 Abhishek Das, Samyak Datta, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra

We present a new AI task -- Embodied Question Answering (EmbodiedQA) -- where an agent is spawned at a random location in a 3D environment and asked a question ("What color is the car?").

Embodied Question Answering, Question Answering, +1

Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning

6 code implementations ICCV 2017 Abhishek Das, Satwik Kottur, José M. F. Moura, Stefan Lee, Dhruv Batra

Specifically, we pose a cooperative 'image guessing' game between two agents -- Qbot and Abot -- who communicate in natural language dialog so that Qbot can select an unseen image from a lineup of images.

reinforcement-learning, Visual Dialog, +1

Visual Dialog

10 code implementations CVPR 2017 Abhishek Das, Satwik Kottur, Khushi Gupta, Avi Singh, Deshraj Yadav, José M. F. Moura, Devi Parikh, Dhruv Batra

We introduce the task of Visual Dialog, which requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content.

Chatbot, Visual Dialog

Grad-CAM: Why did you say that?

1 code implementation 22 Nov 2016 Ramprasaath R. Selvaraju, Abhishek Das, Ramakrishna Vedantam, Michael Cogswell, Devi Parikh, Dhruv Batra

We propose a technique for making Convolutional Neural Network (CNN)-based models more transparent by visualizing input regions that are 'important' for predictions -- or visual explanations.

Image Captioning, Visual Question Answering, +1

Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions?

no code implementations 17 Jun 2016 Abhishek Das, Harsh Agrawal, C. Lawrence Zitnick, Devi Parikh, Dhruv Batra

We conduct large-scale studies on 'human attention' in Visual Question Answering (VQA) to understand where humans choose to look to answer questions about images.

Question Answering, Visual Question Answering, +1
