Search Results for author: Peter Anderson

Found 28 papers, 18 papers with code

Pathdreamer: A World Model for Indoor Navigation

1 code implementation ICCV 2021 Jing Yu Koh, Honglak Lee, Yinfei Yang, Jason Baldridge, Peter Anderson

People navigating in unfamiliar buildings take advantage of myriad visual, spatial and semantic cues to efficiently achieve their navigation goals.

Semantic Segmentation · Vision and Language Navigation

PanGEA: The Panoramic Graph Environment Annotation Toolkit

no code implementations NAACL (ALVR) 2021 Alexander Ku, Peter Anderson, Jordi Pont-Tuset, Jason Baldridge

PanGEA, the Panoramic Graph Environment Annotation toolkit, is a lightweight toolkit for collecting speech and text annotations in photo-realistic 3D environments.

On the Evaluation of Vision-and-Language Navigation Instructions

no code implementations EACL 2021 Ming Zhao, Peter Anderson, Vihan Jain, Su Wang, Alexander Ku, Jason Baldridge, Eugene Ie

Vision-and-Language Navigation wayfinding agents can be enhanced by exploiting automatically generated navigation instructions.

Vision and Language Navigation

Where Are You? Localization from Embodied Dialog

2 code implementations EMNLP 2020 Meera Hahn, Jacob Krantz, Dhruv Batra, Devi Parikh, James M. Rehg, Stefan Lee, Peter Anderson

In this paper, we focus on the LED (Localization from Embodied Dialog) task, providing a strong baseline model with detailed ablations characterizing both dataset biases and the importance of various modeling choices.

Visual Dialog

Sim-to-Real Transfer for Vision-and-Language Navigation

1 code implementation 7 Nov 2020 Peter Anderson, Ayush Shrivastava, Joanne Truong, Arjun Majumdar, Devi Parikh, Dhruv Batra, Stefan Lee

We study the challenging problem of releasing a robot in a previously unseen environment, and having it follow unconstrained natural language navigation instructions.

Vision and Language Navigation

Extended Abstract: Improving Vision-and-Language Navigation with Image-Text Pairs from the Web

no code implementations ICML Workshop LaReL 2020 Arjun Majumdar, Ayush Shrivastava, Stefan Lee, Peter Anderson, Devi Parikh, Dhruv Batra

Following a navigation instruction such as 'Walk down the stairs and stop near the sofa' requires an agent to ground scene elements referenced via language (e.g. 'stairs') to visual content in the environment (pixels corresponding to 'stairs').

Vision and Language Navigation

Improving Vision-and-Language Navigation with Image-Text Pairs from the Web

1 code implementation ECCV 2020 Arjun Majumdar, Ayush Shrivastava, Stefan Lee, Peter Anderson, Devi Parikh, Dhruv Batra

Following a navigation instruction such as 'Walk down the stairs and stop at the brown sofa' requires embodied AI agents to ground scene elements referenced via language (e.g. 'stairs') to visual content in the environment (pixels corresponding to 'stairs').

Vision and Language Navigation
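
As a toy illustration of the grounding step described in this abstract (not the paper's web-pretrained transformer model), the sketch below scores detected image regions against a phrase embedding by cosine similarity; the embeddings, dimensions, and random inputs are all hypothetical.

```python
import numpy as np

def ground_phrase(phrase_emb, region_embs):
    """Score image regions against a phrase embedding by cosine similarity.

    phrase_emb:  (d,) embedding of a phrase such as 'stairs'
    region_embs: (k, d) embeddings of k detected image regions
    Returns the index of the best-matching region and all scores.
    """
    p = phrase_emb / np.linalg.norm(phrase_emb)
    r = region_embs / np.linalg.norm(region_embs, axis=1, keepdims=True)
    scores = r @ p                      # cosine similarity per region
    return int(scores.argmax()), scores

# Hypothetical 64-d embeddings for one phrase and five regions.
rng = np.random.default_rng(1)
best, scores = ground_phrase(rng.normal(size=64), rng.normal(size=(5, 64)))
print(best, scores.round(2))
```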

Chasing Ghosts: Instruction Following as Bayesian State Tracking

1 code implementation NeurIPS 2019 Peter Anderson, Ayush Shrivastava, Devi Parikh, Dhruv Batra, Stefan Lee

Our experiments show that our approach outperforms a strong LingUNet baseline when predicting the goal location on the map.

Vision and Language Navigation
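
The title frames instruction following as Bayesian state tracking. As a minimal sketch of that framing, the discrete Bayes filter below maintains a belief over map cells; the motion and observation models here are hand-written toys, whereas the paper's are learned, so this is illustrative only.

```python
import numpy as np

def bayes_filter_step(belief, transition, likelihood):
    """One predict-update step of a discrete Bayes filter.

    belief:     (N,) prior probability over N map cells
    transition: (N, N) motion model, transition[i, j] = P(next=i | prev=j)
    likelihood: (N,) observation model, P(obs | state=i)
    """
    predicted = transition @ belief          # predict: propagate through motion model
    posterior = likelihood * predicted       # update: weight by observation likelihood
    return posterior / posterior.sum()       # normalize to a valid distribution

# Toy example on a 4-cell map: uniform prior, noisy "stay or move right" motion.
belief = np.full(4, 0.25)
transition = np.array([[0.7, 0.0, 0.0, 0.0],
                       [0.3, 0.7, 0.0, 0.0],
                       [0.0, 0.3, 0.7, 0.0],
                       [0.0, 0.0, 0.3, 1.0]])
likelihood = np.array([0.1, 0.1, 0.6, 0.2])  # observation favors cell 2
print(bayes_filter_step(belief, transition, likelihood))
```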

REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments

1 code implementation CVPR 2020 Yuankai Qi, Qi Wu, Peter Anderson, Xin Wang, William Yang Wang, Chunhua Shen, Anton Van Den Hengel

One of the long-term challenges of robotics is to enable robots to interact with humans in the visual world via natural language, as humans are visual animals that communicate through language.

Referring Expression · Vision and Language Navigation

nocaps: novel object captioning at scale

2 code implementations ICCV 2019 Harsh Agrawal, Karan Desai, YuFei Wang, Xinlei Chen, Rishabh Jain, Mark Johnson, Dhruv Batra, Devi Parikh, Stefan Lee, Peter Anderson

To encourage the development of image captioning models that can learn visual concepts from alternative data sources, such as object detection datasets, we present the first large-scale benchmark for this task.

Image Captioning · Object Detection

Disfluency Detection using Auto-Correlational Neural Networks

4 code implementations EMNLP 2018 Paria Jamshid Lou, Peter Anderson, Mark Johnson

In recent years, the natural language processing community has moved away from task-specific feature engineering, i.e., researchers discovering ad-hoc feature representations for various tasks, in favor of general-purpose methods that learn the input representation by themselves.

Feature Engineering

On Evaluation of Embodied Navigation Agents

9 code implementations 18 Jul 2018 Peter Anderson, Angel Chang, Devendra Singh Chaplot, Alexey Dosovitskiy, Saurabh Gupta, Vladlen Koltun, Jana Kosecka, Jitendra Malik, Roozbeh Mottaghi, Manolis Savva, Amir R. Zamir

Skillful mobile operation in three-dimensional environments is a primary topic of study in Artificial Intelligence.
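
This paper proposes the now-standard SPL metric (Success weighted by normalized inverse Path Length) for navigation agents. A minimal sketch of the metric as defined there:

```python
def spl(successes, shortest_dists, path_lengths):
    """Success weighted by Path Length (SPL), as proposed in the paper.

    successes:      list of 0/1 success indicators, one per episode
    shortest_dists: shortest-path distance from start to goal (l_i)
    path_lengths:   length of the path the agent actually took (p_i)
    """
    total = 0.0
    for s, l, p in zip(successes, shortest_dists, path_lengths):
        total += s * l / max(p, l)  # penalize success achieved via long detours
    return total / len(successes)

# An agent that succeeds via a 12 m path when the shortest route is 10 m,
# then fails its second episode outright.
print(spl([1, 0], [10.0, 8.0], [12.0, 8.0]))  # 0.4166...
```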

Face-Cap: Image Captioning using Facial Expression Analysis

1 code implementation 6 Jul 2018 Omid Mohamad Nezami, Mark Dras, Peter Anderson, Len Hamey

In this work, we present two variants of our Face-Cap model, which embed facial expression features in different ways, to generate image captions.

Image Captioning

Predicting accuracy on large datasets from smaller pilot data

no code implementations ACL 2018 Mark Johnson, Peter Anderson, Mark Dras, Mark Steedman

Because obtaining training data is often the most difficult part of an NLP or ML project, we develop methods for predicting how much data is required to achieve a desired test accuracy by extrapolating results from models trained on a small pilot training dataset.

Document Classification
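
A minimal sketch of the extrapolation idea, assuming test error follows a three-parameter inverse power law in the training-set size; the functional form and pilot numbers here are illustrative, not necessarily the exact family the paper fits.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    # Test error modeled as c + a * n^(-b): error decays toward a floor c.
    return c + a * np.power(n, -b)

# Hypothetical pilot results: training-set sizes and measured test error.
sizes = np.array([500, 1000, 2000, 4000, 8000], dtype=float)
errors = np.array([0.42, 0.35, 0.30, 0.26, 0.23])

# Fit the curve to the pilot points, then extrapolate to a much larger dataset.
params, _ = curve_fit(power_law, sizes, errors, p0=(5.0, 0.4, 0.1))
a, b, c = params
print(f"predicted error at 100k examples: {power_law(1e5, a, b, c):.3f}")
```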

Connecting Language and Vision to Actions

no code implementations ACL 2018 Peter Anderson, Abhishek Das, Qi Wu

A long-term goal of AI research is to build intelligent agents that can see the rich visual environment around us, communicate this understanding in natural language to humans and other agents, and act in a physical or embodied environment.

Image Captioning · Language Modelling +4

Partially-Supervised Image Captioning

no code implementations NeurIPS 2018 Peter Anderson, Stephen Gould, Mark Johnson

To address this problem, we teach image captioning models new visual concepts from labeled images and object detection datasets.

Image Captioning · Object Detection

Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments

7 code implementations CVPR 2018 Peter Anderson, Qi Wu, Damien Teney, Jake Bruce, Mark Johnson, Niko Sünderhauf, Ian Reid, Stephen Gould, Anton Van Den Hengel

This is significant because a robot interpreting a natural-language navigation instruction on the basis of what it sees is carrying out a vision and language process that is similar to Visual Question Answering.

Translation · Vision and Language Navigation +2

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

62 code implementations CVPR 2018 Peter Anderson, Xiaodong He, Chris Buehler, Damien Teney, Mark Johnson, Stephen Gould, Lei Zhang

Top-down visual attention mechanisms have been used extensively in image captioning and visual question answering (VQA) to enable deeper image understanding through fine-grained analysis and even multiple steps of reasoning.

Image Captioning · Visual Question Answering +1
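
Here, bottom-up attention proposes salient image regions (e.g., 36 Faster R-CNN features per image) and top-down attention weights them given a task context such as the question encoding, using additive attention. A minimal NumPy sketch of the top-down weighting step, with random values standing in for learned parameters and features:

```python
import numpy as np

def top_down_attention(region_feats, context, W_v, W_h, w_a):
    """Weight bottom-up region features by a top-down context vector.

    region_feats: (k, d_v) features for k image regions (bottom-up proposals)
    context:      (d_h,) top-down signal, e.g. the question encoding
    Returns the attended feature: a convex combination of region features.
    """
    # Unnormalized attention score per region (additive attention).
    scores = np.tanh(region_feats @ W_v + context @ W_h) @ w_a   # (k,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                                     # softmax over regions
    return weights @ region_feats                                # (d_v,)

rng = np.random.default_rng(0)
k, d_v, d_h, d_a = 36, 2048, 512, 512
attended = top_down_attention(rng.normal(size=(k, d_v)),
                              rng.normal(size=d_h),
                              rng.normal(size=(d_v, d_a)) * 0.01,  # small init
                              rng.normal(size=(d_h, d_a)) * 0.01,  # avoids tanh
                              rng.normal(size=d_a) * 0.1)          # saturation
print(attended.shape)  # (2048,)
```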

SPICE: Semantic Propositional Image Caption Evaluation

9 code implementations 29 Jul 2016 Peter Anderson, Basura Fernando, Mark Johnson, Stephen Gould

There is considerable interest in the task of automatically generating image captions.

Image Captioning
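
SPICE parses candidate and reference captions into scene graphs and scores the F-measure over their semantic proposition tuples. A minimal sketch of that tuple F-score, assuming tuples are already extracted (the full metric relies on a dependency parse and WordNet synonym matching rather than the exact-match test used here):

```python
def spice_f1(candidate_tuples, reference_tuples):
    """F-score over semantic proposition tuples (the core of SPICE).

    Tuples are objects, attributes, and relations, e.g.
    ('girl',), ('girl', 'young'), ('girl', 'standing-on', 'beach').
    """
    cand, ref = set(candidate_tuples), set(reference_tuples)
    matched = len(cand & ref)     # exact match; real SPICE also allows synonyms
    if matched == 0:
        return 0.0
    precision = matched / len(cand)
    recall = matched / len(ref)
    return 2 * precision * recall / (precision + recall)

cand = [('girl',), ('beach',), ('girl', 'standing-on', 'beach')]
ref = [('girl',), ('girl', 'young'), ('beach',), ('girl', 'standing-on', 'beach')]
print(spice_f1(cand, ref))  # 0.857...
```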
