Search Results for author: Meera Hahn

Found 12 papers, 3 papers with code

No RL, No Simulation: Learning to Navigate without Navigating

1 code implementation • NeurIPS 2021 • Meera Hahn, Devendra Chaplot, Shubham Tulsiani, Mustafa Mukadam, James M. Rehg, Abhinav Gupta

Most prior methods for learning navigation policies require access to simulation environments, as they need online policy interaction and rely on ground-truth maps for rewards.

Navigate • Reinforcement Learning (RL)

Where Are You? Localization from Embodied Dialog

2 code implementations • EMNLP 2020 • Meera Hahn, Jacob Krantz, Dhruv Batra, Devi Parikh, James M. Rehg, Stefan Lee, Peter Anderson

In this paper, we focus on the LED task -- providing a strong baseline model with detailed ablations characterizing both dataset biases and the importance of various modeling choices.

Navigate • Visual Dialog

Text and Click inputs for unambiguous open vocabulary instance segmentation

1 code implementation • 24 Nov 2023 • Nikolai Warner, Meera Hahn, Jonathan Huang, Irfan Essa, Vighnesh Birodkar

We propose a new segmentation process, Text + Click segmentation, where a model takes as input an image, a text phrase describing a class to segment, and a single foreground click specifying the instance to segment.

Instance Segmentation • Segmentation • +1
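
The input contract described in the abstract is concrete enough to sketch. Below is a minimal, illustrative PyTorch module that fuses image features, a projected text embedding, and a one-hot click heatmap into a per-pixel mask; the class name, dimensions, and concatenation-based fusion are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TextClickSegmenter(nn.Module):
    """Toy sketch of the Text + Click input signature: an image, a text
    embedding for the class phrase, and one foreground click.
    The fusion scheme here is an assumption, not the paper's model."""

    def __init__(self, text_dim=512, feat_dim=64):
        super().__init__()
        self.backbone = nn.Conv2d(3, feat_dim, kernel_size=3, padding=1)
        self.text_proj = nn.Linear(text_dim, feat_dim)
        # +1 input channel for the click heatmap
        self.head = nn.Conv2d(feat_dim * 2 + 1, 1, kernel_size=1)

    def forward(self, image, text_emb, click_xy):
        b, _, h, w = image.shape
        feats = self.backbone(image)                        # (B, F, H, W)
        text = self.text_proj(text_emb)[:, :, None, None]   # (B, F, 1, 1)
        text = text.expand(-1, -1, h, w)                    # broadcast over pixels
        click = torch.zeros(b, 1, h, w)                     # one-hot click heatmap
        for i, (x, y) in enumerate(click_xy):
            click[i, 0, y, x] = 1.0
        fused = torch.cat([feats, text, click], dim=1)
        return torch.sigmoid(self.head(fused))              # (B, 1, H, W) mask
```

Encoding the click as an extra input channel is one common way to inject point prompts; the paper's actual model may combine these signals differently.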

Deep Tracking: Visual Tracking Using Deep Convolutional Networks

no code implementations • 13 Dec 2015 • Meera Hahn, Si Chen, Afshin Dehghan

In this paper, we study a discriminatively trained deep convolutional network for the task of visual tracking.

Visual Tracking
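
As a rough sketch of tracking-by-detection with a discriminatively trained CNN, the toy scorer below classifies candidate patches as target vs. background and moves the track to the best-scoring window each frame; the architecture and candidate-sampling scheme are assumptions, not the paper's network.

```python
import torch
import torch.nn as nn

class PatchScorer(nn.Module):
    """Toy discriminative scorer: does a patch contain the target?"""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 1),
        )

    def forward(self, patches):                 # (N, 3, H, W) candidate crops
        return self.net(patches).squeeze(-1)    # (N,) target logits

def track_step(scorer, candidate_patches, candidate_boxes):
    """One tracking-by-detection step: pick the highest-scoring window."""
    with torch.no_grad():
        scores = scorer(candidate_patches)
    return candidate_boxes[scores.argmax()]
```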

Learning to Localize and Align Fine-Grained Actions to Sparse Instructions

no code implementations • 22 Sep 2018 • Meera Hahn, Nataniel Ruiz, Jean-Baptiste Alayrac, Ivan Laptev, James M. Rehg

Automatic generation of textual video descriptions that are time-aligned with video content is a long-standing goal in computer vision.

Object • Object Recognition

Action2Vec: A Crossmodal Embedding Approach to Action Learning

no code implementations • 2 Jan 2019 • Meera Hahn, Andrew Silva, James M. Rehg

We describe a novel cross-modal embedding space for actions, named Action2Vec, which combines linguistic cues from class labels with spatio-temporal features derived from video clips.

Action Recognition • General Classification • +2
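
The two-tower structure implied by the abstract can be sketched directly: project label-text features and spatio-temporal video features into one shared space and train matching pairs to align. Everything below (names, dimensions, the contrastive-style loss) is an illustrative assumption rather than the paper's exact recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoTowerEmbedding(nn.Module):
    """Toy cross-modal embedding: label-text features on one side,
    spatio-temporal video features on the other, one shared space."""
    def __init__(self, text_dim=300, video_dim=1024, embed_dim=256):
        super().__init__()
        self.text_tower = nn.Linear(text_dim, embed_dim)
        self.video_tower = nn.Linear(video_dim, embed_dim)

    def forward(self, text_feat, video_feat):
        t = F.normalize(self.text_tower(text_feat), dim=-1)
        v = F.normalize(self.video_tower(video_feat), dim=-1)
        return t, v

def alignment_loss(t, v, temperature=0.07):
    """Matching (text, video) pairs on the diagonal should score highest."""
    logits = t @ v.T / temperature
    labels = torch.arange(len(t))
    return F.cross_entropy(logits, labels)
```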

Tripping through time: Efficient Localization of Activities in Videos

no code implementations • 22 Apr 2019 • Meera Hahn, Asim Kadav, James M. Rehg, Hans Peter Graf

Localizing moments in untrimmed videos via language queries is a new and interesting task that requires the ability to accurately ground language into video.
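
One simple way to make the grounding problem concrete is sliding-window scoring: embed each candidate temporal segment and compare it to the query embedding. The sketch below does exactly that over assumed pre-extracted features; it is a naive illustrative baseline, not the paper's method, whose title suggests its contribution is precisely to localize more efficiently than exhaustive search.

```python
import torch
import torch.nn.functional as F

def localize_moment(clip_feats, query_emb, window_sizes=(4, 8, 16)):
    """Illustrative sliding-window grounding: score each candidate
    segment (mean-pooled clip features, shape (T, D)) against the
    query embedding (shape (D,)) and return the best (start, end)."""
    t = clip_feats.shape[0]
    best, best_score = None, -float("inf")
    for w in window_sizes:
        for s in range(0, t - w + 1):
            seg = clip_feats[s:s + w].mean(dim=0)
            score = F.cosine_similarity(seg, query_emb, dim=0)
            if score > best_score:
                best, best_score = (s, s + w), score
    return best, best_score
```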

Transformer-based Localization from Embodied Dialog with Large-scale Pre-training

no code implementations • 10 Oct 2022 • Meera Hahn, James M. Rehg

We address the challenging task of Localization via Embodied Dialog (LED).
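
The LED setup itself is easy to state in code: encode the dialog and predict where the observer is on a map. The sketch below uses a small Transformer encoder over dialog tokens and a classifier over discretized map cells; the tokenization, encoder size, and grid discretization are all assumptions for illustration, not the paper's pretrained model.

```python
import torch
import torch.nn as nn

class LEDLocalizer(nn.Module):
    """Illustrative LED sketch: dialog tokens in, a distribution
    over cells of a top-down map out."""
    def __init__(self, vocab_size=10000, d_model=128, map_cells=32 * 32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, map_cells)

    def forward(self, dialog_tokens):             # (B, T) token ids
        h = self.encoder(self.embed(dialog_tokens))
        return self.head(h.mean(dim=1))           # (B, map_cells) logits
```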

Learning a Visually Grounded Memory Assistant

no code implementations • 7 Oct 2022 • Meera Hahn, Kevin Carlberg, Ruta Desai, James Hillis

We introduce a novel interface for large-scale collection of human memory and assistance.

Which way is `right'?: Uncovering limitations of Vision-and-Language Navigation model

no code implementations • 30 Nov 2023 • Meera Hahn, Amit Raj, James M. Rehg

The challenging task of Vision-and-Language Navigation (VLN) requires embodied agents to follow natural language instructions to reach a goal location or object (e.g. 'walk down the hallway and turn left at the piano').

Vision and Language Navigation
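
For readers unfamiliar with the task structure, a VLN episode is an interaction loop: the agent receives an instruction and observations and emits actions until it stops. The sketch below shows that loop with deliberately trivial stand-ins; the ToyEnv and RandomAgent interfaces are hypothetical and exist only to make the protocol concrete (a real agent would condition on the instruction rather than ignore it).

```python
import random

ACTIONS = ["forward", "turn_left", "turn_right", "stop"]

class ToyEnv:
    """Stand-in environment: a 1-D corridor. Real VLN environments
    provide panoramic observations of photorealistic scenes."""
    def reset(self):
        self.pos = 0
        return self.pos
    def step(self, action):
        self.pos += 1 if action == "forward" else 0
        return self.pos
    def current_position(self):
        return self.pos

class RandomAgent:
    """Trivial baseline agent: ignores the instruction entirely."""
    def act(self, instruction, observation):
        return random.choice(ACTIONS)

def run_episode(env, agent, instruction, max_steps=50):
    obs = env.reset()
    for _ in range(max_steps):
        action = agent.act(instruction, obs)
        if action == "stop":
            break
        obs = env.step(action)
    return env.current_position()

print(run_episode(ToyEnv(), RandomAgent(), "walk down the hallway"))
```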
