Search Results for author: Meera Hahn

Found 12 papers, 3 papers with code

No RL, No Simulation: Learning to Navigate without Navigating

1 code implementation • NeurIPS 2021 • Meera Hahn, Devendra Chaplot, Shubham Tulsiani, Mustafa Mukadam, James M. Rehg, Abhinav Gupta

Most prior methods for learning navigation policies require access to simulation environments, as they need online policy interaction and rely on ground-truth maps for rewards.

Navigate • Reinforcement Learning (RL)

Where Are You? Localization from Embodied Dialog

2 code implementations • EMNLP 2020 • Meera Hahn, Jacob Krantz, Dhruv Batra, Devi Parikh, James M. Rehg, Stefan Lee, Peter Anderson

In this paper, we focus on the LED task -- providing a strong baseline model with detailed ablations characterizing both dataset biases and the importance of various modeling choices.

Navigate • Visual Dialog

Text and Click inputs for unambiguous open vocabulary instance segmentation

1 code implementation • 24 Nov 2023 • Nikolai Warner, Meera Hahn, Jonathan Huang, Irfan Essa, Vighnesh Birodkar

We propose a new segmentation process, Text + Click segmentation, where a model takes as input an image, a text phrase describing a class to segment, and a single foreground click specifying the instance to segment.

Instance Segmentation • Segmentation • +1
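
The input contract described in the abstract is concrete enough to sketch. Below is a minimal, illustrative PyTorch module that fuses image features, a projected text embedding, and a one-hot click heatmap into a per-pixel mask; the class name, dimensions, and concatenation-based fusion are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TextClickSegmenter(nn.Module):
    """Toy sketch of the Text + Click input signature: an image, a text
    embedding for the class phrase, and one foreground click.
    The fusion scheme here is an assumption, not the paper's model."""

    def __init__(self, text_dim=512, feat_dim=64):
        super().__init__()
        self.backbone = nn.Conv2d(3, feat_dim, kernel_size=3, padding=1)
        self.text_proj = nn.Linear(text_dim, feat_dim)
        # +1 input channel for the click heatmap
        self.head = nn.Conv2d(feat_dim * 2 + 1, 1, kernel_size=1)

    def forward(self, image, text_emb, click_xy):
        b, _, h, w = image.shape
        feats = self.backbone(image)                        # (B, F, H, W)
        text = self.text_proj(text_emb)[:, :, None, None]   # (B, F, 1, 1)
        text = text.expand(-1, -1, h, w)                    # broadcast over pixels
        click = torch.zeros(b, 1, h, w)                     # one-hot click heatmap
        for i, (x, y) in enumerate(click_xy):
            click[i, 0, y, x] = 1.0
        fused = torch.cat([feats, text, click], dim=1)
        return torch.sigmoid(self.head(fused))              # (B, 1, H, W) mask
```

Encoding the click as an extra input channel is one common way to inject point prompts; the paper's actual model may combine these signals differently.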

Deep Tracking: Visual Tracking Using Deep Convolutional Networks

no code implementations • 13 Dec 2015 • Meera Hahn, Si Chen, Afshin Dehghan

In this paper, we study a discriminatively trained deep convolutional network for the task of visual tracking.

Visual Tracking
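
As a rough sketch of tracking-by-detection with a discriminatively trained CNN, the toy scorer below classifies candidate patches as target vs. background and moves the track to the best-scoring window each frame; the architecture and candidate-sampling scheme are assumptions, not the paper's network.

```python
import torch
import torch.nn as nn

class PatchScorer(nn.Module):
    """Toy discriminative scorer: does a patch contain the target?"""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 1),
        )

    def forward(self, patches):                 # (N, 3, H, W) candidate crops
        return self.net(patches).squeeze(-1)    # (N,) target logits

def track_step(scorer, candidate_patches, candidate_boxes):
    """One tracking-by-detection step: pick the highest-scoring window."""
    with torch.no_grad():
        scores = scorer(candidate_patches)
    return candidate_boxes[scores.argmax()]
```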

Learning to Localize and Align Fine-Grained Actions to Sparse Instructions

no code implementations • 22 Sep 2018 • Meera Hahn, Nataniel Ruiz, Jean-Baptiste Alayrac, Ivan Laptev, James M. Rehg

Automatic generation of textual video descriptions that are time-aligned with video content is a long-standing goal in computer vision.

Object • Object Recognition

Action2Vec: A Crossmodal Embedding Approach to Action Learning

no code implementations • 2 Jan 2019 • Meera Hahn, Andrew Silva, James M. Rehg

We describe a novel cross-modal embedding space for actions, named Action2Vec, which combines linguistic cues from class labels with spatio-temporal features derived from video clips.

Action Recognition • General Classification • +2
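
The two-tower structure implied by the abstract can be sketched directly: project label-text features and spatio-temporal video features into one shared space and train matching pairs to align. Everything below (names, dimensions, the contrastive-style loss) is an illustrative assumption rather than the paper's exact recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoTowerEmbedding(nn.Module):
    """Toy cross-modal embedding: label-text features on one side,
    spatio-temporal video features on the other, one shared space."""
    def __init__(self, text_dim=300, video_dim=1024, embed_dim=256):
        super().__init__()
        self.text_tower = nn.Linear(text_dim, embed_dim)
        self.video_tower = nn.Linear(video_dim, embed_dim)

    def forward(self, text_feat, video_feat):
        t = F.normalize(self.text_tower(text_feat), dim=-1)
        v = F.normalize(self.video_tower(video_feat), dim=-1)
        return t, v

def alignment_loss(t, v, temperature=0.07):
    """Matching (text, video) pairs on the diagonal should score highest."""
    logits = t @ v.T / temperature
    labels = torch.arange(len(t))
    return F.cross_entropy(logits, labels)
```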

Tripping through time: Efficient Localization of Activities in Videos

no code implementations • 22 Apr 2019 • Meera Hahn, Asim Kadav, James M. Rehg, Hans Peter Graf

Localizing moments in untrimmed videos via language queries is a new and interesting task that requires the ability to accurately ground language into video.
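
One simple way to make the grounding problem concrete is sliding-window scoring: embed each candidate temporal segment and compare it to the query embedding. The sketch below does exactly that over assumed pre-extracted features; it is a naive illustrative baseline, not the paper's method, whose title suggests its contribution is precisely to localize more efficiently than exhaustive search.

```python
import torch
import torch.nn.functional as F

def localize_moment(clip_feats, query_emb, window_sizes=(4, 8, 16)):
    """Illustrative sliding-window grounding: score each candidate
    segment (mean-pooled clip features, shape (T, D)) against the
    query embedding (shape (D,)) and return the best (start, end)."""
    t = clip_feats.shape[0]
    best, best_score = None, -float("inf")
    for w in window_sizes:
        for s in range(0, t - w + 1):
            seg = clip_feats[s:s + w].mean(dim=0)
            score = F.cosine_similarity(seg, query_emb, dim=0)
            if score > best_score:
                best, best_score = (s, s + w), score
    return best, best_score
```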

Transformer-based Localization from Embodied Dialog with Large-scale Pre-training

no code implementations • 10 Oct 2022 • Meera Hahn, James M. Rehg

We address the challenging task of Localization via Embodied Dialog (LED).
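
The LED setup itself is easy to state in code: encode the dialog and predict where the observer is on a map. The sketch below uses a small Transformer encoder over dialog tokens and a classifier over discretized map cells; the tokenization, encoder size, and grid discretization are all assumptions for illustration, not the paper's pretrained model.

```python
import torch
import torch.nn as nn

class LEDLocalizer(nn.Module):
    """Illustrative LED sketch: dialog tokens in, a distribution
    over cells of a top-down map out."""
    def __init__(self, vocab_size=10000, d_model=128, map_cells=32 * 32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, map_cells)

    def forward(self, dialog_tokens):             # (B, T) token ids
        h = self.encoder(self.embed(dialog_tokens))
        return self.head(h.mean(dim=1))           # (B, map_cells) logits
```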

Learning a Visually Grounded Memory Assistant

no code implementations • 7 Oct 2022 • Meera Hahn, Kevin Carlberg, Ruta Desai, James Hillis

We introduce a novel interface for large-scale collection of human memory and assistance.

Which way is `right'?: Uncovering limitations of Vision-and-Language Navigation model

no code implementations • 30 Nov 2023 • Meera Hahn, Amit Raj, James M. Rehg

The challenging task of Vision-and-Language Navigation (VLN) requires embodied agents to follow natural language instructions to reach a goal location or object (e.g. 'walk down the hallway and turn left at the piano').

Vision and Language Navigation
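
For readers unfamiliar with the task structure, a VLN episode is an interaction loop: the agent receives an instruction and observations and emits actions until it stops. The sketch below shows that loop with deliberately trivial stand-ins; the ToyEnv and RandomAgent interfaces are hypothetical and exist only to make the protocol concrete (a real agent would condition on the instruction rather than ignore it).

```python
import random

ACTIONS = ["forward", "turn_left", "turn_right", "stop"]

class ToyEnv:
    """Stand-in environment: a 1-D corridor. Real VLN environments
    provide panoramic observations of photorealistic scenes."""
    def reset(self):
        self.pos = 0
        return self.pos
    def step(self, action):
        self.pos += 1 if action == "forward" else 0
        return self.pos
    def current_position(self):
        return self.pos

class RandomAgent:
    """Trivial baseline agent: ignores the instruction entirely."""
    def act(self, instruction, observation):
        return random.choice(ACTIONS)

def run_episode(env, agent, instruction, max_steps=50):
    obs = env.reset()
    for _ in range(max_steps):
        action = agent.act(instruction, obs)
        if action == "stop":
            break
        obs = env.step(action)
    return env.current_position()

print(run_episode(ToyEnv(), RandomAgent(), "walk down the hallway"))
```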
