Search Results for author: Naoki Yokoyama

Found 7 papers, 2 papers with code

Success Weighted by Completion Time: A Dynamics-Aware Evaluation Criteria for Embodied Navigation

no code implementations • 14 Mar 2021 • Naoki Yokoyama, Sehoon Ha, Dhruv Batra

Several related works on navigation have used Success weighted by Path Length (SPL) as the primary method of evaluating the path an agent makes to a goal location, but SPL is limited in its ability to properly evaluate agents with complex dynamics.

Tasks: Navigate
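For reference, a minimal sketch of SPL and of a completion-time analogue in the spirit of this paper's proposal. The function and argument names are illustrative, and the dynamics-aware fastest-time lower bound (the paper's contribution) is taken as a given input rather than computed:

```python
def spl(successes, shortest_lengths, agent_lengths):
    """Success weighted by Path Length: each successful episode is
    credited with (shortest path length / actual path length)."""
    return sum(
        s * l / max(p, l)
        for s, l, p in zip(successes, shortest_lengths, agent_lengths)
    ) / len(successes)


def sct(successes, fastest_times, agent_times):
    """A completion-time counterpart: the same ratio taken over time
    rather than distance, so an agent with sluggish dynamics is
    penalized even when its path is short. How the dynamics-aware
    fastest time is computed is not reproduced here."""
    return sum(
        s * c / max(t, c)
        for s, c, t in zip(successes, fastest_times, agent_times)
    ) / len(successes)
```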

Benchmarking Augmentation Methods for Learning Robust Navigation Agents: the Winning Entry of the 2021 iGibson Challenge

no code implementations • 22 Sep 2021 • Naoki Yokoyama, Qian Luo, Dhruv Batra, Sehoon Ha

Recent advances in deep reinforcement learning and scalable photorealistic simulation have led to increasingly mature embodied AI for various visual tasks, including navigation.

Tasks: Benchmarking, Image Augmentation (+4 more)
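The snippet above is high level; concretely, visual augmentation for navigation agents typically means perturbing the egocentric RGB frames the policy is trained on. A minimal torchvision sketch; the specific transforms and parameters are illustrative, not the winning entry's recipe:

```python
import torch
from torchvision import transforms

# Illustrative observation-augmentation pipeline for egocentric RGB
# frames; the augmentations actually benchmarked in the paper may differ.
augment = transforms.Compose([
    transforms.ToPILImage(),
    transforms.ColorJitter(brightness=0.3, contrast=0.3, saturation=0.3),
    transforms.RandomResizedCrop(size=(256, 256), scale=(0.8, 1.0)),
    transforms.ToTensor(),
])

frame = torch.rand(3, 256, 256)  # stand-in for a simulator frame
augmented = augment(frame)       # perturbed view fed to the policy
```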

Is Mapping Necessary for Realistic PointGoal Navigation?

1 code implementation • CVPR 2022 • Ruslan Partsey, Erik Wijmans, Naoki Yokoyama, Oles Dobosevych, Dhruv Batra, Oleksandr Maksymets

However, for PointNav in a realistic setting (RGB-D and actuation noise, no GPS+Compass), this is an open question; one we tackle in this paper.

Tasks: Data Augmentation, Navigate (+3 more)
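In the realistic setting named above (no GPS+Compass), the agent must localize from its own observations. A minimal sketch of the dead-reckoning idea: per-step egomotion estimates, which in practice would come from a learned visual-odometry model over consecutive frames, are integrated into a global pose. The deltas below are hard-coded stand-ins, not model outputs:

```python
import math

def integrate_pose(pose, delta):
    """Accumulate one egomotion estimate into a global pose.
    pose  = (x, y, heading) in the world frame;
    delta = (dx, dy, dheading) in the agent's previous frame."""
    x, y, th = pose
    dx, dy, dth = delta
    return (
        x + dx * math.cos(th) - dy * math.sin(th),
        y + dx * math.sin(th) + dy * math.cos(th),
        th + dth,
    )

pose = (0.0, 0.0, 0.0)
# fake visual-odometry outputs: two forward steps, one with a small turn
for delta in [(0.25, 0.0, 0.0), (0.25, 0.0, math.radians(10))]:
    pose = integrate_pose(pose, delta)
```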

ViNL: Visual Navigation and Locomotion Over Obstacles

1 code implementation • 26 Oct 2022 • Simar Kareer, Naoki Yokoyama, Dhruv Batra, Sehoon Ha, Joanne Truong

ViNL consists of: (1) a visual navigation policy that outputs linear and angular velocity commands that guide the robot to a goal coordinate in unfamiliar indoor environments; and (2) a visual locomotion policy that controls the robot's joints to avoid stepping on obstacles while following provided velocity commands.

Tasks: Navigate, Visual Navigation
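A schematic of the two-level loop the abstract describes, with the navigation policy producing velocity commands and the locomotion policy turning them into joint targets. Class names, signatures, and the placeholder outputs are hypothetical, not ViNL's actual interfaces:

```python
class NavigationPolicy:
    def act(self, depth_image, goal_xy):
        """Map an egocentric observation + goal to (linear, angular) velocity."""
        return 0.5, 0.1  # placeholder command

class LocomotionPolicy:
    def act(self, proprioception, terrain_image, cmd_vel):
        """Map robot state + commanded velocity to joint position targets."""
        return [0.0] * 12  # placeholder targets for a 12-joint quadruped

nav, loco = NavigationPolicy(), LocomotionPolicy()
for step in range(3):
    cmd_vel = nav.act(depth_image=None, goal_xy=(5.0, 2.0))
    joint_targets = loco.act(proprioception=None, terrain_image=None,
                             cmd_vel=cmd_vel)
    # joint_targets would be sent to the robot's motor controllers
```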

OVRL-V2: A simple state-of-art baseline for ImageNav and ObjectNav

no code implementations • 14 Mar 2023 • Karmesh Yadav, Arjun Majumdar, Ram Ramrakhya, Naoki Yokoyama, Alexei Baevski, Zsolt Kira, Oleksandr Maksymets, Dhruv Batra

We present a single neural network architecture composed of task-agnostic components (ViTs, convolutions, and LSTMs) that achieves state-of-the-art results on both the ImageNav ("go to location in <this picture>") and ObjectNav ("find a chair") tasks without any task-specific modules like object detection, segmentation, mapping, or planning.

Tasks: Object Detection (+3 more)
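A rough PyTorch sketch of the task-agnostic stack the abstract names (ViT-style patch features, convolutional compression, an LSTM, and an action head). All dimensions and layer choices are illustrative, not OVRL-V2's:

```python
import torch
import torch.nn as nn

class NavAgent(nn.Module):
    def __init__(self, vit_dim=768, hidden=512, num_actions=4):
        super().__init__()
        # stand-in for a pretrained ViT: anything producing a
        # (batch, tokens, vit_dim) grid of patch features
        self.visual_encoder = nn.Linear(3 * 16 * 16, vit_dim)
        self.compress = nn.Conv2d(vit_dim, 128, kernel_size=3, padding=1)
        self.rnn = nn.LSTM(128 * 14 * 14, hidden, batch_first=True)
        self.action_head = nn.Linear(hidden, num_actions)

    def forward(self, patches, rnn_state=None):
        # patches: (batch, 196, 3*16*16) flattened 16x16 RGB patches
        feats = self.visual_encoder(patches)                   # (B, 196, 768)
        grid = feats.transpose(1, 2).reshape(-1, 768, 14, 14)  # token grid
        x = self.compress(grid).flatten(1).unsqueeze(1)        # (B, 1, 128*196)
        out, rnn_state = self.rnn(x, rnn_state)
        return self.action_head(out[:, -1]), rnn_state

agent = NavAgent()
logits, state = agent(torch.rand(2, 196, 3 * 16 * 16))
```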

VLFM: Vision-Language Frontier Maps for Zero-Shot Semantic Navigation

no code implementations • 6 Dec 2023 • Naoki Yokoyama, Sehoon Ha, Dhruv Batra, Jiuguang Wang, Bernadette Bucher

Understanding how humans leverage semantic knowledge to navigate unfamiliar environments and decide where to explore next is pivotal for developing robots capable of human-like search behaviors.

Tasks: Language Modelling, Navigate
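At a high level, a vision-language frontier map suggests ranking exploration frontiers by how semantically promising the view toward each one looks for the target object. A hypothetical sketch; `vlm_similarity`, the dictionary-based views, and the selection rule are stand-ins, not the paper's actual pipeline:

```python
def vlm_similarity(view, text):
    """Stand-in for an off-the-shelf image-text scorer (e.g. a
    BLIP/CLIP-style similarity); here it just reads a fake score."""
    return view["fake_score"]

def choose_frontier(frontiers, views, target="a chair"):
    """frontiers: boundary points between explored and unexplored space;
    views: the RGB observation facing each frontier."""
    scores = [vlm_similarity(v, target) for v in views]
    best = max(range(len(frontiers)), key=lambda i: scores[i])
    return frontiers[best]

frontier = choose_frontier(
    frontiers=[(1.0, 2.0), (3.0, -1.0)],
    views=[{"fake_score": 0.2}, {"fake_score": 0.7}],
)  # -> (3.0, -1.0): head toward the more promising frontier
```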
