Robot Navigation

130 papers with code • 4 benchmarks • 14 datasets

The fundamental objective of mobile Robot Navigation is to reach a goal position without collision. The mobile robot must perceive obstacles and move safely in different working scenarios.

Source: Learning to Navigate from Simulation via Spatial and Semantic Information Synthesis with Noise Model Embedding
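The collision-free goal-seeking objective above can be illustrated with a classic potential-field controller: an attractive force pulls the robot toward the goal while repulsive forces push it away from nearby obstacles. A minimal generic sketch; the planner, gains, and scenario are illustrative and not taken from any of the papers below:

```python
import math

def potential_field_step(pos, goal, obstacles, step=0.1,
                         k_att=1.0, k_rep=0.5, influence=1.0):
    """One step of a classic potential-field planner: attraction toward
    the goal plus repulsion from obstacles inside the influence radius."""
    gx, gy = goal[0] - pos[0], goal[1] - pos[1]
    dist_goal = math.hypot(gx, gy)
    # Attractive force: unit vector pointing at the goal.
    fx, fy = k_att * gx / dist_goal, k_att * gy / dist_goal
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if d < influence:  # repulsion only acts inside the influence radius
            mag = k_rep * (1.0 / d - 1.0 / influence) / (d * d)
            fx += mag * dx
            fy += mag * dy
    norm = math.hypot(fx, fy)
    # Take a fixed-length step along the combined force direction.
    return (pos[0] + step * fx / norm, pos[1] + step * fy / norm)

# Drive toward (5, 0) while skirting an obstacle near the straight-line path.
pos, goal, obstacles = (0.0, 0.0), (5.0, 0.0), [(2.5, 0.05)]
for _ in range(400):
    if math.hypot(goal[0] - pos[0], goal[1] - pos[1]) < 0.2:
        break
    pos = potential_field_step(pos, goal, obstacles)
```

Potential fields are simple but can get trapped in local minima, which is one reason the learned and optimization-based planners listed below exist.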


Latest papers with no code

Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V

no code yet • 16 Apr 2024

Autonomous robot navigation and manipulation in open environments require reasoning and replanning with closed-loop feedback.

JRDB-PanoTrack: An Open-world Panoptic Segmentation and Tracking Robotic Dataset in Crowded Human Environments

no code yet • 2 Apr 2024

JRDB-PanoTrack includes (1) various data involving indoor and outdoor crowded scenes, as well as comprehensive 2D and 3D synchronized data modalities; (2) high-quality 2D spatial panoptic segmentation and temporal tracking annotations, with additional 3D label projections for further spatial understanding; (3) diverse object classes for closed- and open-world recognition benchmarks, with OSPA-based metrics for evaluation.

IVLMap: Instance-Aware Visual Language Grounding for Consumer Robot Navigation

no code yet • 28 Mar 2024

To address this challenge, we propose a new method, Instance-aware Visual Language Map (IVLMap), which empowers the robot with instance-level and attribute-level semantic mapping. The map is autonomously constructed by fusing RGBD video data collected by the robot agent with specially designed natural-language map indexing in the bird's-eye view.

Hierarchical Open-Vocabulary 3D Scene Graphs for Language-Grounded Robot Navigation

no code yet • 26 Mar 2024

Recent open-vocabulary robot mapping methods enrich dense geometric maps with pre-trained visual-language features.

SRLM: Human-in-Loop Interactive Social Robot Navigation with Large Language Model and Deep Reinforcement Learning

no code yet • 22 Mar 2024

An interactive social robotic assistant must provide services in complex and crowded spaces while adapting its behavior based on real-time human language commands or feedback.

NeuPAN: Direct Point Robot Navigation with End-to-End Model-based Learning

no code yet • 11 Mar 2024

Navigating a nonholonomic robot in a cluttered environment requires extremely accurate perception and locomotion for collision avoidance.
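For context on the nonholonomic constraint mentioned here: a differential-drive (unicycle) robot can only move along its current heading, never sideways. A minimal kinematics sketch, generic rather than NeuPAN's actual model:

```python
import math

def unicycle_step(x, y, theta, v, omega, dt):
    """Integrate unicycle kinematics for dt seconds. Linear velocity v acts
    only along heading theta (the nonholonomic constraint); omega turns."""
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + omega * dt)

# Rotating in place (v = 0) changes heading but not position ...
x, y, th = unicycle_step(0.0, 0.0, 0.0, v=0.0, omega=math.pi / 2, dt=1.0)
# ... so reaching a laterally offset goal requires turning, then driving.
x2, y2, th2 = unicycle_step(0.0, 0.0, 0.0, v=1.0, omega=0.0, dt=2.0)
```

Because lateral motion is impossible, tight clutter demands precise coupled planning of v and omega, which is the difficulty the paper targets.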

Single-image camera calibration with model-free distortion correction

no code yet • 2 Mar 2024

Camera calibration is a process of paramount importance in computer vision applications that require accurate quantitative measurements.
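For reference, the parametric model that "model-free" distortion correction avoids committing to is typically the pinhole camera with polynomial radial distortion. A sketch of that conventional model (the paper's own method differs):

```python
def project_point(X, Y, Z, fx, fy, cx, cy, k1=0.0, k2=0.0):
    """Project a 3D point (camera frame, Z forward) to pixel coordinates
    using the pinhole model with polynomial radial distortion (k1, k2)."""
    xn, yn = X / Z, Y / Z               # normalized image coordinates
    r2 = xn * xn + yn * yn
    d = 1.0 + k1 * r2 + k2 * r2 * r2    # radial distortion factor
    return fx * xn * d + cx, fy * yn * d + cy

# Without distortion, a point 1 m right and 2 m ahead lands right of center.
u, v = project_point(1.0, 0.0, 2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

Calibration estimates fx, fy, cx, cy, and the distortion terms by fitting such a model to observed points; a model-free approach instead corrects distortion without assuming the polynomial form.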

UniMODE: Unified Monocular 3D Object Detection

no code yet • 28 Feb 2024

To address these challenges, we build a detector based on the bird's-eye-view (BEV) detection paradigm, in which explicit feature projection helps resolve the geometry-learning ambiguity that arises when detectors are trained on data from multiple scenarios.
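The explicit-projection idea can be illustrated generically: given a pixel and an estimated depth, features are lifted into camera coordinates and binned into a bird's-eye-view grid. A simplified sketch under assumed intrinsics and grid parameters, not UniMODE's actual architecture:

```python
def image_to_bev_cell(u, v, depth, fx, fy, cx, cy,
                      cell_size=0.5, bev_half_width=25.0):
    """Lift pixel (u, v) with known depth into camera coordinates and bin
    it into a BEV grid: x is lateral, z is forward."""
    x = (u - cx) * depth / fx           # lateral offset in meters
    y = (v - cy) * depth / fy           # height, discarded in the BEV view
    z = depth                           # forward distance in meters
    col = int((x + bev_half_width) / cell_size)
    row = int(z / cell_size)
    return row, col

# A pixel at the principal point, 10 m away, maps to the central BEV column.
cell = image_to_bev_cell(320.0, 240.0, 10.0,
                         fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

Binning geometry explicitly like this, rather than leaving the 2D-to-3D mapping implicit, is what makes the projection robust across camera setups.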

BioDrone: A Bionic Drone-based Single Object Tracking Benchmark for Robust Vision

no code yet • 7 Feb 2024

These challenges are especially pronounced in videos captured by unmanned aerial vehicles (UAVs), where the target is usually far from the camera and often exhibits significant motion relative to it.

Vision-Language Models Provide Promptable Representations for Reinforcement Learning

no code yet • 5 Feb 2024

We find that our policies trained on embeddings extracted from general-purpose VLMs outperform equivalent policies trained on generic, non-promptable image embeddings.