Search Results for author: Jiageng Mao

Found 16 papers, 10 papers with code

RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation

1 code implementation5 Jul 2024 Yuxuan Kuang, Junjie Ye, Haoran Geng, Jiageng Mao, Congyue Deng, Leonidas Guibas, He Wang, Yue Wang

First, RAM extracts unified affordance at scale from diverse sources of demonstrations including robotic data, human-object interaction (HOI) data, and custom data to construct a comprehensive affordance memory.

Human-Object Interaction Detection Retrieval

DeTra: A Unified Model for Object Detection and Trajectory Forecasting

no code implementations6 Jun 2024 Sergio Casas, Ben Agro, Jiageng Mao, Thomas Gilles, Alexander Cui, Thomas Li, Raquel Urtasun

The tasks of object detection and trajectory forecasting play a crucial role in understanding the scene for autonomous driving.

Autonomous Driving object-detection +2

Driving Everywhere with Large Language Model Policy Adaptation

no code implementations CVPR 2024 Boyi Li, Yue Wang, Jiageng Mao, Boris Ivanovic, Sushant Veer, Karen Leung, Marco Pavone

Adapting driving behavior to new environments, customs, and laws is a long-standing problem in autonomous driving, precluding the widespread deployment of autonomous vehicles (AVs).

Autonomous Driving Language Modelling +2

GPT-Driver: Learning to Drive with GPT

1 code implementation2 Oct 2023 Jiageng Mao, Yuxi Qian, Junjie Ye, Hang Zhao, Yue Wang

In this paper, we propose a novel approach to motion planning that capitalizes on the strong reasoning capabilities and generalization potential inherent to Large Language Models (LLMs).

Autonomous Driving Decision Making +2

CLIP2: Contrastive Language-Image-Point Pretraining From Real-World Point Cloud Data

no code implementations CVPR 2023 Yihan Zeng, Chenhan Jiang, Jiageng Mao, Jianhua Han, Chaoqiang Ye, Qingqiu Huang, Dit-yan Yeung, Zhen Yang, Xiaodan Liang, Hang Xu

Contrastive Language-Image Pre-training, benefiting from large-scale unlabeled text-image pairs, has demonstrated great performance in open-world vision understanding tasks.

3D geometry

3D Object Detection for Autonomous Driving: A Comprehensive Survey

1 code implementation19 Jun 2022 Jiageng Mao, Shaoshuai Shi, Xiaogang Wang, Hongsheng Li

Autonomous driving, in recent years, has been receiving increasing attention for its potential to relieve drivers' burdens and improve the safety of driving.

3D Object Detection Autonomous Driving +1

Point2Seq: Detecting 3D Objects as Sequences

1 code implementation CVPR 2022 Yujing Xue, Jiageng Mao, Minzhe Niu, Hang Xu, Michael Bi Mi, Wei zhang, Xiaogang Wang, Xinchao Wang

We further propose a lightweight scene-to-sequence decoder that can auto-regressively generate words conditioned on features from a 3D scene as well as cues from the preceding words.

3D Object Detection Decoder +2

Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection

1 code implementation ICCV 2021 Jiageng Mao, Minzhe Niu, Haoyue Bai, Xiaodan Liang, Hang Xu, Chunjing Xu

To resolve the problems, we propose a novel second-stage module, named pyramid RoI head, to adaptively learn the features from the sparse points of interest.

3D Object Detection object-detection

Voxel Transformer for 3D Object Detection

1 code implementation ICCV 2021 Jiageng Mao, Yujing Xue, Minzhe Niu, Haoyue Bai, Jiashi Feng, Xiaodan Liang, Hang Xu, Chunjing Xu

We present Voxel Transformer (VoTr), a novel and effective voxel-based Transformer backbone for 3D object detection from point clouds.

Ranked #3 on 3D Object Detection on waymo vehicle (L1 mAP metric)

3D Object Detection Computational Efficiency +3

SODA10M: A Large-Scale 2D Self/Semi-Supervised Object Detection Dataset for Autonomous Driving

no code implementations21 Jun 2021 Jianhua Han, Xiwen Liang, Hang Xu, Kai Chen, Lanqing Hong, Jiageng Mao, Chaoqiang Ye, Wei zhang, Zhenguo Li, Xiaodan Liang, Chunjing Xu

Experiments show that SODA10M can serve as a promising pre-training dataset for different self-supervised learning methods, which gives superior performance when fine-tuning with different downstream tasks (i. e., detection, semantic/instance segmentation) in autonomous driving domain.

Autonomous Driving Instance Segmentation +5

One Million Scenes for Autonomous Driving: ONCE Dataset

1 code implementation21 Jun 2021 Jiageng Mao, Minzhe Niu, Chenhan Jiang, Hanxue Liang, Jingheng Chen, Xiaodan Liang, Yamin Li, Chaoqiang Ye, Wei zhang, Zhenguo Li, Jie Yu, Hang Xu, Chunjing Xu

To facilitate future research on exploiting unlabeled data for 3D detection, we additionally provide a benchmark in which we reproduce and evaluate a variety of self-supervised and semi-supervised methods on the ONCE dataset.

3D Object Detection Autonomous Driving +1

A Survey on Deep Learning-based Single Image Crowd Counting: Network Design, Loss Function and Supervisory Signal

1 code implementation31 Dec 2020 Haoyue Bai, Jiageng Mao, S. -H. Gary Chan

Single image crowd counting is a challenging computer vision problem with wide applications in public safety, city planning, traffic management, etc.

Crowd Counting Management

GRNet: Gridding Residual Network for Dense Point Cloud Completion

1 code implementation ECCV 2020 Haozhe Xie, Hongxun Yao, Shangchen Zhou, Jiageng Mao, Shengping Zhang, Wenxiu Sun

In particular, we devise two novel differentiable layers, named Gridding and Gridding Reverse, to convert between point clouds and 3D grids without losing structural information.

Point Cloud Completion

Cannot find the paper you are looking for? You can Submit a new open access paper.