1 code implementation • 5 Jul 2024 • Yuxuan Kuang, Junjie Ye, Haoran Geng, Jiageng Mao, Congyue Deng, Leonidas Guibas, He Wang, Yue Wang
First, RAM extracts unified affordance at scale from diverse sources of demonstrations including robotic data, human-object interaction (HOI) data, and custom data to construct a comprehensive affordance memory.
no code implementations • 6 Jun 2024 • Sergio Casas, Ben Agro, Jiageng Mao, Thomas Gilles, Alexander Cui, Thomas Li, Raquel Urtasun
The tasks of object detection and trajectory forecasting play a crucial role in understanding the scene for autonomous driving.
no code implementations • CVPR 2024 • Boyi Li, Yue Wang, Jiageng Mao, Boris Ivanovic, Sushant Veer, Karen Leung, Marco Pavone
Adapting driving behavior to new environments, customs, and laws is a long-standing problem in autonomous driving, precluding the widespread deployment of autonomous vehicles (AVs).
1 code implementation • 17 Nov 2023 • Jiageng Mao, Junjie Ye, Yuxi Qian, Marco Pavone, Yue Wang
Human-level driving is an ultimate goal of autonomous driving.
1 code implementation • 2 Oct 2023 • Jiageng Mao, Yuxi Qian, Junjie Ye, Hang Zhao, Yue Wang
In this paper, we propose a novel approach to motion planning that capitalizes on the strong reasoning capabilities and generalization potential inherent to Large Language Models (LLMs).
Ranked #1 on Motion Planning on nuScenes
no code implementations • 22 Mar 2023 • Yihan Zeng, Chenhan Jiang, Jiageng Mao, Jianhua Han, Chaoqiang Ye, Qingqiu Huang, Dit-Yan Yeung, Zhen Yang, Xiaodan Liang, Hang Xu
Contrastive Language-Image Pre-training, benefiting from large-scale unlabeled text-image pairs, has demonstrated great performance in open-world vision understanding tasks.
Ranked #3 on Zero-shot 3D Point Cloud Classification on ScanNetV2
no code implementations • CVPR 2023 • Yihan Zeng, Chenhan Jiang, Jiageng Mao, Jianhua Han, Chaoqiang Ye, Qingqiu Huang, Dit-Yan Yeung, Zhen Yang, Xiaodan Liang, Hang Xu
Contrastive Language-Image Pre-training, benefiting from large-scale unlabeled text-image pairs, has demonstrated great performance in open-world vision understanding tasks.
1 code implementation • 19 Jun 2022 • Jiageng Mao, Shaoshuai Shi, Xiaogang Wang, Hongsheng Li
Autonomous driving, in recent years, has been receiving increasing attention for its potential to relieve drivers' burdens and improve the safety of driving.
1 code implementation • CVPR 2022 • Yujing Xue, Jiageng Mao, Minzhe Niu, Hang Xu, Michael Bi Mi, Wei Zhang, Xiaogang Wang, Xinchao Wang
We further propose a lightweight scene-to-sequence decoder that can auto-regressively generate words conditioned on features from a 3D scene as well as cues from the preceding words.
1 code implementation • ICCV 2021 • Jiageng Mao, Minzhe Niu, Haoyue Bai, Xiaodan Liang, Hang Xu, Chunjing Xu
To resolve the problems, we propose a novel second-stage module, named pyramid RoI head, to adaptively learn the features from the sparse points of interest.
Ranked #2 on 3D Object Detection on Waymo vehicle (AP metric)
1 code implementation • ICCV 2021 • Jiageng Mao, Yujing Xue, Minzhe Niu, Haoyue Bai, Jiashi Feng, Xiaodan Liang, Hang Xu, Chunjing Xu
We present Voxel Transformer (VoTr), a novel and effective voxel-based Transformer backbone for 3D object detection from point clouds.
Ranked #3 on 3D Object Detection on Waymo vehicle (L1 mAP metric)
no code implementations • 21 Jun 2021 • Jianhua Han, Xiwen Liang, Hang Xu, Kai Chen, Lanqing Hong, Jiageng Mao, Chaoqiang Ye, Wei Zhang, Zhenguo Li, Xiaodan Liang, Chunjing Xu
Experiments show that SODA10M can serve as a promising pre-training dataset for different self-supervised learning methods, which gives superior performance when fine-tuning on different downstream tasks (i.e., detection, semantic/instance segmentation) in the autonomous driving domain.
1 code implementation • 21 Jun 2021 • Jiageng Mao, Minzhe Niu, Chenhan Jiang, Hanxue Liang, Jingheng Chen, Xiaodan Liang, Yamin Li, Chaoqiang Ye, Wei Zhang, Zhenguo Li, Jie Yu, Hang Xu, Chunjing Xu
To facilitate future research on exploiting unlabeled data for 3D detection, we additionally provide a benchmark in which we reproduce and evaluate a variety of self-supervised and semi-supervised methods on the ONCE dataset.
1 code implementation • 31 Dec 2020 • Haoyue Bai, Jiageng Mao, S.-H. Gary Chan
Single image crowd counting is a challenging computer vision problem with wide applications in public safety, city planning, traffic management, etc.
1 code implementation • ECCV 2020 • Haozhe Xie, Hongxun Yao, Shangchen Zhou, Jiageng Mao, Shengping Zhang, Wenxiu Sun
In particular, we devise two novel differentiable layers, named Gridding and Gridding Reverse, to convert between point clouds and 3D grids without losing structural information.
Ranked #3 on Point Cloud Completion on Completion3D
no code implementations • ICCV 2019 • Jiageng Mao, Xiaogang Wang, Hongsheng Li
Our InterpConv is shown to be permutation and sparsity invariant, and can directly handle irregular inputs.
Ranked #28 on 3D Part Segmentation on ShapeNet-Part