no code implementations • 8 Jan 2025 • Zhong Wang, Lele Ren, Yue Wen, Hesheng Wang
Recent advancements in LiDAR-Inertial Odometry (LIO) have enabled a large number of applications.
no code implementations • 2 Jan 2025 • Jiwei Shan, Zeyu Cai, Cheng-Tai Hsieh, Shing Shin Cheng, Hesheng Wang
To address these challenges, we introduce EH-SurGS, an efficient and high-fidelity reconstruction algorithm for deformable surgical scenes.
no code implementations • 31 Dec 2024 • Cheng Yuan, Jian Jiang, Kunyi Yang, Lv Wu, Rui Wang, Zi Meng, Haonan Ping, Ziyu Xu, Yifan Zhou, Wanli Song, Hesheng Wang, Qi Dou, Yutong Ban
Surgery video segmentation is an important topic in the surgical AI field.
no code implementations • 30 Nov 2024 • Yiyuan Pan, Yunzhe Xu, Zhe Liu, Hesheng Wang
This system enables agents to maintain and expand their memory through both imaginative mechanisms and navigation actions.
no code implementations • 24 Nov 2024 • Haoang Li, Xiangqi Meng, Xingxing Zuo, Zhe Liu, Hesheng Wang, Daniel Cremers
Our method is composed of three main modules to 1) map the dynamic foreground including non-rigid humans and rigid items, 2) reconstruct the static background, and 3) localize the camera.
no code implementations • 22 Nov 2024 • Linrui Gong, Jiuming Liu, Junyi Ma, Lihao Liu, Yaonan Wang, Hesheng Wang
To address this issue, we propose a novel framework named EADReg for efficient and robust registration of LiDAR point clouds based on autoregressive diffusion models.
no code implementations • 4 Sep 2024 • Junyi Ma, Xieyuanli Chen, Wentao Bao, Jingyi Xu, Hesheng Wang
Understanding human intentions and actions through egocentric videos is important on the path to embodied artificial intelligence.
1 code implementation • 1 Sep 2024 • Huixin Zhang, Guangming Wang, Xinrui Wu, Chenfeng Xu, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan, Hesheng Wang
It consists of a pyramid structure with a spatial information reuse strategy, a sequential pose initialization module, a gated hierarchical pose refinement module, and a temporal feature propagation module.
1 code implementation • 20 Aug 2024 • Yunzhe Xu, Yiyuan Pan, Zhe Liu, Hesheng Wang
Large Language Models (LLMs) have demonstrated potential in Vision-and-Language Navigation (VLN) tasks, yet current applications face challenges.
Ranked #1 on Vision and Language Navigation on map2seq
1 code implementation • 26 May 2024 • Tianchen Deng, Yi Zhou, Wenhua Wu, Mingrui Li, Jingwei Huang, Shuhong Liu, Yanzeng Song, Hao Zuo, Yanbo Wang, Yutao Yue, Hesheng Wang, Weidong Chen
Leveraging this information, we propose a multi-modal UAV detection, classification, and 3D tracking method for accurate UAV classification and tracking.
no code implementations • 23 May 2024 • Jiuming Liu, Jinru Han, Lihao Liu, Angelica I. Aviles-Rivero, Chaokang Jiang, Zhe Liu, Hesheng Wang
Point cloud videos can faithfully capture real-world spatial geometries and temporal dynamics, which are essential for enabling intelligent agents to understand the dynamically changing world.
1 code implementation • 23 May 2024 • Zhiheng Feng, Wenhua Wu, Tianchen Deng, Hesheng Wang
In addition, because the road surface has no thickness, a 2D Gaussian surfel is more consistent with the physical reality of the road surface than a 3D Gaussian sphere.
1 code implementation • 7 May 2024 • Junyi Ma, Jingyi Xu, Xieyuanli Chen, Hesheng Wang
Understanding how humans would behave during hand-object interaction is vital for applications in service robot manipulation and extended reality.
no code implementations • 2 May 2024 • Guangming Wang, Lei Pan, Songyou Peng, Shaohui Liu, Chenfeng Xu, Yanzi Miao, Wei Zhan, Masayoshi Tomizuka, Marc Pollefeys, Hesheng Wang
Meticulous 3D environment representations have been a longstanding goal in computer vision and robotics fields.
1 code implementation • 29 Mar 2024 • Tianchen Deng, Yanbo Wang, Hongle Xie, Hesheng Wang, Jingchuan Wang, Danwei Wang, Weidong Chen
Second, the occupancy scene representation is replaced with Signed Distance Field (SDF) hierarchical scene representation for high-quality reconstruction and view synthesis.
1 code implementation • 27 Mar 2024 • Jiuming Liu, Dong Zhuo, Zhiheng Feng, Siting Zhu, Chensheng Peng, Zhe Liu, Hesheng Wang
Image pixels are pre-organized as pseudo points for image-to-point structure alignment.
no code implementations • 18 Mar 2024 • Wenhua Wu, Guangming Wang, Ting Deng, Sebastian Aegidius, Stuart Shanks, Valerio Modugno, Dimitrios Kanoulas, Hesheng Wang
Recent research on Simultaneous Localization and Mapping (SLAM) based on implicit representation has shown promising results in indoor environments.
no code implementations • 18 Mar 2024 • Wenhua Wu, Qi Wang, Guangming Wang, JunPing Wang, Tiankun Zhao, Yang Liu, Dongchao Gao, Zhe Liu, Hesheng Wang
To address this, we propose EMIE-MAP, a novel method for large-scale road surface reconstruction based on explicit mesh and implicit encoding.
1 code implementation • 17 Mar 2024 • Tianchen Deng, Yaohui Chen, Leyan Zhang, Jianfei Yang, Shenghai Yuan, Jiuming Liu, Danwei Wang, Hesheng Wang, Weidong Chen
Recent work has shown that 3D Gaussian-based SLAM enables high-quality reconstruction, accurate pose estimation, and real-time rendering of scenes.
1 code implementation • 12 Mar 2024 • Siting Zhu, Renjie Qin, Guangming Wang, Jiuming Liu, Hesheng Wang
We propose SemGauss-SLAM, a dense semantic SLAM system utilizing 3D Gaussian representation, that enables accurate 3D semantic mapping, robust camera tracking, and high-quality rendering simultaneously.
1 code implementation • 11 Mar 2024 • Jiuming Liu, Ruiji Yu, Yian Wang, Yu Zheng, Tianchen Deng, Weicai Ye, Hesheng Wang
In this paper, we propose a novel SSM-based point cloud processing backbone, named Point Mamba, with a causality-aware ordering mechanism.
1 code implementation • CVPR 2024 • Chaokang Jiang, Guangming Wang, Jiuming Liu, Hesheng Wang, Zhuang Ma, Zhenqiang Liu, Zhujin Liang, Yi Shan, Dalong Du
We present a novel approach from the perspective of auto-labelling, aiming to generate a large number of 3D scene flow pseudo labels for real-world LiDAR point clouds.
1 code implementation • CVPR 2024 • Jiuming Liu, Guangming Wang, Weicai Ye, Chaokang Jiang, Jinru Han, Zhe Liu, Guofeng Zhang, Dalong Du, Hesheng Wang
Furthermore, we also develop an uncertainty estimation module within diffusion to evaluate the reliability of estimated scene flow.
1 code implementation • CVPR 2024 • Junyi Ma, Xieyuanli Chen, Jiawei Huang, Jingyi Xu, Zhen Luo, Jintao Xu, Weihao Gu, Rui Ai, Hesheng Wang
Furthermore, the standardized evaluation protocol for preset multiple tasks is also provided to compare the performance of all the proposed baselines on present and future occupancy estimation with respect to objects of interest in autonomous driving scenarios.
1 code implementation • 29 Nov 2023 • Yu Zheng, Guangming Wang, Jiuming Liu, Marc Pollefeys, Hesheng Wang
Through the hash-based representation, we propose the Spherical Frustum sparse Convolution (SFC) and Frustum Fast Point Sampling (F2PS) to convolve and sample the points stored in spherical frustums respectively.
1 code implementation • 29 Nov 2023 • Jiuming Liu, Guangming Wang, Weicai Ye, Chaokang Jiang, Jinru Han, Zhe Liu, Guofeng Zhang, Dalong Du, Hesheng Wang
Furthermore, we also develop an uncertainty estimation module within diffusion to evaluate the reliability of estimated scene flow.
1 code implementation • ICCV 2023 • Chang Nie, Guangming Wang, Zhe Liu, Luca Cavalli, Marc Pollefeys, Hesheng Wang
Therefore, RLSAC can avoid differentiating to learn the features and the feedback of downstream tasks for end-to-end robust estimation.
1 code implementation • ICCV 2023 • Chensheng Peng, Guangming Wang, Xian Wan Lo, Xinrui Wu, Chenfeng Xu, Masayoshi Tomizuka, Wei Zhan, Hesheng Wang
Previous methods rarely predict scene flow from the entire point clouds of the scene with one-time inference due to the memory inefficiency and heavy overhead from distance calculation and sorting involved in commonly used farthest point sampling, KNN, and ball query algorithms for local feature aggregation.
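For readers unfamiliar with farthest point sampling, which the excerpt above names as one source of overhead, the following is a minimal pure-Python sketch of the textbook greedy algorithm. It is not taken from the paper; the function name `farthest_point_sampling` and its parameters are illustrative assumptions. Each of the `k` rounds scans all `n` points, which shows where the distance-calculation cost comes from.

```python
import math


def farthest_point_sampling(points, k, start=0):
    """Greedy FPS: return k indices, each new one farthest from those already chosen.

    Illustrative sketch only (hypothetical helper, not the paper's code).
    Cost is O(n * k) distance evaluations for n points.
    """
    n = len(points)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and len(points)")
    chosen = [start]
    # min_dist[i] holds the distance from point i to its nearest chosen point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < k:
        # Pick the point whose nearest chosen neighbour is farthest away.
        nxt = max(range(n), key=min_dist.__getitem__)
        chosen.append(nxt)
        # Refresh nearest-chosen distances against the newly added point.
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return chosen


if __name__ == "__main__":
    cloud = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (0, 5, 0)]
    print(farthest_point_sampling(cloud, 3))
```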
1 code implementation • 27 Jul 2023 • Lingdong Kong, Yaru Niu, Shaoyuan Xie, Hanjiang Hu, Lai Xing Ng, Benoit R. Cottereau, Liangjun Zhang, Hesheng Wang, Wei Tsang Ooi, Ruijie Zhu, Ziyang Song, Li Liu, Tianzhu Zhang, Jun Yu, Mohan Jing, Pengwei Li, Xiaohua Qi, Cheng Jin, Yingfeng Chen, Jie Hou, Jie Zhang, Zhen Kan, Qiang Ling, Liang Peng, Minglei Li, Di Xu, Changpeng Yang, Yuanqi Yao, Gang Wu, Jian Kuai, Xianming Liu, Junjun Jiang, Jiamian Huang, Baojun Li, Jiale Chen, Shuang Zhang, Sun Ao, Zhenyu Li, Runze Chen, Haiyong Luo, Fang Zhao, Jingze Yu
In this paper, we summarize the winning solutions from the RoboDepth Challenge -- an academic competition designed to facilitate and advance robust OoD depth estimation.
no code implementations • 20 Jun 2023 • Guangming Wang, Yu Zheng, Yanfeng Guo, Zhe Liu, Yixiang Zhu, Wolfram Burgard, Hesheng Wang
A popular approach to robot localization is based on image-to-point cloud registration, which combines illumination-invariant LiDAR-based mapping with economical image-based localization.
1 code implementation • ICCV 2023 • Jiuming Liu, Guangming Wang, Zhe Liu, Chaokang Jiang, Marc Pollefeys, Hesheng Wang
Specifically, a projection-aware hierarchical transformer is proposed to capture long-range dependencies and filter outliers by extracting point features globally.
no code implementations • 27 Sep 2022 • Chaokang Jiang, Guangming Wang, Yanzi Miao, Hesheng Wang
The proposed method of self-supervised learning of 3D scene flow on real-world images is compared with a variety of methods for learning on the synthesized dataset and learning on LiDAR point clouds.
no code implementations • 15 Sep 2022 • Chaokang Jiang, Guangming Wang, Jinxing Wu, Yanzi Miao, Hesheng Wang
Promising complementarity exists between the texture features of color images and the geometric information of LiDAR point clouds.
no code implementations • 11 Sep 2022 • Guangming Wang, Zhiheng Feng, Chaokang Jiang, Hesheng Wang
Unlike the previous unsupervised learning of scene flow in point clouds, we propose to use odometry information to assist the unsupervised learning of scene flow and use real-world LiDAR data to train our network.
no code implementations • 4 Sep 2022 • Huiying Deng, Guangming Wang, Zhiheng Feng, Chaokang Jiang, Xinrui Wu, Yanzi Miao, Hesheng Wang
In order to make full use of the rich point cloud information provided by the pseudo-LiDAR, a projection-aware dense odometry pipeline is adopted.
1 code implementation • 19 Jul 2022 • Guangming Wang, Yunzhe Hu, Zhe Liu, Yiyang Zhou, Masayoshi Tomizuka, Wei Zhan, Hesheng Wang
Our proposed model surpasses all existing methods by at least 38.2% on the FlyingThings3D dataset and 24.7% on the KITTI Scene Flow dataset for the EPE3D metric.
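EPE3D (3D end-point error) is commonly defined as the mean Euclidean distance between predicted and ground-truth 3D flow vectors. The sketch below illustrates that standard definition in pure Python; the function name `epe3d` and the toy inputs are assumptions for illustration, not the paper's evaluation code.

```python
import math


def epe3d(pred_flow, gt_flow):
    """Mean Euclidean distance between predicted and ground-truth 3D flow vectors.

    Hypothetical helper illustrating the usual EPE3D definition.
    """
    if not pred_flow or len(pred_flow) != len(gt_flow):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(math.dist(p, g) for p, g in zip(pred_flow, gt_flow))
    return total / len(pred_flow)


if __name__ == "__main__":
    pred = [(0.0, 0.0, 0.0), (1.0, 2.0, 2.0)]
    gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    print(epe3d(pred, gt))
```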
1 code implementation • 8 Jun 2022 • Guangming Wang, Xiaoyu Tian, Ruiqi Ding, Hesheng Wang
Unsupervised learning of scene flow in this paper mainly consists of two parts: (i) depth estimation and camera pose estimation, and (ii) scene flow estimation based on four different loss functions.
no code implementations • 30 Mar 2022 • Guangming Wang, Chensheng Peng, Jinpeng Zhang, Hesheng Wang
Specifically, through multi-scale interactive query and fusion between pixel-level and point-level features, our method can obtain more distinguishing features to improve the performance of multiple object tracking.
no code implementations • 4 Mar 2022 • Yueling Shen, Guangming Wang, Hesheng Wang
We propose a 3D MOT framework based on simultaneous optimization of object detection and scene flow estimation.
no code implementations • 6 Dec 2021 • Guangming Wang, Jiquan Zhong, Shijie Zhao, Wenhua Wu, Zhe Liu, Hesheng Wang
In this framework, the depth and pose estimations are hierarchically and mutually coupled to refine the estimated pose layer by layer.
1 code implementation • 3 Nov 2021 • Guangming Wang, Xinrui Wu, Shuyang Jiang, Zhe Liu, Hesheng Wang
An efficient 3D point cloud learning architecture, named EfficientLO-Net, for LiDAR odometry is first proposed in this paper.
no code implementations • 10 Sep 2021 • Guangming Wang, Yunzhe Hu, Xinrui Wu, Hesheng Wang
To solve the first problem, a novel context-aware set convolution layer is proposed in this paper to exploit contextual structure information of Euclidean space and learn soft aggregation weights for local point features.
no code implementations • 8 Jul 2021 • Guangming Wang, Shuaiqi Ren, Hesheng Wang
Then, two novel loss functions are proposed for the unsupervised learning of optical flow based on the geometric laws of non-occlusion.
no code implementations • 28 Jun 2021 • Guangming Wang, Honghao Zeng, Ziliang Wang, Zhe Liu, Hesheng Wang
Ablation studies demonstrate the effectiveness of the proposed inter-frame projection consistency constraints and intra-frame loop constraints.
Ranked #51 on 3D Human Pose Estimation on Human3.6M
1 code implementation • 1 Apr 2021 • Guangming Wang, Hesheng Wang, Yiling Liu, Weidong Chen
A new unsupervised learning method of depth and ego-motion using multiple masks from monocular video is proposed in this paper.
no code implementations • 20 Dec 2020 • Guangming Wang, Muyao Chen, Hanwen Liu, Yehui Yang, Zhe Liu, Hesheng Wang
Then, anchor-based 3D convolution is adopted to aggregate these anchors' features to the core points.
1 code implementation • 9 Dec 2020 • Zhijian Qiao, Hanjiang Hu, Weiang Shi, Siyuan Chen, Zhe Liu, Hesheng Wang
In the field of large-scale SLAM for autonomous driving and mobile robotics, 3D point cloud based place recognition has aroused significant research interest due to its robustness to changing environments with drastic daytime and weather variance.
1 code implementation • CVPR 2021 • Guangming Wang, Xinrui Wu, Zhe Liu, Hesheng Wang
A novel 3D point cloud learning model for deep LiDAR odometry, named PWCLO-Net, using hierarchical embedding mask optimization is proposed in this paper.
1 code implementation • 30 Nov 2020 • Zhijian Qiao, Huanshu Wei, Zhe Liu, Chuanzhe Suo, Hesheng Wang
3D point cloud registration is still a very challenging topic due to the difficulty in finding the rigid transformation between two point clouds with partial correspondences, and it's even harder in the absence of any initial estimation information.
no code implementations • 27 Nov 2020 • Guangming Wang, Yehui Yang, Huixin Zhang, Zhe Liu, Hesheng Wang
In this paper, a spherical interpolated convolution operator is proposed to replace the traditional grid-shaped 3D convolution operator.
no code implementations • 24 Nov 2020 • Guangming Wang, Minjian Xin, Wenhua Wu, Zhe Liu, Hesheng Wang
Deep Reinforcement Learning (DRL) enables robots to perform some intelligent tasks end-to-end.
1 code implementation • 9 Nov 2020 • Hanjiang Hu, Baoquan Yang, Zhijian Qiao, Shiqi Liu, Jiacheng Zhu, Zuxin Liu, Wenhao Ding, Ding Zhao, Hesheng Wang
Different environments pose a great challenge to the outdoor robust visual perception for long-term autonomous driving, and the generalization of learning-based algorithms on different environments is still an open problem.
no code implementations • 12 Oct 2020 • Guangming Wang, Xinrui Wu, Zhe Liu, Hesheng Wang
In this paper, a novel hierarchical neural network with double attention is proposed for learning the correlation of point features in adjacent frames and refining scene flow from coarse to fine layer by layer.
1 code implementation • 1 Oct 2020 • Hanjiang Hu, Zhijian Qiao, Ming Cheng, Zhe Liu, Hesheng Wang
Long-term visual localization under changing environments is a challenging problem in autonomous driving and mobile robotics due to season, illumination variance, etc.
1 code implementation • 16 Sep 2020 • Hanjiang Hu, Hesheng Wang, Zhe Liu, Weidong Chen
Visual localization is a crucial component in mobile robot and autonomous driving applications.
1 code implementation • arXiv 2020 • Guangming Wang, Chi Zhang, Hesheng Wang, Jingchuan Wang, Yong Wang, Xinlei Wang
In the occluded region, as depth and camera motion can provide more reliable motion estimation, they can be used to instruct unsupervised learning of optical flow.
1 code implementation • 23 Sep 2019 • Hanjiang Hu, Hesheng Wang, Zhe Liu, Chenguang Yang, Weidong Chen, Le Xie
To retrieve a target image from the database, the query image is first encoded using the encoder belonging to the query domain to obtain a domain-invariant feature vector.
no code implementations • 10 Sep 2019 • Shuo Yang, Wei Zhang, Weizhi Lu, Hesheng Wang, Yibin Li
However, general video captioning methods focus more on understanding the full frame and give little consideration to the specific objects of interest in robotic manipulation.
no code implementations • 30 Apr 2019 • Zhe Liu, Chuanzhe Suo, Shunbo Zhou, Huanshu Wei, Yingtian Liu, Hesheng Wang, Yun-hui Liu
Place recognition and loop-closure detection are the main challenges in the localization, mapping and navigation tasks of self-driving vehicles.
2 code implementations • ICCV 2019 • Zhe Liu, Shunbo Zhou, Chuanzhe Suo, Yingtian Liu, Peng Yin, Hesheng Wang, Yun-hui Liu
Point cloud based place recognition is still an open issue due to the difficulty in extracting local features from the raw 3D point cloud and generating the global descriptor, and it's even harder in large-scale dynamic environments.
Ranked #5 on 3D Place Recognition on CS-Campus3D