no code implementations • 15 May 2025 • Zhe Xin, Chenyang Wu, Penghui Huang, Yanyong Zhang, Yinian Mao, Guoquan Huang
In tracking, we introduce feature-alignment warping constraints to alleviate the adverse effects of appearance similarity in rendering losses.
no code implementations • 10 May 2025 • Xingchen Li, Lidian Wang, Yu Sheng, Zhipeng Tang, Haojie Ren, Guoliang You, Yifan Duan, Jianmin Ji, Yanyong Zhang
To address this challenge, we present ElectricSight, a system designed for 3D distance measurement and monitoring of potential hazards to power transmission lines.
no code implementations • 20 Apr 2025 • YuTing Huang, Leilei Ding, Zhipeng Tang, Tianfu Wang, Xinrui Lin, Wuyang Zhang, Mingxiao Ma, Yanyong Zhang
Large Language Models (LLMs) exhibit substantial promise in enhancing task-planning capabilities within embodied agents due to their advanced reasoning and comprehension.
no code implementations • 17 Apr 2025 • Shumin Wang, Zhuoran Yang, Lidian Wang, Zhipeng Tang, Heng Li, Lehan Pan, Sha Zhang, Jie Peng, Jianmin Ji, Yanyong Zhang
The significant achievements of pre-trained models leveraging large volumes of data in the field of NLP and 2D vision inspire us to explore the potential of extensive data pre-training for 3D perception in autonomous driving.
1 code implementation • 9 Apr 2025 • JunRui Zhang, Chenjie Wang, Jie Peng, Haoyu Li, Jianmin Ji, Yu Zhang, Yanyong Zhang
However, open-loop training on the nuPlan dataset tends to cause causal confusion during closed-loop testing, and the dataset also presents a long-tail distribution of scenarios.
no code implementations • 20 Mar 2025 • Xiaomeng Chu, Jiajun Deng, Guoliang You, Wei Liu, Xingchen Li, Jianmin Ji, Yanyong Zhang
In this work, we propose GraspCoT, a 6-DoF grasp detection framework that integrates a Chain-of-Thought (CoT) reasoning mechanism oriented to physical properties, guided by auxiliary question-answering (QA) tasks.
no code implementations • 20 Feb 2025 • Yedong Shen, Xinran Zhang, Yifan Duan, Shiqi Zhang, Heng Li, Yilong Wu, Jianmin Ji, Yanyong Zhang
Accurate and realistic 3D scene reconstruction enables the lifelike creation of autonomous driving simulation environments.
no code implementations • 17 Dec 2024 • Xiaomeng Chu, Jiajun Deng, Guoliang You, Yifan Duan, Houqiang Li, Yanyong Zhang
For the latter, we initially incorporate a radar-guided depth head to refine the transformation from image view to BEV.
no code implementations • 4 Nov 2024 • Xinran Zhang, Hanqi Zhu, Yifan Duan, Wuyang Zhang, Longfei Shangguan, Yu Zhang, Jianmin Ji, Yanyong Zhang
We realized this approach by developing Map++, an efficient system that functions as a plug-and-play extension, supporting participatory map-building based on existing SLAM algorithms.
no code implementations • 17 Oct 2024 • Jie Peng, Zhang Cao, Huaizhi Qu, Zhengyu Zhang, Chang Guo, Yanyong Zhang, Zhichao Cao, Tianlong Chen
To enhance communication efficiency, M2Cache maintains a neuron-level mixed-precision LRU cache in HBM, a larger layer-aware cache in DRAM, and a full model in SSD.
no code implementations • 21 Sep 2024 • Guoliang You, Xiaomeng Chu, Yifan Duan, Xingchen Li, Sha Zhang, Jianmin Ji, Yanyong Zhang
For performance, the lane-level cross-modal query integration and feature enhancement module uses confidence score from ROI to combine low-confidence image queries with LiDAR queries, extracting complementary depth features.
1 code implementation • 19 Aug 2024 • Heng Li, Yuenan Hou, Xiaohan Xing, Xiao Sun, Yanyong Zhang
Inspired by the global modeling and linear computation complexity of the Mamba architecture, we present the first Mamba-based network for semantic occupancy prediction, termed OccMamba.
no code implementations • 23 Jul 2024 • Jiayu Lei, Xiaoman Zhang, Chaoyi Wu, Lisong Dai, Ya Zhang, Yanyong Zhang, Yanfeng Wang, Weidi Xie, Yuehua Li
To address these challenges, we initiate a series of work on grounded Automatic Report Generation (AutoRG), starting from the brain MRI interpretation system, which supports the delineation of brain structures, the localization of anomalies, and the generation of well-organized findings.
no code implementations • 20 Jul 2024 • Xiaomeng Chu, Jiajun Deng, Guoliang You, Yifan Duan, Yao Li, Yanyong Zhang
To extract unique object-level features that cater to distinct queries, we design a ray sampling method that suitably organizes the distribution of feature sampling points on both images and bird's eye view.
no code implementations • 16 Jul 2024 • Guoliang You, Xiaomeng Chu, Yifan Duan, Wenyu Zhang, Xingchen Li, Sha Zhang, Yao Li, Jianmin Ji, Yanyong Zhang
In this work, we endeavor to integrate the perception of these elements into the planning task.
no code implementations • 2 Jul 2024 • Wenhao Yu, Jie Peng, Huanyu Yang, JunRui Zhang, Yifan Duan, Jianmin Ji, Yanyong Zhang
The complex conditional distribution in local navigation needs training data to include diverse policy in diverse real-world scenarios; (2) Myopic Observation.
no code implementations • 2 Jul 2024 • Minghui Wu, Luzhen Xu, Jie Zhang, Haitao Tang, Yanyan Yue, Ruizhi Liao, Jintao Zhao, Zhengzhe Zhang, Yichi Wang, Haoyin Yan, Hongliang Yu, Tongle Ma, Jiachen Liu, Chongliang Wu, Yongchao Li, Yanyong Zhang, Xin Fang, Yue Zhang
This report describes the system submitted to the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) challenge, which considers the ASR task with multi-speaker overlapping and Mandarin accent dynamics in the ICMC case.
1 code implementation • 25 Jun 2024 • Jianyu Wei, Shijie Cao, Ting Cao, Lingxiao Ma, Lei Wang, Yanyong Zhang, Mao Yang
Weight quantization is crucial for reducing the memory footprint of LLMs on devices.
no code implementations • 5 Jun 2024 • Xinrui Lin, Yangfan Wu, Huanyu Yang, Yu Zhang, Yanyong Zhang, Jianmin Ji
This plan is then refined by an ASP program with a robot's action knowledge, which integrates implementation details into the skeleton, grounding the LLM's abstract outputs in practical robot contexts.
no code implementations • 5 Apr 2024 • Chenyang Wu, Yifan Duan, Xinran Zhang, Yu Sheng, Jianmin Ji, Yanyong Zhang
In this work, we present MM-Gaussian, a LiDAR-camera multi-modal fusion system for localization and mapping in unbounded scenes.
no code implementations • 4 Apr 2024 • Beibei Wang, Shuang Meng, Lu Zhang, Chenjie Wang, Jingjing Huang, Yao Li, Haojie Ren, Yuxuan Xiao, Yuru Peng, Jianmin Ji, Yu Zhang, Yanyong Zhang
Numerous roadside perception datasets have been introduced to propel advancements in autonomous driving and intelligent transportation systems research and development.
1 code implementation • 18 Mar 2024 • Sha Zhang, Jiajun Deng, Lei Bai, Houqiang Li, Wanli Ouyang, Yanyong Zhang
We present a hybrid-view-based knowledge distillation framework, termed HVDistill, to guide the feature learning of a point cloud neural network with a pre-trained image network in an unsupervised manner.
no code implementations • 18 Mar 2024 • Sha Zhang, Di Huang, Jiajun Deng, Shixiang Tang, Wanli Ouyang, Tong He, Yanyong Zhang
The ability to understand and reason about the 3D real world is a crucial milestone towards artificial general intelligence.
no code implementations • 14 Mar 2024 • Jiajun Deng, Sha Zhang, Feras Dayoub, Wanli Ouyang, Yanyong Zhang, Ian Reid
In particular, our PoIFusion follows the paradigm of query-based object detection, formulating object queries as dynamic 3D boxes and generating a set of PoIs based on each query box.
no code implementations • 7 Mar 2024 • Leilei Ding, Dazhong Shen, Chao Wang, Tianfu Wang, Le Zhang, Yanyong Zhang
Graph Convolutional Networks (GCNs) have become pivotal in recommendation systems for learning user and item embeddings by leveraging the user-item interaction graph's node information and topology.
no code implementations • 26 Nov 2023 • Yuxuan Xiao, Yao Li, Chengzhen Meng, Xingchen Li, Jianmin Ji, Yanyong Zhang
The fusion of LiDARs and cameras has been increasingly adopted in autonomous driving for perception tasks.
1 code implementation • 25 Oct 2023 • Xingchen Li, Yifan Duan, Beibei Wang, Haojie Ren, Guoliang You, Yu Sheng, Jianmin Ji, Yanyong Zhang
The edge features, which are prevalent in various environments, are aligned in both images and point clouds to determine the extrinsic parameters.
1 code implementation • 13 Sep 2023 • Jiayu Lei, Lisong Dai, Haoyun Jiang, Chaoyi Wu, Xiaoman Zhang, Yao Zhang, Jiangchao Yao, Weidi Xie, Yanyong Zhang, Yuehua Li, Ya Zhang, Yanfeng Wang
Magnetic resonance imaging (MRI) has played a crucial role in brain disease diagnosis, for which a range of computer-aided artificial intelligence methods have been proposed.
1 code implementation • CVPR 2023 • Yingjie Wang, Jiajun Deng, Yao Li, Jinshui Hu, Cong Liu, Yu Zhang, Jianmin Ji, Wanli Ouyang, Yanyong Zhang
LiDAR and Radar are two complementary sensing approaches in that LiDAR specializes in capturing an object's 3D shape while Radar provides longer detection ranges as well as velocity hints.
no code implementations • 4 Apr 2023 • ZiMing Wang, Yujiang Liu, Yifan Duan, Xingchen Li, Xinran Zhang, Jianmin Ji, Erbao Dong, Yanyong Zhang
In this paper, we present the USTC FLICAR Dataset, which is dedicated to the development of simultaneous localization and mapping and precise 3D reconstruction of the workspace for heavy-duty autonomous aerial work robots.
no code implementations • 22 Mar 2023 • Guoliang You, Xiaomeng Chu, Yifan Duan, Jie Peng, Jianmin Ji, Yu Zhang, Yanyong Zhang
In particular, we specify a prompt-transformer for representation conversion and propose a two-step training process to train the prompt-transformer for the target environment, while the rest of the DRL pipeline remains unchanged.
no code implementations • 22 Mar 2023 • Yuan Chen, Quecheng Qiu, Xiangyu Liu, Guangda Chen, Shunyi Yao, Jie Peng, Jianmin Ji, Yanyong Zhang
The planner learns to assign different importance to the geometric features and encourages the robot to navigate through areas that are helpful for laser localization.
no code implementations • 4 Feb 2023 • Haojie Ren, Sha Zhang, Sugang Li, Yao Li, Xinchen Li, Jianmin Ji, Yu Zhang, Yanyong Zhang
In this paper, we propose TrajMatch -- the first system that can automatically calibrate for roadside LiDARs in both time and space.
no code implementations • 13 Jan 2023 • Xiaomeng Chu, Jiajun Deng, Yuan Zhao, Jianmin Ji, Yu Zhang, Houqiang Li, Yanyong Zhang
To this end, we propose OA-BEV, a network that can be plugged into the BEV-based 3D object detection framework to bring out the objects by incorporating object-aware pseudo-3D features and depth features.
1 code implementation • 7 Nov 2022 • Yi Zhai, Yu Zhang, Shuo Liu, Xiaomeng Chu, Jie Peng, Jianmin Ji, Yanyong Zhang
Instead of extracting features from the tensor program itself, TLP extracts features from the schedule primitives.
1 code implementation • 14 Jun 2022 • Jiajun Deng, Zhengyuan Yang, Daqing Liu, Tianlang Chen, Wengang Zhou, Yanyong Zhang, Houqiang Li, Wanli Ouyang
For another, we devise Language Conditioned Vision Transformer that removes external fusion modules and reuses the uni-modal ViT for vision-language fusion at the intermediate layers.
no code implementations • 29 Nov 2021 • Hanqi Zhu, Jiajun Deng, Yu Zhang, Jianmin Ji, Qiuyu Mao, Houqiang Li, Yanyong Zhang
However, this approach often suffers from the mismatch between the resolution of point clouds and RGB images, leading to sub-optimal performance.
1 code implementation • 13 Aug 2021 • Yu'an Chen, Ruosong Ye, Ziyang Tao, Hongjian Liu, Guangda Chen, Jie Peng, Jun Ma, Yu Zhang, Jianmin Ji, Yanyong Zhang
Deep reinforcement learning (DRL) algorithms have proven effective in robot navigation, especially in unknown environments, by directly mapping perception inputs into robot control commands.
1 code implementation • 30 Jul 2021 • Jiajun Deng, Wengang Zhou, Yanyong Zhang, Houqiang Li
To this end, in this work, we regard point clouds as hollow-3D data and propose a new architecture, namely Hallucinated Hollow-3D R-CNN ($\text{H}^2$3D R-CNN), to address the problem of 3D object detection.
1 code implementation • 6 Jul 2021 • Xiaomeng Chu, Jiajun Deng, Yao Li, Zhenxun Yuan, Yanyong Zhang, Jianmin Ji, Yu Zhang
As cameras are increasingly deployed in new application domains such as autonomous driving, performing 3D object detection on monocular images becomes an important task for visual scene understanding.
no code implementations • 24 Jun 2021 • Yingjie Wang, Qiuyu Mao, Hanqi Zhu, Jiajun Deng, Yu Zhang, Jianmin Ji, Houqiang Li, Yanyong Zhang
In this survey, we first introduce the background of popular sensors used for self-driving, their data properties, and the corresponding object detection algorithms.
5 code implementations • 31 Dec 2020 • Jiajun Deng, Shaoshuai Shi, Peiwei Li, Wengang Zhou, Yanyong Zhang, Houqiang Li
In this paper, we take a slightly different viewpoint -- we find that precise positioning of raw points is not essential for high performance 3D object detection and that the coarse voxel granularity can also offer sufficient detection accuracy.
Ranked #4 on 3D Object Detection on KITTI Cars Moderate val
no code implementations • 19 Dec 2017 • Liang Xiao, Guoan Han, Donghua Jiang, Hongzi Zhu, Yanyong Zhang, H. Vincent Poor
It is shown that, by applying reinforcement learning techniques, a mobile device can achieve an optimal communication policy without the need to know the jamming and interference model and the radio channel model in a dynamic game framework.