Search Results for author: Yanyong Zhang

Found 43 papers, 13 papers with code

Large-Scale Gaussian Splatting SLAM

no code implementations 15 May 2025 Zhe Xin, Chenyang Wu, Penghui Huang, Yanyong Zhang, Yinian Mao, Guoquan Huang

In tracking, we introduce feature-alignment warping constraints to alleviate the adverse effects of appearance similarity in rendering losses.

ElectricSight: 3D Hazard Monitoring for Power Lines Using Low-Cost Sensors

no code implementations 10 May 2025 Xingchen Li, Lidian Wang, Yu Sheng, Zhipeng Tang, Haojie Ren, Guoliang You, Yifan Duan, Jianmin Ji, Yanyong Zhang

To address this challenge, we present ElectricSight, a system designed for 3D distance measurement and monitoring of potential hazards to power transmission lines.

Monocular Depth Estimation

A Framework for Benchmarking and Aligning Task-Planning Safety in LLM-Based Embodied Agents

no code implementations 20 Apr 2025 YuTing Huang, Leilei Ding, Zhipeng Tang, Tianfu Wang, Xinrui Lin, Wuyang Zhang, Mingxiao Ma, Yanyong Zhang

Large Language Models (LLMs) exhibit substantial promise in enhancing task-planning capabilities within embodied agents due to their advanced reasoning and comprehension.

Benchmarking Task Planning

Self-Supervised Pre-training with Combined Datasets for 3D Perception in Autonomous Driving

no code implementations 17 Apr 2025 Shumin Wang, Zhuoran Yang, Lidian Wang, Zhipeng Tang, Heng Li, Lehan Pan, Sha Zhang, Jie Peng, Jianmin Ji, Yanyong Zhang

The significant achievements of pre-trained models leveraging large volumes of data in the field of NLP and 2D vision inspire us to explore the potential of extensive data pre-training for 3D perception in autonomous driving.

3D Object Detection 3D Object Tracking +5

CAFE-AD: Cross-Scenario Adaptive Feature Enhancement for Trajectory Planning in Autonomous Driving

1 code implementation 9 Apr 2025 JunRui Zhang, Chenjie Wang, Jie Peng, Haoyu Li, Jianmin Ji, Yu Zhang, Yanyong Zhang

However, open-loop training on the nuPlan dataset tends to cause causal confusion during closed-loop testing, and the dataset also presents a long-tail distribution of scenarios.

Autonomous Driving Feature Importance +2

GraspCoT: Integrating Physical Property Reasoning for 6-DoF Grasping under Flexible Language Instructions

no code implementations 20 Mar 2025 Xiaomeng Chu, Jiajun Deng, Guoliang You, Wei Liu, Xingchen Li, Jianmin Ji, Yanyong Zhang

In this work, we propose GraspCoT, a 6-DoF grasp detection framework that integrates a Chain-of-Thought (CoT) reasoning mechanism oriented to physical properties, guided by auxiliary question-answering (QA) tasks.

Question Answering

OG-Gaussian: Occupancy Based Street Gaussians for Autonomous Driving

no code implementations 20 Feb 2025 Yedong Shen, Xinran Zhang, Yifan Duan, Shiqi Zhang, Heng Li, Yilong Wu, Jianmin Ji, Yanyong Zhang

Accurate and realistic 3D scene reconstruction enables the lifelike creation of autonomous driving simulation environments.

3DGS 3D Scene Reconstruction +1

Map++: Towards User-Participatory Visual SLAM Systems with Efficient Map Expansion and Sharing

no code implementations 4 Nov 2024 Xinran Zhang, Hanqi Zhu, Yifan Duan, Wuyang Zhang, Longfei Shangguan, Yu Zhang, Jianmin Ji, Yanyong Zhang

We realized this approach by developing Map++, an efficient system that functions as a plug-and-play extension, supporting participatory map-building based on existing SLAM algorithms.

Harnessing Your DRAM and SSD for Sustainable and Accessible LLM Inference with Mixed-Precision and Multi-level Caching

no code implementations 17 Oct 2024 Jie Peng, Zhang Cao, Huaizhi Qu, Zhengyu Zhang, Chang Guo, Yanyong Zhang, Zhichao Cao, Tianlong Chen

To enhance communication efficiency, M2Cache maintains a neuron-level mixed-precision LRU cache in HBM, a larger layer-aware cache in DRAM, and a full model in SSD.

Quantization

LFP: Efficient and Accurate End-to-End Lane-Level Planning via Camera-LiDAR Fusion

no code implementations 21 Sep 2024 Guoliang You, Xiaomeng Chu, Yifan Duan, Xingchen Li, Sha Zhang, Jianmin Ji, Yanyong Zhang

For performance, the lane-level cross-modal query integration and feature enhancement module uses confidence scores from the ROI to combine low-confidence image queries with LiDAR queries, extracting complementary depth features.

Autonomous Driving Sensor Fusion

OccMamba: Semantic Occupancy Prediction with State Space Models

1 code implementation 19 Aug 2024 Heng Li, Yuenan Hou, Xiaohan Xing, Xiao Sun, Yanyong Zhang

Inspired by the global modeling and linear computation complexity of the Mamba architecture, we present the first Mamba-based network for semantic occupancy prediction, termed OccMamba.

Mamba Prediction +1

AutoRG-Brain: Grounded Report Generation for Brain MRI

no code implementations 23 Jul 2024 Jiayu Lei, Xiaoman Zhang, Chaoyi Wu, Lisong Dai, Ya Zhang, Yanyong Zhang, Yanfeng Wang, Weidi Xie, Yuehua Li

To address these challenges, we initiate a series of work on grounded Automatic Report Generation (AutoRG), starting from the brain MRI interpretation system, which supports the delineation of brain structures, the localization of anomalies, and the generation of well-organized findings.

Anomaly Localization Anomaly Segmentation

RayFormer: Improving Query-Based Multi-Camera 3D Object Detection via Ray-Centric Strategies

no code implementations 20 Jul 2024 Xiaomeng Chu, Jiajun Deng, Guoliang You, Yifan Duan, Yao Li, Yanyong Zhang

To extract unique object-level features that cater to distinct queries, we design a ray sampling method that suitably organizes the distribution of feature sampling points on both images and bird's eye view.

3D Object Detection Object +1

LDP: A Local Diffusion Planner for Efficient Robot Navigation and Collision Avoidance

no code implementations 2 Jul 2024 Wenhao Yu, Jie Peng, Huanyu Yang, JunRui Zhang, Yifan Duan, Jianmin Ji, Yanyong Zhang

The complex conditional distribution in local navigation requires training data that covers diverse policies in diverse real-world scenarios; (2) Myopic Observation.

Collision Avoidance Robot Navigation

The USTC-NERCSLIP Systems for The ICMC-ASR Challenge

no code implementations 2 Jul 2024 Minghui Wu, Luzhen Xu, Jie Zhang, Haitao Tang, Yanyan Yue, Ruizhi Liao, Jintao Zhao, Zhengzhe Zhang, Yichi Wang, Haoyin Yan, Hongliang Yu, Tongle Ma, Jiachen Liu, Chongliang Wu, Yongchao Li, Yanyong Zhang, Xin Fang, Yue Zhang

This report describes the submitted system to the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) challenge, which considers the ASR task with multi-speaker overlapping and Mandarin accent dynamics in the ICMC case.

Automatic Speech Recognition Pseudo Label +5

CLMASP: Coupling Large Language Models with Answer Set Programming for Robotic Task Planning

no code implementations 5 Jun 2024 Xinrui Lin, Yangfan Wu, Huanyu Yang, Yu Zhang, Yanyong Zhang, Jianmin Ji

This plan is then refined by an ASP program with a robot's action knowledge, which integrates implementation details into the skeleton, grounding the LLM's abstract outputs in practical robot contexts.

Task Planning

MM-Gaussian: 3D Gaussian-based Multi-modal Fusion for Localization and Reconstruction in Unbounded Scenes

no code implementations 5 Apr 2024 Chenyang Wu, Yifan Duan, Xinran Zhang, Yu Sheng, Jianmin Ji, Yanyong Zhang

In this work, we present MM-Gaussian, a LiDAR-camera multi-modal fusion system for localization and mapping in unbounded scenes.

Autonomous Vehicles

CORP: A Multi-Modal Dataset for Campus-Oriented Roadside Perception Tasks

no code implementations 4 Apr 2024 Beibei Wang, Shuang Meng, Lu Zhang, Chenjie Wang, Jingjing Huang, Yao Li, Haojie Ren, Yuxuan Xiao, Yuru Peng, Jianmin Ji, Yu Zhang, Yanyong Zhang

Numerous roadside perception datasets have been introduced to propel advancements in autonomous driving and intelligent transportation systems research and development.

Autonomous Driving Instance Segmentation +1

HVDistill: Transferring Knowledge from Images to Point Clouds via Unsupervised Hybrid-View Distillation

1 code implementation 18 Mar 2024 Sha Zhang, Jiajun Deng, Lei Bai, Houqiang Li, Wanli Ouyang, Yanyong Zhang

We present a hybrid-view-based knowledge distillation framework, termed HVDistill, to guide the feature learning of a point cloud neural network with a pre-trained image network in an unsupervised manner.

Knowledge Distillation NER +1

Agent3D-Zero: An Agent for Zero-shot 3D Understanding

no code implementations 18 Mar 2024 Sha Zhang, Di Huang, Jiajun Deng, Shixiang Tang, Wanli Ouyang, Tong He, Yanyong Zhang

The ability to understand and reason about the 3D real world is a crucial milestone towards artificial general intelligence.

Language Modelling Scene Understanding

PoIFusion: Multi-Modal 3D Object Detection via Fusion at Points of Interest

no code implementations 14 Mar 2024 Jiajun Deng, Sha Zhang, Feras Dayoub, Wanli Ouyang, Yanyong Zhang, Ian Reid

In particular, our PoIFusion follows the paradigm of query-based object detection, formulating object queries as dynamic 3D boxes and generating a set of PoIs based on each query box.

3D Object Detection Object +1

DGR: A General Graph Desmoothing Framework for Recommendation via Global and Local Perspectives

no code implementations 7 Mar 2024 Leilei Ding, Dazhong Shen, Chao Wang, Tianfu Wang, Le Zhang, Yanyong Zhang

Graph Convolutional Networks (GCNs) have become pivotal in recommendation systems for learning user and item embeddings by leveraging the user-item interaction graph's node information and topology.

Recommendation Systems

EdgeCalib: Multi-Frame Weighted Edge Features for Automatic Targetless LiDAR-Camera Calibration

1 code implementation 25 Oct 2023 Xingchen Li, Yifan Duan, Beibei Wang, Haojie Ren, Guoliang You, Yu Sheng, Jianmin Ji, Yanyong Zhang

The edge features, which are prevalent in various environments, are aligned in both images and point clouds to determine the extrinsic parameters.

Camera Calibration

UniBrain: Universal Brain MRI Diagnosis with Hierarchical Knowledge-enhanced Pre-training

1 code implementation 13 Sep 2023 Jiayu Lei, Lisong Dai, Haoyun Jiang, Chaoyi Wu, Xiaoman Zhang, Yao Zhang, Jiangchao Yao, Weidi Xie, Yanyong Zhang, Yuehua Li, Ya Zhang, Yanfeng Wang

Magnetic resonance imaging (MRI) has played a crucial role in brain disease diagnosis, for which a range of computer-aided artificial intelligence methods have been proposed.

Diagnostic

Bi-LRFusion: Bi-Directional LiDAR-Radar Fusion for 3D Dynamic Object Detection

1 code implementation CVPR 2023 Yingjie Wang, Jiajun Deng, Yao Li, Jinshui Hu, Cong Liu, Yu Zhang, Jianmin Ji, Wanli Ouyang, Yanyong Zhang

LiDAR and Radar are two complementary sensing approaches in that LiDAR specializes in capturing an object's 3D shape while Radar provides longer detection ranges as well as velocity hints.

object-detection Object Detection

USTC FLICAR: A Sensors Fusion Dataset of LiDAR-Inertial-Camera for Heavy-duty Autonomous Aerial Work Robots

no code implementations 4 Apr 2023 ZiMing Wang, Yujiang Liu, Yifan Duan, Xingchen Li, Xinran Zhang, Jianmin Ji, Erbao Dong, Yanyong Zhang

In this paper, we present the USTC FLICAR Dataset, which is dedicated to the development of simultaneous localization and mapping and precise 3D reconstruction of the workspace for heavy-duty autonomous aerial work robots.

3D Reconstruction Autonomous Driving +2

$P^{3}O$: Transferring Visual Representations for Reinforcement Learning via Prompting

no code implementations 22 Mar 2023 Guoliang You, Xiaomeng Chu, Yifan Duan, Jie Peng, Jianmin Ji, Yu Zhang, Yanyong Zhang

In particular, we specify a prompt-transformer for representation conversion and propose a two-step training process to train the prompt-transformer for the target environment, while the rest of the DRL pipeline remains unchanged.

Deep Reinforcement Learning reinforcement-learning

Deep Reinforcement Learning for Localizability-Enhanced Navigation in Dynamic Human Environments

no code implementations 22 Mar 2023 Yuan Chen, Quecheng Qiu, Xiangyu Liu, Guangda Chen, Shunyi Yao, Jie Peng, Jianmin Ji, Yanyong Zhang

The planner learns to assign different importance to the geometric features and encourages the robot to navigate through areas that are helpful for laser localization.

Deep Reinforcement Learning Navigate +1

TrajMatch: Towards Automatic Spatio-temporal Calibration for Roadside LiDARs through Trajectory Matching

no code implementations 4 Feb 2023 Haojie Ren, Sha Zhang, Sugang Li, Yao Li, Xinchen Li, Jianmin Ji, Yu Zhang, Yanyong Zhang

In this paper, we propose TrajMatch -- the first system that can automatically calibrate for roadside LiDARs in both time and space.

OA-BEV: Bringing Object Awareness to Bird's-Eye-View Representation for Multi-Camera 3D Object Detection

no code implementations 13 Jan 2023 Xiaomeng Chu, Jiajun Deng, Yuan Zhao, Jianmin Ji, Yu Zhang, Houqiang Li, Yanyong Zhang

To this end, we propose OA-BEV, a network that can be plugged into the BEV-based 3D object detection framework to bring out the objects by incorporating object-aware pseudo-3D features and depth features.

3D Object Detection Object +1

TLP: A Deep Learning-based Cost Model for Tensor Program Tuning

1 code implementation 7 Nov 2022 Yi Zhai, Yu Zhang, Shuo Liu, Xiaomeng Chu, Jie Peng, Jianmin Ji, Yanyong Zhang

Instead of extracting features from the tensor program itself, TLP extracts features from the schedule primitives.

Multi-Task Learning

TransVG++: End-to-End Visual Grounding with Language Conditioned Vision Transformer

1 code implementation 14 Jun 2022 Jiajun Deng, Zhengyuan Yang, Daqing Liu, Tianlang Chen, Wengang Zhou, Yanyong Zhang, Houqiang Li, Wanli Ouyang

For another, we devise Language Conditioned Vision Transformer that removes external fusion modules and reuses the uni-modal ViT for vision-language fusion at the intermediate layers.

Visual Grounding

VPFNet: Improving 3D Object Detection with Virtual Point based LiDAR and Stereo Data Fusion

no code implementations 29 Nov 2021 Hanqi Zhu, Jiajun Deng, Yu Zhang, Jianmin Ji, Qiuyu Mao, Houqiang Li, Yanyong Zhang

However, this approach often suffers from the mismatch between the resolution of point clouds and RGB images, leading to sub-optimal performance.

3D Object Detection Data Augmentation +2

Reinforcement Learning for Robot Navigation with Adaptive Forward Simulation Time (AFST) in a Semi-Markov Model

1 code implementation 13 Aug 2021 Yu'an Chen, Ruosong Ye, Ziyang Tao, Hongjian Liu, Guangda Chen, Jie Peng, Jun Ma, Yu Zhang, Jianmin Ji, Yanyong Zhang

Deep reinforcement learning (DRL) algorithms have proven effective in robot navigation, especially in unknown environments, by directly mapping perception inputs into robot control commands.

Deep Reinforcement Learning reinforcement-learning +2

From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection

1 code implementation 30 Jul 2021 Jiajun Deng, Wengang Zhou, Yanyong Zhang, Houqiang Li

To this end, in this work, we regard point clouds as hollow-3D data and propose a new architecture, namely Hallucinated Hollow-3D R-CNN ($\text{H}^2$3D R-CNN), to address the problem of 3D object detection.

3D Object Detection object-detection +1

Neighbor-Vote: Improving Monocular 3D Object Detection through Neighbor Distance Voting

1 code implementation 6 Jul 2021 Xiaomeng Chu, Jiajun Deng, Yao Li, Zhenxun Yuan, Yanyong Zhang, Jianmin Ji, Yu Zhang

As cameras are increasingly deployed in new application domains such as autonomous driving, performing 3D object detection on monocular images becomes an important task for visual scene understanding.

Autonomous Driving Monocular 3D Object Detection +4

Multi-Modal 3D Object Detection in Autonomous Driving: a Survey

no code implementations 24 Jun 2021 Yingjie Wang, Qiuyu Mao, Hanqi Zhu, Jiajun Deng, Yu Zhang, Jianmin Ji, Houqiang Li, Yanyong Zhang

In this survey, we first introduce the background of popular sensors used for self-driving, their data properties, and the corresponding object detection algorithms.

3D Object Detection Autonomous Driving +4

Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection

5 code implementations 31 Dec 2020 Jiajun Deng, Shaoshuai Shi, Peiwei Li, Wengang Zhou, Yanyong Zhang, Houqiang Li

In this paper, we take a slightly different viewpoint -- we find that precise positioning of raw points is not essential for high performance 3D object detection and that the coarse voxel granularity can also offer sufficient detection accuracy.

3D Object Detection object-detection +2

Two-dimensional Anti-jamming Mobile Communication Based on Reinforcement Learning

no code implementations 19 Dec 2017 Liang Xiao, Guoan Han, Donghua Jiang, Hongzi Zhu, Yanyong Zhang, H. Vincent Poor

It is shown that, by applying reinforcement learning techniques, a mobile device can achieve an optimal communication policy without the need to know the jamming and interference model and the radio channel model in a dynamic game framework.

reinforcement-learning Reinforcement Learning +2
