Search Results for author: Zhiyong Li

Found 24 papers, 20 papers with code

Out-of-Distribution Semantic Occupancy Prediction

1 code implementation • 26 Jun 2025 • Yuheng Zhang, Mengfei Duan, Kunyu Peng, Yuhang Wang, Ruiping Liu, Fei Teng, Kai Luo, Zhiyong Li, Kailun Yang

We introduce OccOoD, a novel framework integrating OoD detection into 3D semantic occupancy prediction, with Voxel-BEV Progressive Fusion (VBPF) leveraging an RWKV-based branch to enhance OoD detection via geometry-semantic fusion.

3D Semantic Occupancy Prediction • Autonomous Driving +1
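
The snippet above does not spell out how per-voxel OoD scores might be computed. Purely as a generic illustration (not OccOoD's actual method or its Voxel-BEV Progressive Fusion), the sketch below assigns a maximum-softmax-probability score to each voxel of a semantic occupancy grid; the function name `voxel_ood_scores` and the logits layout are assumptions made for this example.

```python
# Illustrative only: a generic per-voxel out-of-distribution score using
# maximum softmax probability (MSP). Nothing here is taken from OccOoD.
import numpy as np

def voxel_ood_scores(logits):
    """logits: (X, Y, Z, C) class logits for a semantic occupancy grid.
    Returns an (X, Y, Z) array where higher values indicate more OoD-like voxels."""
    z = logits - logits.max(axis=-1, keepdims=True)      # numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return 1.0 - probs.max(axis=-1)                       # 1 - MSP

# Example: flag voxels whose score exceeds a validation-tuned threshold.
logits = np.random.randn(8, 8, 4, 20)
ood_mask = voxel_ood_scores(logits) > 0.7
```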

Panoramic Out-of-Distribution Segmentation

1 code implementation • 6 May 2025 • Mengfei Duan, Kailun Yang, Yuheng Zhang, Yihong Cao, Fei Teng, Kai Luo, Jiaming Zhang, Zhiyong Li, Shutao Li

To address these issues, we introduce a new task, Panoramic Out-of-distribution Segmentation (PanOoS), which aims to achieve out-of-distribution segmentation (OoS) for panoramas.

Disentanglement • Domain Generalization +3

HierDAMap: Towards Universal Domain Adaptive BEV Mapping via Hierarchical Perspective Priors

1 code implementation • 10 Mar 2025 • Siyu Li, Yihong Cao, Hao Shi, Yongsheng Zang, Xuan He, Kailun Yang, Zhiyong Li

However, research on unsupervised domain adaptation for BEV mapping remains limited, and existing approaches do not accommodate all BEV mapping tasks equally well.

Autonomous Driving • Semantic Segmentation +1

TS-CGNet: Temporal-Spatial Fusion Meets Centerline-Guided Diffusion for BEV Mapping

1 code implementation • 4 Mar 2025 • Xinying Hong, Siyu Li, Kang Zeng, Hao Shi, Bomin Peng, Kailun Yang, Zhiyong Li

Specifically, the framework is decoupled into three parts: the local mapping system generates initial semantic maps from purely visual information; the Temporal-Spatial Aligner Module (TSAM) integrates historical information into map generation by applying transformation matrices; and the Centerline-Guided Diffusion Model (CGDM) is a prediction module based on the diffusion model.

Autonomous Driving • Semantic Segmentation
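
As a loose illustration of the temporal-spatial alignment step attributed to TSAM above (not the module's actual implementation), the sketch below warps a historical BEV semantic map into the current ego frame with a 2D rigid transform; `warp_bev_to_current`, its parameters, and the pose convention are hypothetical, not taken from the TS-CGNet codebase.

```python
# Illustrative only: resampling a historical HxW BEV label map into the
# current ego frame using a 2D rigid transform.
import numpy as np

def warp_bev_to_current(hist_map, dx, dy, dyaw, resolution=0.5):
    """hist_map: HxW signed-integer array of semantic labels (historical frame).
    (dx, dy, dyaw): pose of the current ego frame expressed in the historical
    ego frame, so a current-frame point p maps to R(dyaw) @ p + (dx, dy).
    resolution: meters per BEV cell, ego vehicle at the grid center."""
    H, W = hist_map.shape
    # Metric coordinates of every cell center in the current frame.
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    x_cur = (xs - W / 2.0) * resolution
    y_cur = (ys - H / 2.0) * resolution

    # Transform current-frame points into the historical frame.
    c, s = np.cos(dyaw), np.sin(dyaw)
    x_hist = c * x_cur - s * y_cur + dx
    y_hist = s * x_cur + c * y_cur + dy

    # Back to grid indices in the historical map (nearest neighbor).
    cols = np.round(x_hist / resolution + W / 2.0).astype(int)
    rows = np.round(y_hist / resolution + H / 2.0).astype(int)

    warped = np.full_like(hist_map, -1)   # -1 marks cells with no history
    valid = (rows >= 0) & (rows < H) & (cols >= 0) & (cols < W)
    warped[valid] = hist_map[rows[valid], cols[valid]]
    return warped
```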

One-Shot Affordance Grounding of Deformable Objects in Egocentric Organizing Scenes

1 code implementation • 3 Mar 2025 • Wanjun Jia, Fan Yang, Mengfei Duan, Xianchi Chen, Yinxi Wang, Yiming Jiang, Wenrui Chen, Kailun Yang, Zhiyong Li

Deformable object manipulation in robotics presents significant challenges due to uncertainties in component properties, diverse configurations, visual interference, and ambiguous prompts.

Deformable Object Manipulation

Multi-Keypoint Affordance Representation for Functional Dexterous Grasping

1 code implementation • 27 Feb 2025 • Fan Yang, Dongsheng Luo, Wenrui Chen, Jiacheng Lin, Junjie Cai, Kailun Yang, Zhiyong Li, Yaonan Wang

Additionally, we present a Keypoint-based Grasp matrix Transformation (KGT) method that ensures spatial consistency between hand keypoints and object contact points, providing a direct link between visual perception and dexterous grasping actions.
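
The KGT formulation itself is not given in this snippet. Purely as an illustration of enforcing spatial consistency between two keypoint sets (not the paper's grasp matrix transformation), the sketch below rigidly aligns predicted hand keypoints to object contact points with the Kabsch algorithm; `rigid_align` and its interface are invented for this example.

```python
# Illustrative only: generic rigid alignment between corresponding hand
# keypoints and object contact points (Kabsch algorithm). This is NOT the
# KGT method; it only sketches the spatial-consistency idea.
import numpy as np

def rigid_align(hand_kpts, contact_pts):
    """Find R, t minimizing ||R @ hand_kpts + t - contact_pts|| in least squares.

    hand_kpts, contact_pts: (N, 3) arrays of corresponding 3D points.
    Returns rotation R (3x3), translation t (3,), and the residual RMSE."""
    mu_h = hand_kpts.mean(axis=0)
    mu_c = contact_pts.mean(axis=0)
    H = (hand_kpts - mu_h).T @ (contact_pts - mu_c)   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))            # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_c - R @ mu_h
    aligned = hand_kpts @ R.T + t
    rmse = float(np.sqrt(((aligned - contact_pts) ** 2).sum(axis=1).mean()))
    return R, t, rmse
```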

Learning Granularity-Aware Affordances from Human-Object Interaction for Tool-Based Functional Grasping in Dexterous Robotics

1 code implementation • 30 Jun 2024 • Fan Yang, Wenrui Chen, Kailun Yang, Haoran Lin, Dongsheng Luo, Conghui Tang, Zhiyong Li, Yaonan Wang

To address this, we propose a granularity-aware affordance feature extraction method for locating functional affordance areas and predicting dexterous coarse gestures.

Human-Object Interaction Detection • Object

DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction

1 code implementation • 9 May 2024 • Siyu Li, Jiacheng Lin, Hao Shi, Jiaming Zhang, Song Wang, You Yao, Zhiyong Li, Kailun Yang

In this paper, we revisit the temporal fusion of vectorized HD maps, focusing on temporal instance consistency and temporal map consistency learning.

Contrastive Learning • Scene Understanding +1

MambaMOS: LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model

1 code implementation • 19 Apr 2024 • Kang Zeng, Hao Shi, Jiacheng Lin, Siyu Li, Jintao Cheng, Kaiwei Wang, Zhiyong Li, Kailun Yang

In this paper, we propose a novel LiDAR-based 3D moving object segmentation network with a motion-aware state space model, termed MambaMOS.

Object • Semantic Segmentation

EchoTrack: Auditory Referring Multi-Object Tracking for Autonomous Driving

2 code implementations • 28 Feb 2024 • Jiacheng Lin, Jiajun Chen, Kunyu Peng, Xuan He, Zhiyong Li, Rainer Stiefelhagen, Kailun Yang

This paper introduces the task of Auditory Referring Multi-Object Tracking (AR-MOT), which dynamically tracks specific objects in a video sequence based on audio expressions and poses a challenging problem in autonomous driving.

Autonomous Driving • Object +1

S$^3$-MonoDETR: Supervised Shape&Scale-perceptive Deformable Transformer for Monocular 3D Object Detection

no code implementations • 2 Sep 2023 • Xuan He, Jin Yuan, Kailun Yang, Zhenchao Zeng, Zhiyong Li

These methods typically use visual and depth representations to generate query points on objects, whose quality plays a decisive role in the detection accuracy.

Monocular 3D Object Detection • object-detection

Contrast-augmented Diffusion Model with Fine-grained Sequence Alignment for Markup-to-Image Generation

1 code implementation • 2 Aug 2023 • Guojin Zhong, Jin Yuan, Pan Wang, Kailun Yang, Weili Guan, Zhiyong Li

The recently emerging task of markup-to-image generation poses greater challenges than natural image generation, due to its low tolerance for errors and the complex sequence and context correlations between the markup and the rendered image.

cross-modal alignment • Denoising +1

PVPUFormer: Probabilistic Visual Prompt Unified Transformer for Interactive Image Segmentation

2 code implementations • 11 Jun 2023 • Xu Zhang, Kailun Yang, Jiacheng Lin, Jin Yuan, Zhiyong Li, Shutao Li

To tackle this problem, this paper proposes a simple yet effective Probabilistic Visual Prompt Unified Transformer (PVPUFormer) for interactive image segmentation, which allows users to flexibly input diverse visual prompts; probabilistic prompt encoding and feature post-processing then extract sufficient and robust prompt features to boost performance.

Image Segmentation • Interactive Segmentation +2

SSD-MonoDETR: Supervised Scale-aware Deformable Transformer for Monocular 3D Object Detection

1 code implementation • 12 May 2023 • Xuan He, Fan Yang, Kailun Yang, Jiacheng Lin, Haolong Fu, Meng Wang, Jin Yuan, Zhiyong Li

To tackle this problem, this paper proposes a novel "Supervised Scale-aware Deformable Attention" (SSDA) for monocular 3D object detection.

Monocular 3D Object Detection • Object +1

Bi-Mapper: Holistic BEV Semantic Mapping for Autonomous Driving

1 code implementation • 7 May 2023 • Siyu Li, Kailun Yang, Hao Shi, Jiaming Zhang, Jiacheng Lin, Zhifeng Teng, Zhiyong Li

At the same time, an Across-Space Loss (ASL) is designed to mitigate the negative impact of geometric distortions.

Autonomous Driving

Energy-efficient Dense DNN Acceleration with Signed Bit-slice Architecture

no code implementations • 15 Mar 2022 • Dongseok Im, Gwangtae Park, Zhiyong Li, Junha Ryu, Hoi-jun Yoo

This paper proposes an energy-efficient signed bit-slice architecture that accelerates both high-precision and dense DNNs by exploiting the large number of zero values among signed bit-slices.
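
To make the zero-slice observation concrete, the sketch below decomposes signed 8-bit weights into signed 4-bit slices and measures the fraction of zero slices that a bit-slice datapath could skip. This is a toy model only; `signed_slices` and the slicing convention are assumptions for illustration, not the paper's encoding or hardware.

```python
# Illustrative only: signed slicing of small integers and a rough count of
# zero slices. The actual signed bit-slice encoding and accelerator design
# in the paper are more involved than this sketch.
import numpy as np

def signed_slices(w, slice_bits=4):
    """Split integer w so that w == hi * 2**slice_bits + lo,
    with lo in the signed range of slice_bits bits."""
    base = 1 << slice_bits
    half = base // 2
    lo = ((w + half) % base) - half        # lo in [-half, half - 1]
    hi = (w - lo) >> slice_bits            # remaining (possibly wider) upper slice
    return hi, lo

# Small negative weights (e.g. -3) yield a zero upper slice, unlike plain
# two's-complement nibbles where -3 = 0xFD has a nonzero upper nibble 0xF.
rng = np.random.default_rng(0)
weights = rng.normal(0, 8, size=10000).astype(int).clip(-128, 127)
slices = [signed_slices(int(w)) for w in weights]
zero_fraction = sum((hi == 0) + (lo == 0) for hi, lo in slices) / (2 * len(slices))
print(f"fraction of zero slices: {zero_fraction:.2f}")
```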
