1 code implementation • 26 Jun 2025 • Yuheng Zhang, Mengfei Duan, Kunyu Peng, Yuhang Wang, Ruiping Liu, Fei Teng, Kai Luo, Zhiyong Li, Kailun Yang
We introduce OccOoD, a novel framework integrating OoD detection into 3D semantic occupancy prediction, with Voxel-BEV Progressive Fusion (VBPF) leveraging an RWKV-based branch to enhance OoD detection via geometry-semantic fusion.
1 code implementation • 6 May 2025 • Mengfei Duan, Kailun Yang, Yuheng Zhang, Yihong Cao, Fei Teng, Kai Luo, Jiaming Zhang, Zhiyong Li, Shutao Li
To address these issues, we introduce a new task, Panoramic Out-of-distribution Segmentation (PanOoS), achieving OoS for panoramas.
1 code implementation • 10 Mar 2025 • Siyu Li, Yihong Cao, Hao Shi, Yongsheng Zang, Xuan He, Kailun Yang, Zhiyong Li
However, research on unsupervised domain adaptation for BEV mapping remains limited and cannot perfectly accommodate all BEV mapping tasks.
1 code implementation • 4 Mar 2025 • Yizhou Huang, Fan Yang, Guoliang Zhu, Gen Li, Hao Shi, Yukun Zuo, Wenrui Chen, Zhiyong Li, Kailun Yang
This information is rich and multimodal in nature.
1 code implementation • 4 Mar 2025 • Jiayi Zhao, Fei Teng, Kai Luo, Guoqiang Zhao, Zhiyong Li, Xu Zheng, Kailun Yang
The perception capability of robotic systems relies on the richness of the dataset.
Ranked #1 on
Thermal Image Segmentation
on PST900
1 code implementation • 4 Mar 2025 • Xinying Hong, Siyu Li, Kang Zeng, Hao Shi, Bomin Peng, Kailun Yang, Zhiyong Li
Specifically, this framework is decoupled into three parts: Local mapping system involves the initial generation of semantic maps using purely visual information; The Temporal-Spatial Aligner Module (TSAM) integrates historical information into mapping generation by applying transformation matrices; The Centerline-Guided Diffusion Model (CGDM) is a prediction module based on the diffusion model.
1 code implementation • 3 Mar 2025 • Wanjun Jia, Fan Yang, Mengfei Duan, Xianchi Chen, Yinxi Wang, Yiming Jiang, Wenrui Chen, Kailun Yang, Zhiyong Li
Deformable object manipulation in robotics presents significant challenges due to uncertainties in component properties, diverse configurations, visual interference, and ambiguous prompts.
1 code implementation • 27 Feb 2025 • Fan Yang, Dongsheng Luo, Wenrui Chen, Jiacheng Lin, Junjie Cai, Kailun Yang, Zhiyong Li, Yaonan Wang
Additionally, we present a Keypoint-based Grasp matrix Transformation (KGT) method, ensuring spatial consistency between hand keypoints and object contact points, thus providing a direct link between visual perception and dexterous grasping actions.
1 code implementation • 13 Sep 2024 • Siyu Li, Kailun Yang, Hao Shi, Song Wang, You Yao, Zhiyong Li
The framework is established with a triadic synergy architecture, including principal and dual auxiliary branches.
1 code implementation • 30 Jun 2024 • Fan Yang, Wenrui Chen, Kailun Yang, Haoran Lin, Dongsheng Luo, Conghui Tang, Zhiyong Li, Yaonan Wang
To address this, we propose a granularity-aware affordance feature extraction method for locating functional affordance areas and predicting dexterous coarse gestures.
1 code implementation • 9 May 2024 • Siyu Li, Jiacheng Lin, Hao Shi, Jiaming Zhang, Song Wang, You Yao, Zhiyong Li, Kailun Yang
In this paper, we revisit the temporal fusion of vectorized HD maps, focusing on temporal instance consistency and temporal map consistency learning.
1 code implementation • 19 Apr 2024 • Kang Zeng, Hao Shi, Jiacheng Lin, Siyu Li, Jintao Cheng, Kaiwei Wang, Zhiyong Li, Kailun Yang
In this paper, we propose a novel LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model, termed MambaMOS.
2 code implementations • 28 Feb 2024 • Jiacheng Lin, Jiajun Chen, Kunyu Peng, Xuan He, Zhiyong Li, Rainer Stiefelhagen, Kailun Yang
This paper introduces the task of Auditory Referring Multi-Object Tracking (AR-MOT), which dynamically tracks specific objects in a video sequence based on audio expressions and appears as a challenging problem in autonomous driving.
1 code implementation • 30 Jan 2024 • Fei Teng, Jiaming Zhang, Jiawei Liu, Kunyu Peng, Xina Cheng, Zhiyong Li, Kailun Yang
Research on inter-network data connectivity is scant.
1 code implementation • CVPR 2024 • Ruoxi Zhu, Shusong Xu, Peiye Liu, Sicheng Li, Yanheng Lu, Dimin Niu, Zihao Liu, Zihao Meng, Zhiyong Li, Xinhua Chen, Yibo Fan
However obtaining paired HDR and high-quality LDR images is difficult posing a challenge to deep learning based tone mapping methods.
no code implementations • 2 Sep 2023 • Xuan He, Jin Yuan, Kailun Yang, Zhenchao Zeng, Zhiyong Li
These methods typically use visual and depth representations to generate query points on objects, whose quality plays a decisive role in the detection accuracy.
no code implementations • 8 Aug 2023 • Jiajun Chen, Jiacheng Lin, Guojin Zhong, Haolong Fu, Ke Nai, Kailun Yang, Zhiyong Li
Next, we propose an Expression Alignment (EA) mechanism for audio and text.
1 code implementation • 2 Aug 2023 • Guojin Zhong, Jin Yuan, Pan Wang, Kailun Yang, Weili Guan, Zhiyong Li
The recently rising markup-to-image generation poses greater challenges as compared to natural image generation, due to its low tolerance for errors as well as the complex sequence and context correlations between markup and rendered image.
2 code implementations • 11 Jun 2023 • Xu Zhang, Kailun Yang, Jiacheng Lin, Jin Yuan, Zhiyong Li, Shutao Li
To tackle this problem, this paper proposes a simple yet effective Probabilistic Visual Prompt Unified Transformer (PVPUFormer) for interactive image segmentation, which allows users to flexibly input diverse visual prompts with the probabilistic prompt encoding and feature post-processing to excavate sufficient and robust prompt features for performance boosting.
1 code implementation • 12 May 2023 • Xuan He, Fan Yang, Kailun Yang, Jiacheng Lin, Haolong Fu, Meng Wang, Jin Yuan, Zhiyong Li
To tackle this problem, this paper proposes a novel "Supervised Scale-aware Deformable Attention" (SSDA) for monocular 3D object detection.
1 code implementation • 7 May 2023 • Jiacheng Lin, Jiajun Chen, Kailun Yang, Alina Roitberg, Siyu Li, Zhiyong Li, Shutao Li
Interactive Image Segmentation (IIS) has emerged as a promising technique for decreasing annotation time.
1 code implementation • 7 May 2023 • Siyu Li, Kailun Yang, Hao Shi, Jiaming Zhang, Jiacheng Lin, Zhifeng Teng, Zhiyong Li
At the same time, an Across-Space Loss (ASL) is designed to mitigate the negative impact of geometric distortions.
no code implementations • 15 Mar 2022 • Dongseok Im, Gwangtae Park, Zhiyong Li, Junha Ryu, Hoi-jun Yoo
This paper proposes energy-efficient signed bit-slice architecture which accelerates both high-precision and dense DNNs by exploiting a large number of zero values of signed bit-slices.
no code implementations • 18 Jan 2022 • Haidong Wang, Zhiyong Li, Yaping Li, Ke Nai, Ming Wen
The feature difference between current candidate detections and historical tracklets makes the object association much harder.