1 code implementation • 5 Jun 2025 • Yani Zhang, Dongming Wu, Hao Shi, Yingfei Liu, Tiancai Wang, Haoqiang Fan, Xingping Dong
In this study, we explore a fundamental question: Does embodied 3D grounding benefit enough from detection?
no code implementations • 14 Mar 2025 • Shaofeng Liang, Runwei Guan, Wangwang Lian, Daizong Liu, Xiaolou Sun, Dongming Wu, Yutao Yue, Weiping Ding, Hui Xiong
It then disentangles language descriptions and hierarchically injects them into object queries, refining object understanding from coarse to fine-grained semantic levels.
1 code implementation • CVPR 2025 • Tianyi Yan, Dongming Wu, Wencheng Han, Junpeng Jiang, Xia Zhou, Kun Zhan, Cheng-Zhong Xu, Jianbing Shen
By providing a dynamic and realistic simulation environment, DrivingSphere enables comprehensive testing and validation of autonomous driving algorithms, ultimately advancing the development of more reliable autonomous cars.
1 code implementation • 7 Jun 2024 • Yani Zhang, Dongming Wu, Wencheng Han, Xingping Dong
Referring multi-object tracking (RMOT) aims at detecting and tracking multiple objects following human instruction represented by a natural language expression.
no code implementations • 28 May 2024 • Yifan Bai, Dongming Wu, Yingfei Liu, Fan Jia, Weixin Mao, Ziheng Zhang, Yucheng Zhao, Jianbing Shen, Xing Wei, Tiancai Wang, Xiangyu Zhang
Despite its simplicity, Atlas demonstrates superior performance in both 3D detection and ego planning tasks on the nuScenes dataset, proving that a 3D-tokenized LLM is the key to reliable autonomous driving.
no code implementations • 30 Nov 2023 • En Yu, Liang Zhao, Yana Wei, Jinrong Yang, Dongming Wu, Lingyu Kong, Haoran Wei, Tiancai Wang, Zheng Ge, Xiangyu Zhang, Wenbing Tao
Then, FIT requires MLLMs to first predict trajectories of related objects and then reason about potential future events based on them.
Ranked #164 on Visual Question Answering on MM-Vet
1 code implementation • 10 Oct 2023 • Dongming Wu, Jiahao Chang, Fan Jia, Yingfei Liu, Tiancai Wang, Jianbing Shen
Further, we propose TopoMLP, a simple yet high-performance pipeline for driving topology reasoning.
Ranked #5 on 3D Lane Detection on OpenLane-V2 val
1 code implementation • 8 Sep 2023 • Dongming Wu, Wencheng Han, Yingfei Liu, Tiancai Wang, Cheng-Zhong Xu, Xiangyu Zhang, Jianbing Shen
Furthermore, we provide a simple end-to-end baseline model based on Transformer, named PromptTrack.
1 code implementation • ICCV 2023 • Dongming Wu, Tiancai Wang, Yuang Zhang, Xiangyu Zhang, Jianbing Shen
Referring video object segmentation (RVOS) aims at segmenting an object in a video following human instruction.
Referring Expression Segmentation
Referring Video Object Segmentation
no code implementations • 20 Jun 2023 • Dongming Wu, Lulu Wen, Chao Chen, Zhaoshu Shi
To mitigate this problem, we propose a novel and simple counterfactual data augmentation method to generate opinion expressions with reversed sentiment polarity.
Aspect-Based Sentiment Analysis
Aspect-Based Sentiment Analysis (ABSA)
1 code implementation • 16 Jun 2023 • Dongming Wu, Fan Jia, Jiahao Chang, Zhuoling Li, Jianjian Sun, Chunrui Han, Shuailin Li, Yingfei Liu, Zheng Ge, Tiancai Wang
We present the 1st-place solution for OpenLane Topology in the Autonomous Driving Challenge.
1 code implementation • CVPR 2023 • Dongming Wu, Wencheng Han, Tiancai Wang, Xingping Dong, Xiangyu Zhang, Jianbing Shen
In this paper, we propose a new and general referring understanding task, termed referring multi-object tracking (RMOT).
no code implementations • CVPR 2022 • Dongming Wu, Xingping Dong, Ling Shao, Jianbing Shen
To address this, we propose a novel multi-level representation learning approach, which explores the inherent structure of the video content to provide a set of discriminative visual embeddings, enabling more effective vision-language semantic alignment.
no code implementations • IEEE Transactions on Information Forensics and Security 2021 • Dongming Wu, Mang Ye, Gaojie Lin, Xin Gao, Jianbing Shen
In addition, we propose a novel multi-head collaborative training scheme to improve performance, in which training is collaboratively supervised by multiple heads with the same structure but different parameters.
no code implementations • 19 May 2017 • Xingping Dong, Jianbing Shen, Dongming Wu, Kan Guo, Xiaogang Jin, Fatih Porikli
In this paper, we propose a new quadruplet deep network to examine the potential connections among the training instances, aiming to achieve a more powerful representation.