1 code implementation • 20 Mar 2024 • Tongtian Yue, Jie Cheng, Longteng Guo, Xingyuan Dai, Zijia Zhao, Xingjian He, Gang Xiong, Yisheng Lv, Jing Liu
In this paper, we investigate the self-consistency capability of large vision-language models (LVLMs), a crucial aspect that reflects their ability to both generate informative captions for specific objects and subsequently use these captions to accurately re-identify the objects in a closed-loop process.
2 code implementations • 8 Mar 2024 • Chengyang Zhang, Yong Zhang, Qitan Shao, Jiangtao Feng, Bo Li, Yisheng Lv, Xinglin Piao, BaoCai Yin
The key challenge of the text-to-traffic generation (TTG) task is how to associate text with the spatial structure of the road network and with traffic data in order to generate traffic situations.
no code implementations • 27 Feb 2024 • Jie Cheng, Gang Xiong, Xingyuan Dai, Qinghai Miao, Yisheng Lv, Fei-Yue Wang
Our experiments on robotic manipulation and locomotion tasks demonstrate that RIME significantly enhances the robustness of the current state-of-the-art preference-based reinforcement learning (PbRL) method.
1 code implementation • 23 Jan 2024 • Zhishuai Li, Yunhao Nie, Ziyue Li, Lei Bai, Yisheng Lv, Rui Zhao
Following a pre-training paradigm, we approach the Kriging task from a new, representation-based perspective: we first learn robust and general representations and then recover attributes from those representations.
1 code implementation • 27 Nov 2023 • Chengyang Zhang, Yong Zhang, Qitan Shao, Bo Li, Yisheng Lv, Xinglin Piao, BaoCai Yin
The key challenge of the text-to-traffic generation (TTG) task is how to associate text with the spatial structure of the road network and with traffic data in order to generate traffic situations.
no code implementations • 11 Oct 2023 • Jingxiang Qu, Ryan Wen Liu, Chenjie Zhao, Yu Guo, Sendren Sheng-Dong Xu, Fenghua Zhu, Yisheng Lv
Accurate and efficient vessel draft reading (VDR) is an important component of intelligent maritime surveillance, and it can be used to help judge whether a vessel is normally loaded or overloaded.
1 code implementation • 8 Mar 2023 • Yahui Liu, Bin Tian, Yisheng Lv, Lingxi Li, Fei-Yue Wang
To overcome the limitation of local spatial attention, we propose a point content-based Transformer architecture, called PointConT for short.
Ranked #11 on 3D Point Cloud Classification on ScanObjectNN
2 code implementations • 22 Feb 2023 • Yu Guo, Ryan Wen Liu, Jingxiang Qu, Yuxu Lu, Fenghua Zhu, Yisheng Lv
To further improve vessel traffic surveillance, it becomes necessary to fuse the AIS and video data to simultaneously capture the visual features, identity and dynamic information for the vessels of interest.
1 code implementation • 30 Nov 2022 • Siqi Fan, Fenghua Zhu, Zunlei Feng, Yisheng Lv, Mingli Song, Fei-Yue Wang
Pseudo supervision is regarded as the core idea in semi-supervised learning for semantic segmentation, and there is always a tradeoff between utilizing only the high-quality pseudo labels and leveraging all the pseudo labels.
no code implementations • 9 Nov 2022 • Talha Azfar, Jinlong Li, Hongkai Yu, Ruey Long Cheu, Yisheng Lv, Ruimin Ke
This paper presents an extensive literature review of computer vision applications in intelligent transportation systems (ITS) and autonomous driving (AD), and discusses challenges related to data, models, and complex urban environments.
no code implementations • IEEE Transactions on Intelligent Transportation Systems 2022 • Junchen Jin, Dingding Rong, Tong Zhang, Qingyuan Ji, Haifeng Guo, Yisheng Lv, Xiaoliang Ma, Fei-Yue Wang
This paper proposes a short-term traffic speed prediction approach, called PL-WGAN, for urban road networks, which is considered an important part of a novel parallel learning framework for traffic control and operation.
1 code implementation • CVPR 2021 • Siqi Fan, Qiulei Dong, Fenghua Zhu, Yisheng Lv, Peijun Ye, Fei-Yue Wang
For each 3D point, the local polar representation block first constructs a spatial representation that is invariant to z-axis rotation; the dual-distance attentive pooling block then uses the representations of the point's neighbors to learn more discriminative local features according to both the geometric and feature distances among them; finally, the global contextual feature block learns a global context for each 3D point from its spatial location and the volume ratio of the neighborhood to the global point cloud.
Ranked #1 on Semantic Segmentation on Toronto-3D L002
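As a rough illustration of what a z-rotation-invariant local representation can look like, the sketch below encodes each neighbor by its distance, its elevation angle, and its azimuth measured relative to the direction from the center point to the neighborhood centroid. This is an assumption-laden sketch, not the paper's confirmed formulation: the function names `local_polar_representation` and `rotate_z`, and the choice of the centroid direction as the azimuth reference, are illustrative.

```python
import math

def local_polar_representation(center, neighbors):
    """Encode each neighbor as (distance, relative azimuth, elevation).

    The azimuth is measured relative to the direction from the center
    point to the neighborhood centroid (an illustrative assumption), so
    rotating the whole scene about the z-axis leaves the encoding unchanged.
    """
    cx, cy, cz = center
    k = len(neighbors)
    # Reference direction in the xy-plane: center -> neighborhood centroid.
    ref_x = sum(p[0] for p in neighbors) / k - cx
    ref_y = sum(p[1] for p in neighbors) / k - cy
    ref_azimuth = math.atan2(ref_y, ref_x)

    features = []
    for x, y, z in neighbors:
        dx, dy, dz = x - cx, y - cy, z - cz
        dist = math.sqrt(dx * dx + dy * dy + dz * dz)
        # Elevation is unaffected by rotation about the z-axis.
        elevation = math.atan2(dz, math.hypot(dx, dy))
        # Subtract the reference azimuth, then wrap back into [-pi, pi].
        rel = math.atan2(dy, dx) - ref_azimuth
        rel_azimuth = math.atan2(math.sin(rel), math.cos(rel))
        features.append((dist, rel_azimuth, elevation))
    return features

def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)
```

Rotating the center and all of its neighbors with `rotate_z` by the same angle and re-encoding them should reproduce the same features up to floating-point error, which is the invariance property the abstract refers to.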
1 code implementation • IEEE Transactions on Vehicular Technology 2021 • Siqi Fan, Fenghua Zhu, Shichao Chen, Hui Zhang, Bin Tian, Yisheng Lv, Fei-Yue Wang
Most successful object detectors are anchor-based, which makes them difficult to adapt to the diversity of traffic objects.