1 code implementation • 17 Dec 2024 • Wenbin An, Haonan Lin, Jiahao Nie, Feng Tian, Wenkai Shi, Yaqiang Wu, Qianying Wang, Ping Chen
The primary challenges stem from model bias induced by pre-training on only known categories and the lack of precise supervision for novel ones. These factors lead to category bias towards known categories and category confusion among different novel categories, which hinder models' ability to identify novel categories effectively.
no code implementations • 5 Aug 2024 • Yuxuan Lu, Jiahao Nie, Zhiwei He, Hongjie Gu, Xudong Lv
Current LiDAR point cloud-based 3D single object tracking (SOT) methods typically rely on point-based representation networks.
1 code implementation • 22 Jul 2024 • Wenbin An, Feng Tian, Jiahao Nie, Wenkai Shi, Haonan Lin, Yan Chen, Qianying Wang, Yaqiang Wu, Guang Dai, Ping Chen
Knowledge-based Visual Question Answering (KVQA) requires both image and world knowledge to answer questions.
1 code implementation • 7 Jul 2024 • Jiahao Nie, Fei Xie, Sifan Zhou, Xueyi Zhou, Dong-Kyu Chae, Zhiwei He
Moreover, under the same point-based representation, P2P-point outperforms the previous motion tracker M$^2$Track by \textbf{3.3\%} and \textbf{6.7\%} on KITTI and NuScenes, while running at a considerably high speed of \textbf{107 FPS} on a single RTX3090 GPU.
1 code implementation • 27 Jun 2024 • Yicheng Xu, Yuxin Chen, Jiahao Nie, Yusong Wang, Huiping Zhuang, Manabu Okumura
In this setting, a CL learner is required to incrementally learn from multiple domains and classify test images from both seen and unseen domains without any domain-identity hint.
1 code implementation • 18 Jun 2024 • Wenbin An, Feng Tian, Sicong Leng, Jiahao Nie, Haonan Lin, Qianying Wang, Guang Dai, Ping Chen, Shijian Lu
To this end, we propose Assembly of Global and Local Attention (AGLA), a training-free and plug-and-play approach that mitigates object hallucinations by simultaneously exploring an ensemble of global features for response generation and local features for visual discrimination.
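The abstract only sketches the idea of assembling a global and a local view of the visual features; a minimal, hypothetical PyTorch sketch of such an ensemble (the function name, tensor shapes, and fixed mixing weight `alpha` are illustrative assumptions, not the authors' implementation) might look like:

```python
import torch

def assemble_global_local(global_feats: torch.Tensor,
                          local_feats: torch.Tensor,
                          alpha: float = 0.5) -> torch.Tensor:
    """Fuse a global view of the image features with a locally
    re-weighted view before passing them to the language decoder."""
    # global_feats / local_feats: (num_tokens, dim) visual features
    # from the full image and from query-relevant local regions.
    return alpha * global_feats + (1.0 - alpha) * local_feats

# Hypothetical usage: combine the two feature streams for generation.
g = torch.randn(256, 1024)   # global image features (assumed shape)
l = torch.randn(256, 1024)   # locally re-weighted features (assumed shape)
fused = assemble_global_local(g, l)
print(fused.shape)           # torch.Size([256, 1024])
```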
1 code implementation • 13 Jun 2024 • Jiahao Nie, Gongjie Zhang, Wenbin An, Yap-Peng Tan, Alex C. Kot, Shijian Lu
Though Multi-modal Large Language Models (MLLMs) have recently achieved significant progress, they often face various problems while handling inter-object relations, i.e., the interaction or association among distinct objects.
1 code implementation • 15 May 2024 • Jiahao Nie, Shan Lin, Alex C. Kot
The primary color profile of the same identity is assumed to remain consistent in typical Person Re-identification (Person ReID) tasks.
no code implementations • 20 Jan 2024 • Jiahao Nie, Zhiwei He, Xudong Lv, Xueyi Zhou, Dong-Kyu Chae, Fei Xie
Based on this observation, we design a novel point set representation learning network built on a transformer architecture, termed AdaFormer, which adaptively encodes the dynamically varying shape and size information from cross-category data in a unified manner.
1 code implementation • CVPR 2024 • Jiahao Nie, Yun Xing, Gongjie Zhang, Pei Yan, Aoran Xiao, Yap-Peng Tan, Alex C. Kot, Shijian Lu
Cross-Domain Few-Shot Segmentation (CD-FSS) poses the challenge of segmenting novel categories from a distinct domain using only limited exemplars.
2 code implementations • NeurIPS 2023 • Yun Xing, Jian Kang, Aoran Xiao, Jiahao Nie, Ling Shao, Shijian Lu
Such semantic misalignment circulates in pre-training, leading to inferior zero-shot performance in dense predictions due to insufficient visual concepts captured in textual representations.
2 code implementations • 23 Apr 2023 • Jiahao Nie, Zhiwei He, Yuxiang Yang, Zhengyi Bao, Mingyu Gao, Jing Zhang
By integrating the derived classification scores with the center-ness scores, the resulting network can effectively suppress interference proposals and further mitigate task misalignment.
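The abstract does not specify how the classification and center-ness scores are integrated; one common and minimal way to combine them for proposal re-ranking is an element-wise product, shown below as an illustrative assumption rather than the paper's exact formulation:

```python
import torch

def rank_proposals(cls_scores: torch.Tensor,
                   centerness: torch.Tensor) -> torch.Tensor:
    """Combine classification and center-ness scores so a proposal must
    be both semantically correct and well-centred to rank highly."""
    # cls_scores, centerness: (num_proposals,) values in [0, 1]
    combined = cls_scores * centerness           # element-wise fusion
    return torch.argsort(combined, descending=True)

# Hypothetical usage: pick the best proposal for the tracked target.
cls = torch.tensor([0.9, 0.7, 0.8])
ctr = torch.tensor([0.3, 0.9, 0.8])
order = rank_proposals(cls, ctr)
print(order[0].item())  # index of the highest combined score
```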
1 code implementation • 1 Apr 2023 • Jiahao Nie, Zhiwei He, Yuxiang Yang, Xudong Lv, Mingyu Gao, Jing Zhang
Incorporating this transformer-based voting scheme into 3D RPN, a novel Siamese method dubbed GLT-T is developed for 3D single object tracking on point clouds.
2 code implementations • 20 Nov 2022 • Jiahao Nie, Zhiwei He, Yuxiang Yang, Mingyu Gao, Jing Zhang
Technically, a global-local transformer (GLT) module is employed to integrate object- and patch-aware priors into seed point features, forming strong feature representations of the seed points' geometric positions and thus providing more robust and accurate cues for offset learning.
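As a rough illustration of how object-aware (global) and patch-aware (local) priors could be injected into seed point features, the toy module below uses self-attention for the global context and an MLP over concatenated patch features for the local context; the class name, layer choices, and shapes are assumptions for exposition, not the GLT module as published:

```python
import torch
import torch.nn as nn

class GlobalLocalFusion(nn.Module):
    """Toy sketch: enrich per-seed features with an object-level (global)
    prior and a patch-level (local) prior before offset regression."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.global_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.local_mlp = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, seed_feats: torch.Tensor, patch_feats: torch.Tensor) -> torch.Tensor:
        # seed_feats:  (B, N, C) features of seed points
        # patch_feats: (B, N, C) features of local patches around each seed
        global_ctx, _ = self.global_attn(seed_feats, seed_feats, seed_feats)   # object-aware prior
        fused = self.local_mlp(torch.cat([global_ctx, patch_feats], dim=-1))   # patch-aware prior
        return seed_feats + fused  # residual enrichment used for offset learning

# Hypothetical usage with assumed batch/point/channel sizes.
m = GlobalLocalFusion(128)
out = m(torch.randn(2, 256, 128), torch.randn(2, 256, 128))
print(out.shape)  # torch.Size([2, 256, 128])
```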
no code implementations • 29 Apr 2022 • Jiahao Nie, Han Wu, Zhiwei He, Yuxiang Yang, Mingyu Gao, Zhekang Dong
In this paper, to alleviate this misalignment, we propose a novel tracking paradigm, called SiamLA.