no code implementations • 22 Oct 2024 • Yuheng Lu, Bingshuo Qian, Caixia Yuan, Huixing Jiang, Xiaojie Wang
Large language models (LLMs) exhibit remarkable capabilities in natural language processing but face catastrophic forgetting when learning new tasks, where adaptation to a new domain leads to a substantial decline in performance on previous tasks.
1 code implementation • CVPR 2023 • Yuheng Lu, Chenfeng Xu, Xiaobao Wei, Xiaodong Xie, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang
In this paper, we address open-vocabulary 3D point-cloud detection with a divide-and-conquer strategy, which involves: 1) developing a point-cloud detector that can learn a general representation for localizing various objects, and 2) connecting textual and point-cloud representations to enable the detector to classify novel object categories based on text prompting.
1 code implementation • CVPR 2023 • Anthony Chen, Kevin Zhang, Renrui Zhang, Zihan Wang, Yuheng Lu, Yandong Guo, Shanghang Zhang
Masked Autoencoders learn strong visual representations and achieve state-of-the-art results in several independent modalities, yet very few works have addressed their capabilities in multi-modality settings.
no code implementations • 5 Jul 2022 • Yuheng Lu, Chenfeng Xu, Xiaobao Wei, Xiaodong Xie, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang
Current point-cloud detection methods have difficulty detecting open-vocabulary objects in the real world because of their limited generalization capability.
no code implementations • 22 Jan 2022 • Yi Hou, Chengyang Li, Yuheng Lu, Liping Zhu, Yuan Li, Huizhu Jia, Xiaodong Xie
In this article, we propose a simulated crowd counting dataset CrowdX, which has a large scale, accurate labeling, parameterized realization, and high fidelity.
no code implementations • 29 Sep 2021 • Yuheng Lu, Jinpeng Chen, Chuxiong Sun, Jie Hu
In this work, we propose a novel framework which follows the anchor-based idea and aims at conveying distance information implicitly along the MPNN message passing steps for encoding position information, node attributes, and graph structure in a more flexible way.
no code implementations • 21 Sep 2021 • Bojie Wang, Yuheng Lu
Specifically, our model uses the graph neural network framework, with its powerful representation capabilities, to represent the group-user-item interactions in the topological structure of the graph. At the same time, it analyzes the interaction pattern of the graph to adjust the feature output of the graph neural network. The resulting feature representations of groups and items are then used to calculate the group's preference for items.
no code implementations • 9 May 2021 • Yuheng Lu, Jinpeng Chen, Chuxiong Sun, Jie Hu
We show that GIRs achieve superior results in position-aware scenarios, and that the performance of typical GNNs can be improved by fusing GIR embeddings.
no code implementations • 3 Aug 2020 • Yuheng Lu, Fan Yang, Fangping Chen, Don Xie
Place recognition is one of the hot research fields in automation technology and is still an open issue. Camera and LiDAR are the two mainstream sensors used in this task: camera-based methods are easily affected by illumination and season changes, while LiDAR cannot capture data as rich as an image. In this paper, we propose PIC-Net (Point cloud and Image Collaboration Network), which uses an attention mechanism to fuse the features of the image and the point cloud and to mine the complementary information between the two.
Ranked #2 on Visual Place Recognition on Oxford RobotCar (LiDAR 4096 points+RGB) (recall@top1% metric)
no code implementations • ICCV 2019 • Wei Luo, Xitong Yang, Xianjie Mo, Yuheng Lu, Larry S. Davis, Jun Li, Jian Yang, Ser-Nam Lim
Recognizing objects from subcategories with very subtle differences remains a challenging task due to the large intra-class and small inter-class variation.
Ranked #10 on Fine-Grained Image Classification on CUB-200-2011
Fine-Grained Image Classification
Fine-Grained Visual Categorization