1 code implementation • 14 Apr 2025 • Yating Liu, Yaowei Li, Xiangyuan Lan, Wenming Yang, Zimo Liu, Qingmin Liao
Text-based Person Retrieval (TPR), a multi-modal task that aims to retrieve the target person from a pool of candidate images given a text description, has recently garnered considerable attention due to the progress of contrastive visual-language pre-trained models.
1 code implementation • 30 Mar 2025 • Jiahao Li, Yiqiang Chen, Yunbing Xing, Yang Gu, Xiangyuan Lan
Unlearnable data (ULD) has emerged as an innovative defense technique to prevent machine learning models from learning meaningful patterns from specific data, thus protecting data privacy and security.
1 code implementation • 18 Mar 2025 • Shengping Zhang, Xiaoyu Han, Weigang Zhang, Xiangyuan Lan, Hongxun Yao, Qingming Huang
Finally, we introduce Limb-aware Texture Fusion (LTF) that focuses on generating realistic details in limb regions, where a coarse try-on result is first generated by fusing the warped clothing image with the person image, then limb textures are further fused with the coarse result under limb-aware guidance to refine limb details.
1 code implementation • 17 Mar 2025 • Songjun Tu, Jiahao Lin, Xiangyu Tian, Qichao Zhang, Linjing Li, Yuqian Fu, Nan Xu, Wei He, Xiangyuan Lan, Dongmei Jiang, Dongbin Zhao
Recent advancements in post-training methodologies for large language models (LLMs) have highlighted reinforcement learning (RL) as a critical component for enhancing reasoning.
1 code implementation • 23 Feb 2025 • Feng Lu, Tong Jin, Xiangyuan Lan, Lijun Zhang, Yunpeng Liu, YaoWei Wang, Chun Yuan
In our previous work, we proposed a novel method to realize seamless adaptation of foundation models to VPR (SelaVPR).
4 code implementations • 28 Dec 2024 • Linhui Xiao, Xiaoshan Yang, Xiangyuan Lan, YaoWei Wang, Changsheng Xu
Finally, we outline the challenges confronting visual grounding and propose valuable directions for future research, which may serve as inspiration for subsequent researchers.
1 code implementation • 22 Dec 2024 • Songjun Tu, Jingbo Sun, Qichao Zhang, Xiangyuan Lan, Dongbin Zhao
RL-SaLLM-F leverages the reflective and discriminative capabilities of LLMs to generate self-augmented trajectories and provide preference labels for reward learning.
2 code implementations • 16 Dec 2024 • Wenyun Li, Zheng Zhang, Xiangyuan Lan, Dongmei Jiang
Extensive experiments on two high-resolution face recognition datasets validate that our TCA$^2$ method can generate natural text-guided adversarial impersonation faces with high transferability.
no code implementations • 1 Dec 2024 • Yan Li, Yifei Xing, Xiangyuan Lan, Xin Li, Haifeng Chen, Dongmei Jiang
Extensive experiments on complete and incomplete multimodal fusion tasks demonstrate the effectiveness and efficiency of the proposed method.
no code implementations • 20 Nov 2024 • Kuiran Wang, Xuehui Yu, Wenwen Yu, Guorong Li, Xiangyuan Lan, Qixiang Ye, Jianbin Jiao, Zhenjun Han
The bounding box will be used as the input to single object trackers.
no code implementations • 8 Oct 2024 • Yifei Xing, Xiangyuan Lan, Ruiping Wang, Dongmei Jiang, Wenjun Huang, Qingfang Zheng, YaoWei Wang
In this work, we propose Empowering Multi-modal Mamba with Structural and Hierarchical Alignment (EMMA), which enables the MLLM to extract fine-grained visual information.
1 code implementation • 10 Jul 2024 • Hao Wang, Pengzhen Ren, Zequn Jie, Xiao Dong, Chengjian Feng, Yinlong Qian, Lin Ma, Dongmei Jiang, YaoWei Wang, Xiangyuan Lan, Xiaodan Liang
To address these challenges, we propose a novel unified open-vocabulary detection method called OV-DINO, which is pre-trained on diverse large-scale datasets with language-aware selective fusion in a unified framework.
Ranked #5 on Zero-Shot Object Detection on MSCOCO (AP metric, using extra training data)
1 code implementation • CVPR 2024 • Feng Lu, Xiangyuan Lan, Lijun Zhang, Dongmei Jiang, YaoWei Wang, Chun Yuan
Over the past decade, most methods in visual place recognition (VPR) have used neural networks to produce feature representations.
1 code implementation • 25 Feb 2024 • Feng Lu, Shuting Dong, Lijun Zhang, Bingxi Liu, Xiangyuan Lan, Dongmei Jiang, Chun Yuan
Moreover, we design a re-projection error of inliers loss to train the DHE network without additional homography labels; the network can also be jointly trained with the backbone network to help it extract features that are more suitable for local matching.
1 code implementation • 22 Feb 2024 • Feng Lu, Lijun Zhang, Xiangyuan Lan, Shuting Dong, YaoWei Wang, Chun Yuan
Experimental results show that our method outperforms the state-of-the-art methods with less training data and training time, and uses only about 3% of the retrieval runtime of the two-stage VPR methods with RANSAC-based spatial verification.
Ranked #2 on Visual Place Recognition on Pittsburgh-250k-test
1 code implementation • ICCV 2023 • Guiping Cao, Shengda Luo, Wenjian Huang, Xiangyuan Lan, Dongmei Jiang, YaoWei Wang, JianGuo Zhang
Finally, based on the Strip MLP layer, we propose a novel Local Strip Mixing Module (LSMM) to boost the token interaction power in the local region.
1 code implementation • 3 Apr 2023 • Qinglin Liu, Xiaoqian Lv, Quanling Meng, Zonglin Li, Xiangyuan Lan, Shuo Yang, Shengping Zhang, Liqiang Nie
Furthermore, AEMatter leverages a large image training strategy to assist the network in learning context aggregation from data.
Ranked #1 on Image Matting on Composition-1K
no code implementations • 22 Mar 2021 • Chunzhi Yi, Feng Jiang, Shengping Zhang, Hao Guo, Chifu Yang, Zhen Ding, Baichun Wei, Xiangyuan Lan, Huiyu Zhou
Challenges remain for exoskeleton motor intent decoding schemes in making continuous predictions that compensate for the hysteretic response caused by mechanical transmission.
no code implementations • 12 Jan 2021 • Xuanyu He, Wei Zhang, Ran Song, Qian Zhang, Xiangyuan Lan, Lin Ma
By studying two unsupervised person re-ID methods in a cross-method way, we point out that a hard negative problem is handled implicitly by their designs of data augmentation and PK sampler, respectively.
1 code implementation • 25 Nov 2019 • Rui Shao, Xiangyuan Lan, Pong C. Yuen
Besides, to further enhance the generalization ability of our model, the proposed framework adopts a fine-grained learning strategy that simultaneously conducts meta-learning in a variety of domain shift scenarios in each iteration.
no code implementations • 6 Jun 2019 • Zheheng Jiang, Zhihua Liu, Long Chen, Lei Tong, Xiangrong Zhang, Xiangyuan Lan, Danny Crookes, Ming-Hsuan Yang, Huiyu Zhou
The study of mouse social behaviours has been increasingly undertaken in neuroscience research.
no code implementations • ECCV 2018 • Mang Ye, Xiangyuan Lan, Pong C. Yuen
After that, a robust and efficient top-k counts label prediction strategy is proposed to predict the labels of unlabeled image sequences.
Ranked #11 on Person Re-Identification on PRID2011
Representation Learning • Video-Based Person Re-Identification
no code implementations • ECCV 2018 • Si-Qi Liu, Xiangyuan Lan, Pong C. Yuen
3D mask face presentation attack, as a new challenge in face recognition, has been attracting increasing attention.
no code implementations • CVPR 2014 • Xiangyuan Lan, Andy J. Ma, Pong C. Yuen
The use of multiple features for tracking has proved to be an effective approach because the limitations of each feature can be compensated for.