Search Results for author: Xiangyuan Lan

Found 24 papers, 15 papers with code

UP-Person: Unified Parameter-Efficient Transfer Learning for Text-based Person Retrieval

1 code implementation • 14 Apr 2025 • Yating Liu, Yaowei Li, Xiangyuan Lan, Wenming Yang, Zimo Liu, Qingmin Liao

Text-based Person Retrieval (TPR), a multi-modal task that aims to retrieve a target person from a pool of candidate images given a text description, has recently garnered considerable attention due to the progress of contrastive vision-language pre-trained models.

Person Retrieval • Retrieval +3

A Survey on Unlearnable Data

1 code implementation • 30 Mar 2025 • Jiahao Li, Yiqiang Chen, Yunbing Xing, Yang Gu, Xiangyuan Lan

Unlearnable data (ULD) has emerged as an innovative defense technique to prevent machine learning models from learning meaningful patterns from specific data, thus protecting data privacy and security.

Machine Unlearning • Survey

Limb-Aware Virtual Try-On Network with Progressive Clothing Warping

1 code implementation • 18 Mar 2025 • Shengping Zhang, Xiaoyu Han, Weigang Zhang, Xiangyuan Lan, Hongxun Yao, Qingming Huang

Finally, we introduce Limb-aware Texture Fusion (LTF), which focuses on generating realistic details in limb regions: a coarse try-on result is first generated by fusing the warped clothing image with the person image, and limb textures are then fused with this coarse result under limb-aware guidance to refine limb details.

Virtual Try-on

Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation

1 code implementation • 17 Mar 2025 • Songjun Tu, Jiahao Lin, Xiangyu Tian, Qichao Zhang, Linjing Li, Yuqian Fu, Nan Xu, wei he, Xiangyuan Lan, Dongmei Jiang, Dongbin Zhao

Recent advancements in post-training methodologies for large language models (LLMs) have highlighted reinforcement learning (RL) as a critical component for enhancing reasoning.

Mathematical Reasoning • Reinforcement Learning (RL)
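The iterative DPO pipeline investigated in this paper builds on the standard Direct Preference Optimization objective. As a rough illustration only (a generic sketch of the DPO per-pair loss, not the authors' implementation), the loss for one preference pair can be written as:

```python
import numpy as np

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair.

    logp_* are summed token log-probabilities of the chosen/rejected
    responses under the current policy; ref_logp_* are the same
    quantities under the frozen reference model.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log sigmoid(margin), written in a numerically stable form
    return np.logaddexp(0.0, -margin)
```

At zero margin the loss is log 2; it shrinks as the policy prefers the chosen response more strongly relative to the reference, which is the gradient signal the iterative variant repeatedly re-applies on fresh preference data.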

SelaVPR++: Towards Seamless Adaptation of Foundation Models for Efficient Place Recognition

1 code implementation • 23 Feb 2025 • Feng Lu, Tong Jin, Xiangyuan Lan, Lijun Zhang, Yunpeng Liu, YaoWei Wang, Chun Yuan

In our previous work, we proposed a novel method, SelaVPR, that realizes seamless adaptation of foundation models to VPR.

Deep Hashing • Re-Ranking +1

Towards Visual Grounding: A Survey

4 code implementations • 28 Dec 2024 • Linhui Xiao, Xiaoshan Yang, Xiangyuan Lan, YaoWei Wang, Changsheng Xu

Finally, we outline the challenges confronting visual grounding and propose valuable directions for future research, which may serve as inspiration for subsequent researchers.

Phrase Grounding • Referring Expression +3

Online Preference-based Reinforcement Learning with Self-augmented Feedback from Large Language Model

1 code implementation • 22 Dec 2024 • Songjun Tu, Jingbo Sun, Qichao Zhang, Xiangyuan Lan, Dongbin Zhao

RL-SaLLM-F leverages the reflective and discriminative capabilities of LLMs to generate self-augmented trajectories and provide preference labels for reward learning.

Language Modeling • Language Modelling +1

Transferable Adversarial Face Attack with Text Controlled Attribute

2 code implementations • 16 Dec 2024 • Wenyun Li, Zheng Zhang, Xiangyuan Lan, Dongmei Jiang

Extensive experiments on two high-resolution face recognition datasets validate that our TCA² method can generate natural, text-guided adversarial impersonation faces with high transferability.

Attribute • Face Recognition

AlignMamba: Enhancing Multimodal Mamba with Local and Global Cross-modal Alignment

no code implementations • 1 Dec 2024 • Yan Li, Yifei Xing, Xiangyuan Lan, Xin Li, Haifeng Chen, Dongmei Jiang

Extensive experiments on complete and incomplete multimodal fusion tasks demonstrate the effectiveness and efficiency of the proposed method.

cross-modal alignment • Mamba

EMMA: Empowering Multi-modal Mamba with Structural and Hierarchical Alignment

no code implementations • 8 Oct 2024 • Yifei Xing, Xiangyuan Lan, Ruiping Wang, Dongmei Jiang, Wenjun Huang, Qingfang Zheng, YaoWei Wang

In this work, we propose Empowering Multi-modal Mamba with Structural and Hierarchical Alignment (EMMA), which enables the MLLM to extract fine-grained visual information.

cross-modal alignment • Hallucination +1

OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion

1 code implementation • 10 Jul 2024 • Hao Wang, Pengzhen Ren, Zequn Jie, Xiao Dong, Chengjian Feng, Yinlong Qian, Lin Ma, Dongmei Jiang, YaoWei Wang, Xiangyuan Lan, Xiaodan Liang

To address these challenges, we propose a novel unified open-vocabulary detection method called OV-DINO, which is pre-trained on diverse large-scale datasets with language-aware selective fusion in a unified framework.

Ranked #5 on Zero-Shot Object Detection on MSCOCO (AP metric, using extra training data)

Zero-Shot Object Detection

Deep Homography Estimation for Visual Place Recognition

1 code implementation • 25 Feb 2024 • Feng Lu, Shuting Dong, Lijun Zhang, Bingxi Liu, Xiangyuan Lan, Dongmei Jiang, Chun Yuan

Moreover, we design an inlier re-projection error loss to train the DHE network without additional homography labels; this loss can also be jointly optimized with the backbone network to help it extract features better suited to local matching.

Homography Estimation • Re-Ranking +1
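The re-projection error in the snippet above is a standard quantity for scoring a homography against matched keypoints. A minimal sketch (assuming 2-D point correspondences and a 3×3 homography in pixel coordinates; this is an illustration, not the authors' DHE training code) could look like:

```python
import numpy as np

def reprojection_error(H, pts_src, pts_dst):
    """Mean Euclidean distance (pixels) between H-projected source
    points and their matched destination points.

    H: (3, 3) homography; pts_src, pts_dst: (N, 2) matched keypoints.
    """
    n = pts_src.shape[0]
    homog = np.hstack([pts_src, np.ones((n, 1))])  # to homogeneous coords
    proj = homog @ H.T
    proj = proj[:, :2] / proj[:, 2:3]              # back to Cartesian
    return np.linalg.norm(proj - pts_dst, axis=1).mean()
```

Minimizing this quantity over inlier correspondences gives a supervision signal that needs no ground-truth homography labels, which is the idea the abstract alludes to.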

Towards Seamless Adaptation of Pre-trained Models for Visual Place Recognition

1 code implementation • 22 Feb 2024 • Feng Lu, Lijun Zhang, Xiangyuan Lan, Shuting Dong, YaoWei Wang, Chun Yuan

Experimental results show that our method outperforms state-of-the-art methods with less training data and training time, and uses only about 3% of the retrieval runtime of two-stage VPR methods with RANSAC-based spatial verification.

Re-Ranking • Visual Place Recognition

Strip-MLP: Efficient Token Interaction for Vision MLP

1 code implementation • ICCV 2023 • Guiping Cao, Shengda Luo, Wenjian Huang, Xiangyuan Lan, Dongmei Jiang, YaoWei Wang, JianGuo Zhang

Finally, based on the Strip MLP layer, we propose a novel Local Strip Mixing Module (LSMM) to boost the token interaction power in the local region.

Revisiting Context Aggregation for Image Matting

1 code implementation • 3 Apr 2023 • Qinglin Liu, Xiaoqian Lv, Quanling Meng, Zonglin Li, Xiangyuan Lan, Shuo Yang, Shengping Zhang, Liqiang Nie

Furthermore, AEMatter leverages a large image training strategy to assist the network in learning context aggregation from data.

Decoder • Image Matting

Continuous Prediction of Lower-Limb Kinematics From Multi-Modal Biomedical Signals

no code implementations • 22 Mar 2021 • Chunzhi Yi, Feng Jiang, Shengping Zhang, Hao Guo, Chifu Yang, Zhen Ding, Baichun Wei, Xiangyuan Lan, Huiyu Zhou

Challenges for exoskeleton motor-intent decoding schemes remain in making continuous predictions that compensate for the hysteretic response caused by mechanical transmission.

Prediction

Take More Positives: An Empirical Study of Contrastive Learning in Unsupervised Person Re-Identification

no code implementations • 12 Jan 2021 • Xuanyu He, Wei zhang, Ran Song, Qian Zhang, Xiangyuan Lan, Lin Ma

By studying two unsupervised person re-ID methods in a cross-method way, we point out that the hard negative problem is handled implicitly by their designs of data augmentation and the PK sampler, respectively.

Contrastive Learning • Unsupervised Person Re-Identification
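The "take more positives" idea relates to contrastive objectives that average over several positives per anchor rather than using a single one. A generic multi-positive InfoNCE sketch (a hypothetical illustration with L2-normalized embeddings, not this paper's training code) is:

```python
import numpy as np

def multi_positive_infonce(anchor, positives, negatives, tau=0.07):
    """InfoNCE-style loss averaged over several positives for one anchor.

    anchor: (d,); positives: (P, d); negatives: (K, d).
    All embeddings are assumed L2-normalized.
    """
    pos_sim = positives @ anchor / tau  # (P,) similarities to positives
    neg_sim = negatives @ anchor / tau  # (K,) similarities to negatives
    # log-sum-exp over the full candidate set (positives + negatives)
    denom = np.logaddexp.reduce(np.concatenate([pos_sim, neg_sim]))
    # average the per-positive cross-entropy terms
    return np.mean(denom - pos_sim)
```

With more positives per identity (e.g. from a PK sampler that draws K instances of each of P identities), each anchor gets a denser supervision signal, at the cost of making hard negatives matter more.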

Regularized Fine-grained Meta Face Anti-spoofing

1 code implementation • 25 Nov 2019 • Rui Shao, Xiangyuan Lan, Pong C. Yuen

Besides, to further enhance the generalization ability of our model, the proposed framework adopts a fine-grained learning strategy that simultaneously conducts meta-learning in a variety of domain shift scenarios in each iteration.

Domain Generalization • Face Anti-Spoofing +2

Multi-Cue Visual Tracking Using Robust Feature-Level Fusion Based on Joint Sparse Representation

no code implementations • CVPR 2014 • Xiangyuan Lan, Andy J. Ma, Pong C. Yuen

The use of multiple features for tracking has proven effective because the limitations of each feature can be compensated for by the others.

Visual Tracking
