Search Results for author: Longhui Wei

Found 36 papers, 18 papers with code

OVMR: Open-Vocabulary Recognition with Multi-Modal References

1 code implementation CVPR 2024 Zehong Ma, Shiliang Zhang, Longhui Wei, Qi Tian

Existing works have proposed different methods to embed category cues into the model, \eg, through few-shot fine-tuning, providing category names or textual descriptions to Vision-Language Models.

Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model

1 code implementation CVPR 2024 Zhicai Wang, Longhui Wei, Tan Wang, Heyu Chen, Yanbin Hao, Xiang Wang, Xiangnan He, Qi Tian

Text-to-image (T2I) generative models have recently emerged as a powerful tool, enabling the creation of photo-realistic images and giving rise to a multitude of applications.

Data Augmentation Image Classification

Incorporating Visual Experts to Resolve the Information Loss in Multimodal Large Language Models

no code implementations6 Jan 2024 Xin He, Longhui Wei, Lingxi Xie, Qi Tian

Multimodal Large Language Models (MLLMs) are experiencing rapid growth, yielding a plethora of noteworthy contributions in recent months.

Instruction Following

Boosting Segment Anything Model Towards Open-Vocabulary Learning

1 code implementation6 Dec 2023 Xumeng Han, Longhui Wei, Xuehui Yu, Zhiyang Dou, Xin He, Kuiran Wang, Zhenjun Han, Qi Tian

The recent Segment Anything Model (SAM) has emerged as a new paradigmatic vision foundation model, showcasing potent zero-shot generalization and flexible prompting.

Object Object Localization +2

HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data

1 code implementation CVPR 2024 Qifan Yu, Juncheng Li, Longhui Wei, Liang Pang, Wentao Ye, Bosheng Qin, Siliang Tang, Qi Tian, Yueting Zhuang

Multi-modal Large Language Models (MLLMs) tuned on machine-generated instruction-following data have demonstrated remarkable performance in various multi-modal understanding and generation tasks.

Attribute counterfactual +3

Degeneration-Tuning: Using Scrambled Grid shield Unwanted Concepts from Stable Diffusion

no code implementations2 Aug 2023 Zixuan Ni, Longhui Wei, Jiacheng Li, Siliang Tang, Yueting Zhuang, Qi Tian

In this work, we propose a novel strategy named \textbf{Degeneration-Tuning (DT)} to shield contents of unwanted concepts from SD weights.

Towards AGI in Computer Vision: Lessons Learned from GPT and Large Language Models

no code implementations14 Jun 2023 Lingxi Xie, Longhui Wei, Xiaopeng Zhang, Kaifeng Bi, Xiaotao Gu, Jianlong Chang, Qi Tian

In this paper, we start with a conceptual definition of AGI and briefly review how NLP solves a wide range of tasks via a chat system.

Continual Vision-Language Representation Learning with Off-Diagonal Information

no code implementations11 May 2023 Zixuan Ni, Longhui Wei, Siliang Tang, Yueting Zhuang, Qi Tian

Moreover, we empirically and theoretically demonstrate how SD leads to a performance decline for CLIP on cross-modal retrieval tasks.

Continual Learning Contrastive Learning +4

Learning Transferable Pedestrian Representation from Multimodal Information Supervision

1 code implementation12 Apr 2023 Liping Bao, Longhui Wei, Xiaoyu Qiu, Wengang Zhou, Houqiang Li, Qi Tian

Recent researches on unsupervised person re-identification~(reID) have demonstrated that pre-training on unlabeled person images achieves superior performance on downstream reID tasks than pre-training on ImageNet.

Attribute Contrastive Learning +3

Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models

no code implementations ICCV 2023 Juncheng Li, Minghe Gao, Longhui Wei, Siliang Tang, Wenqiao Zhang, Mengze Li, Wei Ji, Qi Tian, Tat-Seng Chua, Yueting Zhuang

Prompt tuning, a recently emerging paradigm, enables the powerful vision-language pre-training models to adapt to downstream tasks in a parameter -- and data -- efficient way, by learning the ``soft prompts'' to condition frozen pre-training models.

Domain Generalization Few-Shot Learning +1

Lformer: Text-to-Image Generation with L-shape Block Parallel Decoding

no code implementations7 Mar 2023 Jiacheng Li, Longhui Wei, Zongyuan Zhan, Xin He, Siliang Tang, Qi Tian, Yueting Zhuang

To better accelerate the generative transformers while keeping good generation quality, we propose Lformer, a semi-autoregressive text-to-image generation model.

Text-to-Image Generation

Dilated Context Integrated Network with Cross-Modal Consensus for Temporal Emotion Localization in Videos

1 code implementation3 Aug 2022 Juncheng Li, Junlin Xie, Linchao Zhu, Long Qian, Siliang Tang, Wenqiao Zhang, Haochen Shi, Shengyu Zhang, Longhui Wei, Qi Tian, Yueting Zhuang

In this paper, we introduce a new task, named Temporal Emotion Localization in videos~(TEL), which aims to detect human emotions and localize their corresponding temporal boundaries in untrimmed videos with aligned subtitles.

Emotion Classification Temporal Action Localization +1

DE-Net: Dynamic Text-guided Image Editing Adversarial Networks

1 code implementation2 Jun 2022 Ming Tao, Bing-Kun Bao, Hao Tang, Fei Wu, Longhui Wei, Qi Tian

To solve these limitations, we propose: (i) a Dynamic Editing Block (DEBlock) which composes different editing modules dynamically for various editing requirements.

text-guided-image-editing

MVP: Multimodality-guided Visual Pre-training

no code implementations10 Mar 2022 Longhui Wei, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian

Recently, masked image modeling (MIM) has become a promising direction for visual pre-training.

Language Modelling

Learning To Learn by Jointly Optimizing Neural Architecture and Weights

no code implementations CVPR 2022 Yadong Ding, Yu Wu, Chengyue Huang, Siliang Tang, Yi Yang, Longhui Wei, Yueting Zhuang, Qi Tian

Existing NAS-based meta-learning methods apply a two-stage strategy, i. e., first searching architectures and then re-training meta-weights on the searched architecture.

Meta-Learning

Revisiting Catastrophic Forgetting in Class Incremental Learning

no code implementations26 Jul 2021 Zixuan Ni, Haizhou Shi, Siliang Tang, Longhui Wei, Qi Tian, Yueting Zhuang

After investigating existing strategies, we observe that there is a lack of study on how to prevent the inter-phase confusion.

Class Incremental Learning Contrastive Learning +2

Exploring the Diversity and Invariance in Yourself for Visual Pre-Training Task

no code implementations1 Jun 2021 Longhui Wei, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian

By simply pulling the different augmented views of each image together or other novel mechanisms, they can learn much unsupervised knowledge and significantly improve the transfer performance of pre-training models.

Self-Supervised Learning

Visformer: The Vision-friendly Transformer

5 code implementations ICCV 2021 Zhengsu Chen, Lingxi Xie, Jianwei Niu, Xuefeng Liu, Longhui Wei, Qi Tian

The past year has witnessed the rapid development of applying the Transformer module to vision problems.

Image Classification

UnrealPerson: An Adaptive Pipeline towards Costless Person Re-identification

1 code implementation CVPR 2021 Tianyu Zhang, Lingxi Xie, Longhui Wei, Zijie Zhuang, Yongfei Zhang, Bo Li, Qi Tian

The main difficulty of person re-identification (ReID) lies in collecting annotated data and transferring the model across different domains.

Domain Adaptation Image Generation +1

Heterogeneous Contrastive Learning: Encoding Spatial Information for Compact Visual Representations

no code implementations19 Nov 2020 Xinyue Huo, Lingxi Xie, Longhui Wei, Xiaopeng Zhang, Hao Li, Zijie Yang, Wengang Zhou, Houqiang Li, Qi Tian

Contrastive learning has achieved great success in self-supervised visual representation learning, but existing approaches mostly ignored spatial information which is often crucial for visual representation.

Contrastive Learning Data Augmentation +1

GOLD-NAS: Gradual, One-Level, Differentiable

1 code implementation7 Jul 2020 Kaifeng Bi, Lingxi Xie, Xin Chen, Longhui Wei, Qi Tian

There has been a large literature of neural architecture search, but most existing work made use of heuristic rules that largely constrained the search flexibility.

Image Classification Neural Architecture Search

Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks

no code implementations17 Apr 2020 Xin Chen, Lingxi Xie, Jun Wu, Longhui Wei, Yuhui Xu, Qi Tian

We alleviate this issue by training a graph convolutional network to fit the performance of sampled sub-networks so that the impact of random errors becomes minimal.

Neural Architecture Search

Circumventing Outliers of AutoAugment with Knowledge Distillation

1 code implementation ECCV 2020 Longhui Wei, An Xiao, Lingxi Xie, Xin Chen, Xiaopeng Zhang, Qi Tian

AutoAugment has been a powerful algorithm that improves the accuracy of many vision tasks, yet it is sensitive to the operator space as well as hyper-parameters, and an improper setting may degenerate network optimization.

Data Augmentation General Classification +2

Stabilizing DARTS with Amended Gradient Estimation on Architectural Parameters

1 code implementation25 Oct 2019 Kaifeng Bi, Changping Hu, Lingxi Xie, Xin Chen, Longhui Wei, Qi Tian

Our approach bridges the gap from two aspects, namely, amending the estimation on the architectural gradients, and unifying the hyper-parameter settings in the search and re-training stages.

Neural Architecture Search

Single Camera Training for Person Re-identification

1 code implementation24 Sep 2019 Tianyu Zhang, Lingxi Xie, Longhui Wei, Yongfei Zhang, Bo Li, Qi Tian

Differently, this paper investigates ReID in an unexplored single-camera-training (SCT) setting, where each person in the training set appears in only one camera.

Metric Learning Person Re-Identification

Person Transfer GAN to Bridge Domain Gap for Person Re-Identification

24 code implementations CVPR 2018 Longhui Wei, Shiliang Zhang, Wen Gao, Qi Tian

Although the performance of person Re-Identification (ReID) has been significantly boosted, many challenging issues in real scenarios have not been fully investigated, e. g., the complex scenes and lighting variations, viewpoint and pose changes, and the large number of identities in a camera network.

Generative Adversarial Network Person Re-Identification +1

GLAD: Global-Local-Alignment Descriptor for Pedestrian Retrieval

no code implementations13 Sep 2017 Longhui Wei, Shiliang Zhang, Hantao Yao, Wen Gao, Qi Tian

Targeting to solve these problems, this work proposes a Global-Local-Alignment Descriptor (GLAD) and an efficient indexing and retrieval framework, respectively.

Person Re-Identification Representation Learning +1

Cannot find the paper you are looking for? You can Submit a new open access paper.