Search Results for author: Leigang Qu

Found 7 papers, 4 papers with code

Discriminative Probing and Tuning for Text-to-Image Generation

no code implementations • 7 Mar 2024 • Leigang Qu, Wenjie Wang, Yongqi Li, Hanwang Zhang, Liqiang Nie, Tat-Seng Chua

We present a discriminative adapter built on T2I models to probe their discriminative abilities on two representative tasks and leverage discriminative fine-tuning to improve their text-image alignment.

Text-to-Image Generation

Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond

no code implementations • 16 Feb 2024 • Yongqi Li, Wenjie Wang, Leigang Qu, Liqiang Nie, Wenjie Li, Tat-Seng Chua

Building upon this capability, we propose to enable multimodal large language models (MLLMs) to memorize and recall images within their parameters.

Cross-Modal Retrieval • Retrieval

NExT-GPT: Any-to-Any Multimodal LLM

1 code implementation • 11 Sep 2023 • Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-Seng Chua

While recently Multimodal Large Language Models (MM-LLMs) have made exciting strides, they mostly fall prey to the limitation of only input-side multimodal understanding, without the ability to produce content in multiple modalities.

LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation

no code implementations • 9 Aug 2023 • Leigang Qu, Shengqiong Wu, Hao Fei, Liqiang Nie, Tat-Seng Chua

Afterward, we propose a fine-grained object-interaction diffusion method to synthesize high-faithfulness images conditioned on the prompt and the automatically generated layout.

In-Context Learning • Text-to-Image Generation

Learnable Pillar-based Re-ranking for Image-Text Retrieval

1 code implementation • 25 Apr 2023 • Leigang Qu, Meng Liu, Wenjie Wang, Zhedong Zheng, Liqiang Nie, Tat-Seng Chua

Image-text retrieval aims to bridge the modality gap and retrieve cross-modal content based on semantic similarities.

Re-Ranking • Retrieval • +1

Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization

1 code implementation • 14 Nov 2022 • Yiyang Chen, Zhedong Zheng, Wei Ji, Leigang Qu, Tat-Seng Chua

The key idea underpinning the proposed method is to integrate fine- and coarse-grained retrieval as matching data points with small and large fluctuations, respectively.

Ranked #3 on Image Retrieval with Multi-Modal Query on Fashion200k

Composed Image Retrieval (CoIR) • Image Retrieval with Multi-Modal Query • +1

Dynamic Modality Interaction Modeling for Image-Text Retrieval

1 code implementation • ACM Special Interest Group on Information Retrieval 2021 • Leigang Qu, Meng Liu, Jianlong Wu, Zan Gao, Liqiang Nie

To address these issues, we develop a novel modality interaction modeling network based upon the routing mechanism, which is the first unified and dynamic multimodal interaction framework towards image-text retrieval.

Cross-Modal Retrieval • Information Retrieval • +2
