1 code implementation • 14 Mar 2024 • Ruixiang Jiang, Lingbo Liu, Changwen Chen
Building upon this disentanglement, we introduce the mixture of prompt experts (MoPE) technique to enhance expressiveness.
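The core idea of a mixture of prompt experts can be sketched as follows: a router, conditioned on an instance feature, produces weights over K learnable prompt experts, which are blended into one instance-specific prompt. This is a minimal illustrative sketch, not the paper's actual MoPE implementation; all names, shapes, and the numpy formulation are assumptions.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def mope_prompt(cond_feat, experts, router_w):
    """Illustrative mixture-of-prompt-experts: route on a conditioning
    feature, then blend K learnable prompt experts into one
    instance-wise prompt (shapes and names are assumptions)."""
    # cond_feat: (d,), experts: (K, L, d), router_w: (d, K)
    gate = softmax(cond_feat @ router_w)          # (K,) expert weights
    return np.einsum("k,kld->ld", gate, experts)  # (L, d) blended prompt

rng = np.random.default_rng(0)
d, K, L = 8, 4, 3
prompt = mope_prompt(rng.normal(size=d),
                     rng.normal(size=(K, L, d)),
                     rng.normal(size=(d, K)))
print(prompt.shape)  # (3, 8)
```

Because the gate is a convex combination, each instance receives a prompt interpolated among the experts rather than a single shared prompt, which is what the expressiveness claim rests on.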
no code implementations • 7 Dec 2023 • Zuyao Chen, Jinlin Wu, Zhen Lei, Zhaoxiang Zhang, Changwen Chen
Learning scene graphs from natural language descriptions has proven to be a cheap and promising scheme for Scene Graph Generation (SGG).
1 code implementation • 28 Nov 2023 • Ruixiang Jiang, Lingbo Liu, Changwen Chen
We show that the representation of one modality can effectively guide the prompting of another modality for parameter-efficient multimodal fusion.
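One minimal way to realize "one modality guides the prompting of another" is to project a source-modality feature into a few prompt tokens and prepend them to the target modality's token sequence. The sketch below is an assumption-laden illustration (projection matrix, token counts, and function names are all hypothetical), not the paper's method.

```python
import numpy as np

def prompt_from_modality(src_feat, proj, num_prompts, tgt_tokens):
    """Project a source-modality feature into prompt tokens and
    prepend them to the target modality's sequence (illustrative)."""
    # src_feat: (d_src,), proj: (d_src, num_prompts * d_tgt),
    # tgt_tokens: (N, d_tgt)
    d_tgt = tgt_tokens.shape[1]
    prompts = (src_feat @ proj).reshape(num_prompts, d_tgt)
    return np.concatenate([prompts, tgt_tokens], axis=0)

rng = np.random.default_rng(1)
seq = prompt_from_modality(rng.normal(size=16),
                           rng.normal(size=(16, 2 * 8)),
                           2,
                           rng.normal(size=(5, 8)))
print(seq.shape)  # (7, 8): 2 cross-modal prompts + 5 original tokens
```

Only the small projection is trained, which is what makes this style of fusion parameter-efficient relative to fine-tuning either encoder.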
no code implementations • 18 Nov 2023 • Zuyao Chen, Jinlin Wu, Zhen Lei, Zhaoxiang Zhang, Changwen Chen
For the more challenging settings of relation-involved open vocabulary SGG, the proposed approach integrates relation-aware pre-training utilizing image-caption data and retains visual-concept alignment through knowledge distillation.
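Retaining visual-concept alignment through knowledge distillation typically amounts to a temperature-scaled KL term that pulls the student's concept predictions toward a frozen teacher. The snippet below is a generic distillation-loss sketch under that assumption; it is not the paper's exact objective.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def distill_loss(student_logits, teacher_logits, T=2.0):
    """Temperature-scaled KL divergence pulling student concept
    predictions toward a frozen teacher (generic KD, illustrative)."""
    p = softmax(teacher_logits / T)                 # teacher distribution
    log_q = np.log(softmax(student_logits / T))     # student log-probs
    kl = (p * (np.log(p) - log_q)).sum(axis=-1).mean()
    return float(kl * T * T)                        # standard T^2 scaling

logits = np.array([[2.0, 0.5, -1.0]])
print(distill_loss(logits, logits))  # 0.0 when student matches teacher
```

The KL term is zero when the student reproduces the teacher exactly and grows as its concept distribution drifts, which is how alignment is preserved while the model is trained on new relation supervision.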
1 code implementation • 12 May 2023 • Ruixiang Jiang, Lingbo Liu, Changwen Chen
Specifically, we propose CLIP-Count, the first end-to-end pipeline that estimates density maps for open-vocabulary objects with text guidance in a zero-shot manner.
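The zero-shot, text-guided part of such a pipeline can be illustrated with a patch-text similarity map: cosine similarity between each image-patch embedding and the text embedding of the target class gives a coarse spatial map that a decoder can refine into a density map. This is a conceptual sketch only; the feature names and the direct use of similarity as a density proxy are assumptions, not CLIP-Count's architecture.

```python
import numpy as np

def text_guided_map(patch_feats, text_feat):
    """Cosine similarity between each patch embedding and a class
    text embedding: a coarse localization map (illustrative)."""
    # patch_feats: (num_patches, d), text_feat: (d,)
    p = patch_feats / np.linalg.norm(patch_feats, axis=-1, keepdims=True)
    t = text_feat / np.linalg.norm(text_feat)
    return p @ t  # (num_patches,) values in [-1, 1]

rng = np.random.default_rng(2)
sim = text_guided_map(rng.normal(size=(49, 32)), rng.normal(size=32))
print(sim.shape)  # (49,), e.g. a 7x7 patch grid
```

Because the class is specified purely by text, the same pipeline counts arbitrary, unseen object categories without per-class training.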
Ranked #3 on Cross-Part Crowd Counting on ShanghaiTech A
no code implementations • 8 Oct 2022 • Tao Yang, Haokui Zhang, Wenze Hu, Changwen Chen, Xiaoyu Wang
Transformer models have made tremendous progress in various fields in recent years.
no code implementations • 26 May 2022 • Peipei Zhu, Xiao Wang, Lin Zhu, Zhenglong Sun, Weishi Zheng, YaoWei Wang, Changwen Chen
Inspired by the success of Vision-Language Pre-Trained Models (VL-PTMs), we attempt to infer cross-domain cue information about a given image from large VL-PTMs for the unpaired image captioning (UIC) task.
no code implementations • 7 Mar 2022 • Peipei Zhu, Xiao Wang, Yong Luo, Zhenglong Sun, Wei-Shi Zheng, YaoWei Wang, Changwen Chen
The image-level labels are used to train a weakly-supervised object recognition model that extracts object information (e.g., instances) from an image, and the extracted instances are then used to infer the relationships among different objects via an enhanced graph neural network (GNN).
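Inferring relationships among detected objects with a GNN boils down to message passing: each object node aggregates its neighbours' features before relation classification. The layer below is a minimal generic sketch, assuming a mean-aggregation scheme; it is not the paper's "enhanced" GNN, whose specifics are not given here.

```python
import numpy as np

def gnn_layer(node_feats, adj, w_self, w_nbr):
    """One round of message passing over object nodes: mean-aggregate
    neighbour features, combine with self features, ReLU (illustrative)."""
    # node_feats: (N, d), adj: (N, N) binary, w_self/w_nbr: (d, h)
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)  # avoid divide-by-zero
    msg = (adj @ node_feats) / deg                    # mean neighbour message
    return np.maximum(node_feats @ w_self + msg @ w_nbr, 0.0)

rng = np.random.default_rng(3)
N, d, h = 4, 6, 5
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 0],
                [0, 1, 0, 0]], dtype=float)
out = gnn_layer(rng.normal(size=(N, d)), adj,
                rng.normal(size=(d, h)), rng.normal(size=(d, h)))
print(out.shape)  # (4, 5)
```

Stacking such layers lets relation evidence propagate beyond immediate object pairs, after which pairwise node features can be scored against relation labels.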