Search Results for author: Xiao-Yong Wei

Found 10 papers, 5 papers with code

A Picture Is Worth a Graph: Blueprint Debate on Graph for Multimodal Reasoning

no code implementations • 22 Mar 2024 • Changmeng Zheng, Dayong Liang, WengYu Zhang, Xiao-Yong Wei, Tat-Seng Chua, Qing Li

The study addresses two key challenges: the trivialization of opinions resulting from excessive summarization and the diversion of focus caused by distractor concepts introduced from images.

Multimodal Reasoning

Generative Active Learning for Image Synthesis Personalization

1 code implementation • 22 Mar 2024 • Xulu Zhang, WengYu Zhang, Xiao-Yong Wei, Jinlin Wu, Zhaoxiang Zhang, Zhen Lei, Qing Li

The primary challenge in conducting active learning on generative models lies in the open-ended nature of querying, which differs from the closed form of querying in discriminative models that typically target a single concept.

Active Learning • Image Generation

Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue

no code implementations • 10 Feb 2024 • Jian Wang, Chak Tou Leong, Jiashuo Wang, Dongding Lin, Wenjie Li, Xiao-Yong Wei

Tuning pretrained language models for dialogue generation has been a prevalent paradigm for building capable dialogue agents.

Dialogue Generation

Compositional Inversion for Stable Diffusion Models

1 code implementation • 13 Dec 2023 • Xulu Zhang, Xiao-Yong Wei, Jinlin Wu, Tianyi Zhang, Zhaoxiang Zhang, Zhen Lei, Qing Li

It stems from the fact that during inversion, the irrelevant semantics in the user images are also encoded, forcing the inverted concepts to occupy locations far from the core distribution in the embedding space.

Untargeted Black-box Attacks for Social Recommendations

no code implementations • 13 Nov 2023 • Wenqi Fan, Shijie Wang, Xiao-Yong Wei, Xiaowei Mei, Qing Li

To perform untargeted attacks on social recommender systems, attackers can construct malicious social relationships for fake users to enhance the attack performance.

Decision Making • Multi-agent Reinforcement Learning • +1

Empowering Molecule Discovery for Molecule-Caption Translation with Large Language Models: A ChatGPT Perspective

1 code implementation • 11 Jun 2023 • Jiatong Li, Yunqing Liu, Wenqi Fan, Xiao-Yong Wei, Hui Liu, Jiliang Tang, Qing Li

In this work, we propose a novel LLM-based framework (MolReGPT) for molecule-caption translation, which introduces an In-Context Few-Shot Molecule Learning paradigm so that LLMs like ChatGPT can apply their in-context learning capability to molecule discovery without domain-specific pre-training or fine-tuning.

Ranked #3 on Text-based de novo Molecule Generation on ChEBI-20

Molecule Captioning • Natural Language Understanding • +2

Large-scale Multi-Modal Pre-trained Models: A Comprehensive Survey

1 code implementation • 20 Feb 2023 • Xiao Wang, Guangyao Chen, Guangwu Qian, Pengcheng Gao, Xiao-Yong Wei, YaoWei Wang, Yonghong Tian, Wen Gao

We also provide visualizations and analyses of the model parameters and of the results on representative downstream tasks.

Conceptor Learning for Class Activation Mapping

no code implementations • 21 Jan 2022 • Guangwu Qian, Zhen-Qun Yang, Xu-Lu Zhang, YaoWei Wang, Qing Li, Xiao-Yong Wei

Class Activation Mapping (CAM) has been widely adopted to generate saliency maps, which provide visual explanations for deep neural networks (DNNs).

Relation

Attention on Attention for Image Captioning

5 code implementations • ICCV 2019 • Lun Huang, Wenmin Wang, Jie Chen, Xiao-Yong Wei

In this paper, we propose an Attention on Attention (AoA) module, which extends the conventional attention mechanisms to determine the relevance between attention results and queries.

Image Captioning

ParNet: Position-aware Aggregated Relation Network for Image-Text matching

no code implementations • 17 Jun 2019 • Yaxian Xia, Lun Huang, Xiao-Yong Wei, Wenmin Wang

The first step, which we call the intra-modal relation mechanism, computes responses between different objects in an image or between different words in a sentence separately; the second step, which we call the inter-modal relation mechanism, uses the query as textual context to refine the relationships among object proposals in an image.

Image-text matching • Position • +5
