1 code implementation • 13 Dec 2023 • Shengsheng Qian, Yifei Wang, Dizhan Xue, Shengjie Zhang, Huaiwen Zhang, Changsheng Xu
After obtaining the threat model trained on the poisoned dataset, our method can precisely detect poisoned samples, based on the assumption that masking the backdoor trigger effectively changes the activations of a downstream clustering model.
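The masking-based detection idea can be sketched roughly as follows. This is a toy illustration only, not the paper's implementation: the feature extractor, the nearest-centroid stand-in for the downstream clustering model, the zero-masking, and the patch coordinates are all illustrative assumptions. A sample whose cluster assignment flips once the suspected trigger region is masked is flagged as poisoned.

```python
import numpy as np

def cluster_assign(features, centroids):
    # Nearest-centroid assignment: a stand-in for the downstream clustering model.
    d = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=-1)
    return d.argmin(axis=1)

def detect_poisoned(images, extract, centroids, mask_region):
    """Flag samples whose cluster assignment changes after masking the trigger region.

    `extract` maps a batch of images to feature vectors; `mask_region` is the
    (row0, row1, col0, col1) patch suspected to contain the backdoor trigger.
    Both are hypothetical placeholders for this sketch.
    """
    r0, r1, c0, c1 = mask_region
    masked = images.copy()
    masked[:, r0:r1, c0:c1] = 0.0                    # zero out the candidate trigger patch
    before = cluster_assign(extract(images), centroids)
    after = cluster_assign(extract(masked), centroids)
    return before != after                           # True -> likely poisoned
```

On a toy batch where one sample carries a bright corner patch, only that sample's assignment flips after masking, so only it is flagged.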
1 code implementation • 5 Sep 2023 • Dizhan Xue, Shengsheng Qian, Zuyi Zhou, Changsheng Xu
In recent years, cross-modal reasoning (CMR), the process of understanding and reasoning across different modalities, has emerged as a pivotal area with applications spanning from multimedia analysis to healthcare diagnostics.
no code implementations • WWW 2023 • Shengsheng Qian, Hong Chen, Dizhan Xue, Quan Fang, Changsheng Xu
To tackle these challenges, we propose an Open-World Social Event Classifier (OWSEC) model in this paper.
1 code implementation • ICCV 2023 • Dizhan Xue, Shengsheng Qian, Changsheng Xu
To address these issues, we propose a Variational Causal Inference Network (VCIN) that establishes the causal correlation between predicted answers and explanations, and captures cross-modal relationships to generate rational explanations.
Ranked #1 on Explanatory Visual Question Answering on GQA-REX
1 code implementation • ACM MM 2022 • Dizhan Xue, Shengsheng Qian, Quan Fang, Changsheng Xu
Finally, a multimodal transformer decoder constructs attention among multimodal features to learn the story dependency and generates informative, reasonable, and coherent story endings.
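The core of such a decoder, attention computed over features from both modalities, can be sketched in minimal form. This is a generic scaled dot-product attention over a concatenated text-and-image memory, assumed for illustration; it is not the paper's architecture, and the function and argument names are invented here.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(text_feats, image_feats):
    """Decoder-style attention: text tokens query a joint text+image memory.

    text_feats: (T, d) token features; image_feats: (V, d) visual features.
    Returns one fused (T, d) representation, letting each generated token
    attend to both modalities when predicting the story ending.
    """
    memory = np.concatenate([text_feats, image_feats], axis=0)  # fuse the two modalities
    d_k = text_feats.shape[-1]
    scores = text_feats @ memory.T / np.sqrt(d_k)               # scaled dot-product
    weights = softmax(scores, axis=-1)                          # attention over both modalities
    return weights @ memory                                     # fused per-token representation
```

A real decoder would add learned query/key/value projections, multiple heads, and residual layers; the sketch keeps only the attention step.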
Ranked #1 on Image-guided Story Ending Generation on LSMDC-E
1 code implementation • IEEE Transactions on Pattern Analysis and Machine Intelligence 2022 • Dizhan Xue, Shengsheng Qian, Quan Fang, Changsheng Xu
To date, most existing techniques convert multimodal data into a common representation space where semantic similarities between samples can be measured directly across modalities.
1 code implementation • IEEE Transactions on Multimedia 2021 • Shengsheng Qian, Dizhan Xue, Quan Fang, Changsheng Xu
Firstly, we construct an instance representation learning branch to transform instances of different modalities into a common representation space.
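The common-representation-space idea recurring in these abstracts can be sketched as a pair of linear projections followed by cosine similarity. This is a minimal generic sketch, assuming learned projection matrices `W_img` and `W_txt` (hypothetical names); it is not the branch architecture the paper actually trains.

```python
import numpy as np

def project(x, W):
    """Map modality-specific features into the shared space, then L2-normalize."""
    z = x @ W
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

def cross_modal_similarity(img_feats, txt_feats, W_img, W_txt):
    """Cosine similarity between images and texts, measured in the common space.

    img_feats: (N, d_img), txt_feats: (M, d_txt); W_img/W_txt project both
    modalities into the same shared dimension, so the (N, M) result compares
    semantics directly across modalities.
    """
    return project(img_feats, W_img) @ project(txt_feats, W_txt).T
```

Because both sides are unit-normalized, every entry of the result lies in [-1, 1] and can be ranked directly for cross-modal retrieval.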
1 code implementation • AAAI 2021 • Shengsheng Qian, Dizhan Xue, Huaiwen Zhang, Quan Fang, Changsheng Xu
To date, most existing methods transform multimodal data into a common representation space where semantic similarities between items can be directly measured across different modalities.