1 code implementation • 13 Dec 2023 • Shengsheng Qian, Yifei Wang, Dizhan Xue, Shengjie Zhang, Huaiwen Zhang, Changsheng Xu
After obtaining the threat model trained on the poisoned dataset, our method can precisely detect poisoned samples, based on the assumption that masking the backdoor trigger effectively changes the activations of a downstream clustering model.
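The masking-based detection idea can be sketched roughly as follows. This is a toy illustration only, not the paper's implementation: the feature extractor, the nearest-centroid stand-in for the downstream clustering model, the zero-masking, and the patch coordinates are all illustrative assumptions. A sample whose cluster assignment flips once the suspected trigger region is masked is flagged as poisoned.

```python
import numpy as np

def cluster_assign(features, centroids):
    # Nearest-centroid assignment: a stand-in for the downstream clustering model.
    d = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=-1)
    return d.argmin(axis=1)

def detect_poisoned(images, extract, centroids, mask_region):
    """Flag samples whose cluster assignment changes after masking the trigger region.

    `extract` maps a batch of images to feature vectors; `mask_region` is the
    (row0, row1, col0, col1) patch suspected to contain the backdoor trigger.
    Both are hypothetical placeholders for this sketch.
    """
    r0, r1, c0, c1 = mask_region
    masked = images.copy()
    masked[:, r0:r1, c0:c1] = 0.0                    # zero out the candidate trigger patch
    before = cluster_assign(extract(images), centroids)
    after = cluster_assign(extract(masked), centroids)
    return before != after                           # True -> likely poisoned
```

On a toy batch where one sample carries a bright corner patch, only that sample's assignment flips after masking, so only it is flagged.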
1 code implementation • 5 Sep 2023 • Dizhan Xue, Shengsheng Qian, Zuyi Zhou, Changsheng Xu
In recent years, cross-modal reasoning (CMR), the process of understanding and reasoning across different modalities, has emerged as a pivotal area with applications spanning from multimedia analysis to healthcare diagnostics.
no code implementations • WWW 2023 • Shengsheng Qian, Hong Chen, Dizhan Xue, Quan Fang, Changsheng Xu
To tackle these challenges, we propose an Open-World Social Event Classifier (OWSEC) model in this paper.
1 code implementation • ICCV 2023 • Dizhan Xue, Shengsheng Qian, Changsheng Xu
To address these issues, we propose a Variational Causal Inference Network (VCIN) that establishes the causal correlation between predicted answers and explanations, and captures cross-modal relationships to generate rational explanations.
Ranked #1 on Explanatory Visual Question Answering on GQA-REX
1 code implementation • ACM MM 2022 • Dizhan Xue, Shengsheng Qian, Quan Fang, Changsheng Xu
Finally, a multimodal transformer decoder constructs attention among multimodal features to learn the story dependency and generates informative, reasonable, and coherent story endings.
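The core of such a decoder, attention computed over features from both modalities, can be sketched in minimal form. This is a generic scaled dot-product attention over a concatenated text-and-image memory, assumed for illustration; it is not the paper's architecture, and the function and argument names are invented here.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(text_feats, image_feats):
    """Decoder-style attention: text tokens query a joint text+image memory.

    text_feats: (T, d) token features; image_feats: (V, d) visual features.
    Returns one fused (T, d) representation, letting each generated token
    attend to both modalities when predicting the story ending.
    """
    memory = np.concatenate([text_feats, image_feats], axis=0)  # fuse the two modalities
    d_k = text_feats.shape[-1]
    scores = text_feats @ memory.T / np.sqrt(d_k)               # scaled dot-product
    weights = softmax(scores, axis=-1)                          # attention over both modalities
    return weights @ memory                                     # fused per-token representation
```

A real decoder would add learned query/key/value projections, multiple heads, and residual layers; the sketch keeps only the attention step.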
Ranked #1 on Image-guided Story Ending Generation on LSMDC-E
1 code implementation • IEEE Transactions on Pattern Analysis and Machine Intelligence 2022 • Dizhan Xue, Shengsheng Qian, Quan Fang, Changsheng Xu
To date, most existing techniques convert multimodal data into a common representation space where semantic similarities between samples can be measured directly across modalities.
1 code implementation • IEEE Transactions on Multimedia 2021 • Shengsheng Qian, Dizhan Xue, Quan Fang, Changsheng Xu
Firstly, we construct an instance representation learning branch to transform instances of different modalities into a common representation space.
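The common-representation-space idea recurring in these abstracts can be sketched as a pair of linear projections followed by cosine similarity. This is a minimal generic sketch, assuming learned projection matrices `W_img` and `W_txt` (hypothetical names); it is not the branch architecture the paper actually trains.

```python
import numpy as np

def project(x, W):
    """Map modality-specific features into the shared space, then L2-normalize."""
    z = x @ W
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

def cross_modal_similarity(img_feats, txt_feats, W_img, W_txt):
    """Cosine similarity between images and texts, measured in the common space.

    img_feats: (N, d_img), txt_feats: (M, d_txt); W_img/W_txt project both
    modalities into the same shared dimension, so the (N, M) result compares
    semantics directly across modalities.
    """
    return project(img_feats, W_img) @ project(txt_feats, W_txt).T
```

Because both sides are unit-normalized, every entry of the result lies in [-1, 1] and can be ranked directly for cross-modal retrieval.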
1 code implementation • AAAI 2021 • Shengsheng Qian, Dizhan Xue, Huaiwen Zhang, Quan Fang, Changsheng Xu
To date, most existing methods transform multimodal data into a common representation space where semantic similarities between items can be directly measured across different modalities.