Search Results for author: Wenke Xia

Found 5 papers, 5 papers with code

Robust Cross-Modal Knowledge Distillation for Unconstrained Videos

1 code implementation • 16 Apr 2023 • Wenke Xia, Xingjian Li, Andong Deng, Haoyi Xiong, Dejing Dou, Di Hu

However, such semantic consistency from synchronization is hard to guarantee in unconstrained videos, due to irrelevant modality noise and differentiated semantic correlation.

Action Recognition · Audio Tagging · +3

Balanced Audiovisual Dataset for Imbalance Analysis

1 code implementation • 14 Feb 2023 • Wenke Xia, Xu Zhao, Xincheng Pang, Changqing Zhang, Di Hu

We surprisingly find that multimodal models with existing imbalance algorithms consistently perform worse than the unimodal one on specific subsets, in accordance with the modality bias.

Revisiting Pre-training in Audio-Visual Learning

1 code implementation • 7 Feb 2023 • Ruoxuan Feng, Wenke Xia, Di Hu

Specifically, we explore the effects of pre-trained models on two audio-visual learning scenarios: cross-modal initialization and multi-modal joint learning.

Audio-Visual Learning

TikTalk: A Video-Based Dialogue Dataset for Multi-Modal Chitchat in Real World

1 code implementation • 14 Jan 2023 • Hongpeng Lin, Ludan Ruan, Wenke Xia, Peiyu Liu, Jingyuan Wen, Yixin Xu, Di Hu, Ruihua Song, Wayne Xin Zhao, Qin Jin, Zhiwu Lu

Experimental results indicate that models incorporating large language models (LLMs) can generate more diverse responses, while the model that uses knowledge graphs to introduce external knowledge performs best overall.

Knowledge Graphs
