Search Results for author: Wenxuan Hou

Found 1 paper, 1 paper with code

Progressive Spatio-temporal Perception for Audio-Visual Question Answering

1 code implementation • 10 Aug 2023 • Guangyao Li, Wenxuan Hou, Di Hu

Such naturally multi-modal videos are composed of rich and complex dynamic audio-visual components, most of which may be unrelated to the given question or may even interfere with answering the content of interest.

Audio-visual Question Answering, Audio-Visual Question Answering (AVQA), +2
