no code implementations • 12 Dec 2023 • Peiwen Sun, Yifan Zhang, Zishan Liu, Donghao Chen, Honggang Zhang
Vanilla fusion methods still dominate a large share of mainstream audio-visual tasks.
no code implementations • 9 Sep 2022 • Peiwen Sun, Shanshan Zhang, Zishan Liu, Yougen Yuan, Taotao Zhang, Honggang Zhang, Pengfei Hu
It has already been observed that audio-visual embeddings are more robust than single-modality embeddings for person verification.