1 code implementation • CVPR 2023 • Junhua Liao, Haihan Duan, Kanghui Feng, Wanbing Zhao, Yanbing Yang, Liangyin Chen
Experimental results on the AVA-ActiveSpeaker dataset show that our framework achieves competitive mAP performance (94.1% vs. 94.2%), while the resource costs are significantly lower than the state-of-the-art method, especially in model parameters (1.0M vs. 22.5M, about 23x) and FLOPs (0.6G vs. 2.6G, about 4x).
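The reduction factors quoted above follow directly from the reported figures. A minimal sketch of that arithmetic, where the variable and function names are illustrative assumptions and only the numbers come from the abstract:

```python
# Figures quoted in the abstract (proposed framework vs. state-of-the-art method).
# All names below are hypothetical; only the numeric values come from the text.
params_m = {"proposed": 1.0, "sota": 22.5}   # model parameters, in millions
flops_g = {"proposed": 0.6, "sota": 2.6}     # FLOPs, in billions
map_pct = {"proposed": 94.1, "sota": 94.2}   # mAP on AVA-ActiveSpeaker, in percent


def reduction_factor(baseline: float, ours: float) -> float:
    """Return how many times smaller `ours` is than `baseline`."""
    return baseline / ours


param_ratio = reduction_factor(params_m["sota"], params_m["proposed"])
flop_ratio = reduction_factor(flops_g["sota"], flops_g["proposed"])
map_gap = map_pct["sota"] - map_pct["proposed"]  # in percentage points

print(f"parameters: {param_ratio:.1f}x fewer")
print(f"FLOPs: {flop_ratio:.1f}x fewer")
print(f"mAP gap: {map_gap:.1f} points")
```

The exact ratios are 22.5 for parameters and roughly 4.3 for FLOPs, which the abstract rounds to "about 23x" and "about 4x".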
Audio-Visual Active Speaker Detection, speaker-diarization