no code implementations • 2 Mar 2024 • Junwen Xiong, Peng Zhang, Tao You, Chuanyue Li, Wei Huang, Yufei Zha
Audio-visual saliency prediction can draw support from diverse modality complements, but further performance enhancement is still challenged by customized architectures as well as task-specific loss functions.
no code implementations • 15 Sep 2023 • Junwen Xiong, Peng Zhang, Chuanyue Li, Wei Huang, Yufei Zha, Tao You
While many approaches have crafted task-specific training paradigms for either video saliency prediction or video salient object detection, little attention has been devoted to devising a generalized saliency modeling framework that seamlessly bridges these two distinct tasks.
no code implementations • 8 Jul 2023 • Ganglai Wang, Peng Zhang, Junwen Xiong, Feihan Yang, Wei Huang, Yufei Zha
DeepFake-based digital facial forgery is threatening public media security, especially when lip manipulation is used in talking face generation, which further increases the difficulty of fake video detection.
no code implementations • 11 Mar 2023 • Junwen Xiong, Ganglai Wang, Peng Zhang, Wei Huang, Yufei Zha, Guangtao Zhai
Incorporating the audio stream enables Video Saliency Prediction (VSP) to imitate the selective attention mechanism of the human brain.
no code implementations • CVPR 2023 • Junwen Xiong, Ganglai Wang, Peng Zhang, Wei Huang, Yufei Zha, Guangtao Zhai
Incorporating the audio stream enables Video Saliency Prediction (VSP) to imitate the selective attention mechanism of the human brain.
no code implementations • 5 Mar 2022 • Junwen Xiong, Peng Zhang, Lei Xie, Wei Huang, Yufei Zha, Yanning Zhang
Multi-modal speech separation has exhibited a specific advantage in isolating the target character in multi-talker noisy environments.
1 code implementation • 4 Mar 2022 • Junwen Xiong, Yu Zhou, Peng Zhang, Lei Xie, Wei Huang, Yufei Zha
Active speaker detection and speech enhancement have become two increasingly attractive topics in audio-visual scenario understanding.