1 code implementation • 19 Sep 2023 • Yuan Tseng, Layne Berry, Yi-Ting Chen, I-Hsiang Chiu, Hsuan-Hao Lin, Max Liu, Puyuan Peng, Yi-Jen Shih, Hung-Yu Wang, Haibin Wu, Po-Yao Huang, Chun-Mao Lai, Shang-Wen Li, David Harwath, Yu Tsao, Shinji Watanabe, Abdelrahman Mohamed, Chi-Luen Feng, Hung-Yi Lee
Audio-visual representation learning aims to develop systems with human-like perception by utilizing correlation between auditory and visual information.
1 code implementation • 13 Oct 2022 • Guan-Ting Lin, Chi-Luen Feng, Wei-Ping Huang, Yuan Tseng, Tzu-Han Lin, Chen-An Li, Hung-Yi Lee, Nigel G. Ward
We find that 13 of the 15 SSL models outperformed the baseline on all the prosody-related tasks.
no code implementations • 8 May 2022 • Chi-Luen Feng, Po-chun Hsu, Hung-Yi Lee
We found that HuBERT stores speaker information in representations whose positions correspond to silences in a waveform.