no code implementations • 4 Nov 2022 • Hsuan-Jui Chen, Yen Meng, Hung-Yi Lee
Sequence length along the time axis is often the dominant factor in the computational cost of speech processing.
no code implementations • 13 Oct 2022 • Yen Meng, Hsuan-Jui Chen, Jiatong Shi, Shinji Watanabe, Paola Garcia, Hung-Yi Lee, Hao Tang
Subsampling while training self-supervised models not only improves overall performance on downstream tasks at certain frame rates, but also yields a significant inference speed-up.
no code implementations • 15 Oct 2021 • Yen Meng, Yi-Hui Chou, Andy T. Liu, Hung-Yi Lee
Self-supervised speech models (S3Ms) have proven successful in many downstream speech tasks, such as automatic speech recognition (ASR).