no code implementations • 27 Oct 2021 • Bowen Wu, Zhenyu Xie, Xiaodan Liang, Yubei Xiao, Haoye Dong, Liang Lin
The integration of human parsing and appearance flow effectively guides the generation of video frames with realistic appearance.
no code implementations • Findings (EMNLP) 2021 • Guolin Zheng, Yubei Xiao, Ke Gong, Pan Zhou, Xiaodan Liang, Liang Lin
Specifically, we unify a pre-trained acoustic model (wav2vec 2.0) and a language model (BERT) into an end-to-end trainable framework.
no code implementations • 22 Dec 2020 • Yubei Xiao, Ke Gong, Pan Zhou, Guolin Zheng, Xiaodan Liang, Liang Lin
When sampling tasks in MML-ASR, AMS adaptively determines the task sampling probability for each source language.
Automatic Speech Recognition (ASR)
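To illustrate what "adaptively determining the task sampling probability for each source language" can look like in practice, here is a minimal sketch. It is not the authors' AMS procedure, whose details this listing does not give; it substitutes a simple softmax over per-language scores. The function names, the `temperature` parameter, the language codes, and the loss values are all hypothetical and chosen only for illustration.

```python
import math
import random


def sampling_probabilities(scores, temperature=1.0):
    """Turn per-language scores into sampling probabilities with a softmax.

    Assumption: a higher score (e.g. a higher recent loss) means the
    language should be sampled more often. This is a stand-in heuristic,
    not the AMS method from the paper.
    """
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {lang: math.exp((s - top) / temperature) for lang, s in scores.items()}
    total = sum(exps.values())
    return {lang: e / total for lang, e in exps.items()}


def sample_language(probs, rng=random):
    """Draw one source language according to the given probabilities."""
    langs = list(probs)
    weights = [probs[lang] for lang in langs]
    return rng.choices(langs, weights=weights, k=1)[0]


# Hypothetical recent losses per source language (made-up values).
losses = {"bn": 2.0, "tl": 1.0, "zu": 1.0}
probs = sampling_probabilities(losses)
chosen = sample_language(probs, random.Random(0))
```

In a meta-learning loop, the scores would be refreshed after each batch of tasks so that the probabilities track how each source language is currently behaving; lowering `temperature` concentrates sampling on the highest-scoring languages.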