no code implementations • 9 Mar 2024 • Yudong Yang, Rongfeng Su, Xiaokang Liu, Nan Yan, Lan Wang
In this model, the inherent acoustic characteristics of individuals related to the tongue motion details are encoded by using wav2vec 2. 0, while the ASR transcriptions related to the universality of tongue motions are encoded by using BERT.