no code implementations • 10 Jul 2023 • Adriana Fernandez-Lopez, Honglie Chen, Pingchuan Ma, Alexandros Haliassos, Stavros Petridis, Maja Pantic
We evaluate our 50% sparse model on 7 different visual noise types and achieve an overall absolute improvement of more than 2% WER compared to the dense equivalent.
1 code implementation • 25 Mar 2023 • Pingchuan Ma, Alexandros Haliassos, Adriana Fernandez-Lopez, Honglie Chen, Stavros Petridis, Maja Pantic
Recently, the performance of automatic, visual, and audio-visual speech recognition (ASR, VSR, and AV-ASR, respectively) has been substantially improved, mainly due to the use of larger models and training sets.
Ranked #1 on Automatic Speech Recognition (ASR) on LRS3-TED
Audio-Visual Speech Recognition • Automatic Speech Recognition • +4
no code implementations • 26 Apr 2017 • Adriana Fernandez-Lopez, Oriol Martinez, Federico M. Sukno
On one hand, researchers have reported that the mapping between phonemes and visemes (visual units) is one-to-many because some phonemes are visually similar and cannot be distinguished from one another.
Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +3
no code implementations • 26 Apr 2017 • Adriana Fernandez-Lopez, Federico M. Sukno
Our results indicate that we are able to recognize approximately 58% of the visemes, 47% of the phonemes, and 23% of the words in a continuous speech scenario, and that the optimal viseme vocabulary for Spanish is composed of 20 visemes.
Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +3