no code implementations • 2 Apr 2019 • Stavros Petridis, Yujiang Wang, Pingchuan Ma, Zuwei Li, Maja Pantic
In this work, we present an end-to-end visual speech recognition system based on fully-connected layers and Long Short-Term Memory (LSTM) networks, which is suitable for small-scale datasets.
no code implementations • 12 Sep 2017 • Stavros Petridis, Yujiang Wang, Zuwei Li, Maja Pantic
To the best of our knowledge, this is the first audiovisual fusion model that simultaneously learns to extract features directly from the pixels and spectrograms and to classify speech and nonlinguistic vocalisations.
no code implementations • 1 Sep 2017 • Stavros Petridis, Yujiang Wang, Zuwei Li, Maja Pantic
To the best of our knowledge, this is the first model that simultaneously learns to extract features directly from the pixels and performs visual speech classification from multiple views, while also achieving state-of-the-art performance.
no code implementations • 20 Jan 2017 • Stavros Petridis, Zuwei Li, Maja Pantic
Recently, several deep learning approaches have been presented which automatically extract features from the mouth images and aim to replace the feature extraction stage.