no code implementations • 24 Jul 2022 • Oiwi Parker Jones, Brendan Shillingford
In contrast to the older writing system of the 19th century, modern Hawaiian orthography employs characters for long vowels and glottal stops.
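The restoration task can be illustrated with a toy example: one plausible way to build character-level training pairs is to simulate 19th-century spelling by stripping the ʻokina (glottal stop) and kahakō (vowel macrons) from modern Hawaiian text. A minimal sketch in Python; the example words are illustrative and not drawn from the paper's corpus.

```python
import unicodedata

# Modern Hawaiian orthography marks glottal stops with the ʻokina and
# long vowels with the kahakō (macron); 19th-century texts omitted both.
OKINA = "\u02bb"  # ʻ

def to_old_orthography(modern: str) -> str:
    """Simulate older Hawaiian spelling: drop the ʻokina, strip macrons."""
    no_okina = modern.replace(OKINA, "")
    # NFD-decompose so combining macrons (U+0304) can be removed.
    decomposed = unicodedata.normalize("NFD", no_okina)
    stripped = "".join(c for c in decomposed if c != "\u0304")
    return unicodedata.normalize("NFC", stripped)

# Illustrative (modern, old) pairs for training a character-level restorer.
modern_words = ["Hawaiʻi", "kāne", "ʻāina"]
for modern in modern_words:
    print(f"{to_old_orthography(modern)!r} -> {modern!r}")
```

The restoration model then learns the inverse mapping, from the stripped spelling back to the fully marked modern orthography.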
2 code implementations • Nature 2022 • Yannis Assael, Thea Sommerschield, Brendan Shillingford, Mahyar Bordbar, John Pavlopoulos, Marita Chatzipanagiotou, Ion Androutsopoulos, Jonathan Prag, Nando de Freitas
Ithaca can attribute inscriptions to their original location with an accuracy of 71% and can date them to within 30 years of their ground-truth ranges, redating key texts of Classical Athens and contributing to topical debates in ancient history.
Ranked #1 on Ancient Text Restoration on I.PHI
no code implementations • CVPR 2022 • Michael Hassid, Michelle Tadmor Ramanovich, Brendan Shillingford, Miaosen Wang, Ye Jia, Tal Remez
In this paper we present VDTTS, a Visually-Driven Text-to-Speech model.
no code implementations • 1 Jul 2021 • Brendan Shillingford, Yannis Assael, Misha Denil
This work describes an interactive decoding method to improve the performance of visual speech recognition systems using user input to compensate for the inherent ambiguity of the task.
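One way to realize such interactive decoding is to present a recognizer's n-best hypotheses and let the user disambiguate visually similar utterances (e.g. homophenes such as "pat"/"bat"/"mat"). The sketch below is only an illustrative re-ranking step with made-up scores, assuming hypothesis strings and scores are already available from a VSR decoder; it is not the paper's exact mechanism.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    text: str
    log_prob: float  # decoder score for this transcript

def interactive_select(hypotheses, user_prefix: str) -> Hypothesis:
    """Pick the best hypothesis consistent with a user-supplied prefix.

    The user resolves ambiguity by typing the first few characters of
    what was actually said; otherwise we fall back to the top hypothesis.
    """
    consistent = [h for h in hypotheses if h.text.startswith(user_prefix)]
    pool = consistent or hypotheses
    return max(pool, key=lambda h: h.log_prob)

# Toy n-best list from a visual speech recognizer (scores are made up).
nbest = [
    Hypothesis("pat the dog", -1.2),
    Hypothesis("bat the dog", -1.3),
    Hypothesis("mat the dog", -2.0),
]
print(interactive_select(nbest, user_prefix="b").text)  # "bat the dog"
```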
no code implementations • 6 Nov 2020 • Yi Yang, Brendan Shillingford, Yannis Assael, Miaosen Wang, Wendi Liu, Yutian Chen, Yu Zhang, Eren Sezener, Luis C. Cobo, Misha Denil, Yusuf Aytar, Nando de Freitas
The visual content is translated by synthesizing lip movements for the speaker to match the translated audio, creating a seamless audiovisual experience in the target language.
1 code implementation • 8 Nov 2019 • Takaki Makino, Hank Liao, Yannis Assael, Brendan Shillingford, Basilio Garcia, Otavio Braga, Olivier Siohan
This work presents a large-scale audio-visual speech recognition system based on a recurrent neural network transducer (RNN-T) architecture.
Ranked #5 on Audio-Visual Speech Recognition on LRS3-TED (using extra training data)
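An RNN-T factors into an encoder over the input features, a prediction network over previously emitted labels, and a joint network that combines the two. The sketch below shows that overall shape in PyTorch with per-frame concatenation of audio and visual features as a simple fusion choice; the dimensions and fusion strategy are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class AVTransducer(nn.Module):
    """Skeleton of an RNN-T over fused audio-visual features (illustrative)."""

    def __init__(self, audio_dim=80, video_dim=512, hidden=256, vocab=64):
        super().__init__()
        # Encoder consumes audio and video features concatenated per frame.
        self.encoder = nn.LSTM(audio_dim + video_dim, hidden, batch_first=True)
        # Prediction network conditions on previously emitted labels.
        self.embed = nn.Embedding(vocab, hidden)
        self.predictor = nn.LSTM(hidden, hidden, batch_first=True)
        # Joint network produces per-(frame, label-step) output logits.
        self.joint = nn.Sequential(nn.Tanh(), nn.Linear(hidden, vocab))

    def forward(self, audio, video, labels):
        fused = torch.cat([audio, video], dim=-1)        # (B, T, A+V)
        enc, _ = self.encoder(fused)                     # (B, T, H)
        pred, _ = self.predictor(self.embed(labels))     # (B, U, H)
        joint_in = enc.unsqueeze(2) + pred.unsqueeze(1)  # (B, T, U, H)
        return self.joint(joint_in)                      # (B, T, U, vocab)

# Shape check with dummy inputs.
model = AVTransducer()
logits = model(torch.randn(2, 50, 80), torch.randn(2, 50, 512),
               torch.randint(0, 64, (2, 10)))
print(logits.shape)  # torch.Size([2, 50, 10, 64])
```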
1 code implementation • ACL 2020 • Oana-Maria Camburu, Brendan Shillingford, Pasquale Minervini, Thomas Lukasiewicz, Phil Blunsom
To increase trust in artificial intelligence systems, a promising research direction consists of designing neural models capable of generating natural language explanations for their predictions.
no code implementations • 5 Jul 2019 • Archit Gupta, Brendan Shillingford, Yannis Assael, Thomas C. Walters
This paper proposes an approach where a communication node can instead extend the bandwidth of a band-limited incoming speech signal that may have been passed through a low-rate codec.
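As a toy illustration of the setup only (not the paper's model), one can upsample narrowband speech to the wideband sampling rate and train a small network to predict the missing high-frequency content as a residual. The convolutional model below is a placeholder.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyBWE(nn.Module):
    """Toy bandwidth-extension data flow; placeholder model, not the paper's."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(32, 1, kernel_size=9, padding=4),
        )

    def forward(self, narrowband_8k):
        # Naive upsampling from 8 kHz to 16 kHz, then predict a residual
        # intended to carry the missing high-frequency content.
        up = F.interpolate(narrowband_8k, scale_factor=2, mode="linear",
                           align_corners=False)
        return up + self.net(up)

wideband = ToyBWE()(torch.randn(1, 1, 8000))  # 1 s at 8 kHz in
print(wideband.shape)                         # (1, 1, 16000): 1 s at 16 kHz out
```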
no code implementations • EMNLP 2018 • Brendan Shillingford, Oiwi Parker Jones
In contrast to the older writing system of the 19th century, modern Hawaiian orthography employs characters for long vowels and glottal stops.
no code implementations • ICLR 2019 • Yutian Chen, Yannis Assael, Brendan Shillingford, David Budden, Scott Reed, Heiga Zen, Quan Wang, Luis C. Cobo, Andrew Trask, Ben Laurie, Caglar Gulcehre, Aäron van den Oord, Oriol Vinyals, Nando de Freitas
Instead, the aim is to produce a network that requires only a small amount of data at deployment time to rapidly adapt to new speakers.
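One way to realize this kind of rapid adaptation is to keep the pretrained synthesis network frozen and fit only a low-dimensional speaker embedding on the few available utterances. A hedged sketch with a placeholder model and loss; the architecture and sizes are stand-ins, not the paper's system.

```python
import torch
import torch.nn as nn

class FrozenTTS(nn.Module):
    """Placeholder pretrained TTS core: (text features, speaker emb) -> mels."""

    def __init__(self, text_dim=32, spk_dim=16, out_dim=80):
        super().__init__()
        self.proj = nn.Linear(text_dim + spk_dim, out_dim)

    def forward(self, text_feats, spk_emb):
        spk = spk_emb.expand(text_feats.size(0), text_feats.size(1), -1)
        return self.proj(torch.cat([text_feats, spk], dim=-1))

tts = FrozenTTS()
for p in tts.parameters():          # freeze the pretrained network
    p.requires_grad_(False)

# Few-shot adaptation: optimise only a new speaker embedding.
spk_emb = nn.Parameter(torch.zeros(1, 1, 16))
opt = torch.optim.Adam([spk_emb], lr=1e-2)

# Dummy data standing in for a few utterances of the new speaker.
text_feats = torch.randn(4, 100, 32)   # (utterances, frames, text features)
target_mel = torch.randn(4, 100, 80)   # (utterances, frames, mel bins)

for step in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(tts(text_feats, spk_emb), target_mel)
    loss.backward()
    opt.step()
```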
no code implementations • ICLR 2019 • Brendan Shillingford, Yannis Assael, Matthew W. Hoffman, Thomas Paine, Cían Hughes, Utsav Prabhu, Hank Liao, Hasim Sak, Kanishka Rao, Lorrayne Bennett, Marie Mulville, Ben Coppin, Ben Laurie, Andrew Senior, Nando de Freitas
To achieve this, we constructed the largest existing visual speech recognition dataset, consisting of pairs of text and video clips of faces speaking (3,886 hours of video).
Ranked #10 on Lipreading on LRS3-TED (using extra training data)
no code implementations • NeurIPS 2017 • Rui Ponte Costa, Yannis M. Assael, Brendan Shillingford, Nando de Freitas, Tim P. Vogels
Cortical circuits exhibit intricate recurrent architectures that are remarkably similar across different brain areas.
12 code implementations • 5 Nov 2016 • Yannis M. Assael, Brendan Shillingford, Shimon Whiteson, Nando de Freitas
Lipreading is the task of decoding text from the movement of a speaker's mouth.
Ranked #5 on Lipreading on GRID corpus (mixed-speech)
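LipNet's end-to-end recipe combines spatiotemporal convolutions over the mouth-region video, recurrent layers, and a CTC loss so that no frame-level alignment is needed. The sketch below shows that overall shape in PyTorch with illustrative layer sizes, not the published hyperparameters.

```python
import torch
import torch.nn as nn

class LipReader(nn.Module):
    """Spatiotemporal conv + GRU + CTC skeleton (illustrative sizes)."""

    def __init__(self, vocab=28):  # e.g. 26 letters + space + CTC blank
        super().__init__()
        self.conv = nn.Sequential(
            # 3D convolutions span short windows of frames, capturing motion.
            nn.Conv3d(3, 32, kernel_size=(3, 5, 5), padding=(1, 2, 2)),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),
            nn.Conv3d(32, 64, kernel_size=(3, 5, 5), padding=(1, 2, 2)),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),
        )
        self.gru = nn.GRU(64, 128, num_layers=2, bidirectional=True,
                          batch_first=True)
        self.out = nn.Linear(256, vocab)

    def forward(self, video):                    # (B, 3, T, H, W)
        feats = self.conv(video)                 # (B, C, T, H', W')
        feats = feats.mean(dim=(3, 4))           # pool space -> (B, C, T)
        feats = feats.transpose(1, 2)            # (B, T, C)
        hidden, _ = self.gru(feats)              # (B, T, 256)
        return self.out(hidden).log_softmax(-1)  # per-frame char log-probs

# CTC training step with dummy data: no frame-level alignment required.
model, ctc = LipReader(), nn.CTCLoss(blank=0)
video = torch.randn(2, 3, 75, 50, 100)           # 75 frames of mouth crops
logp = model(video).transpose(0, 1)              # CTC wants (T, B, vocab)
targets = torch.randint(1, 28, (2, 20))
loss = ctc(logp, targets, torch.tensor([75, 75]), torch.tensor([20, 20]))
print(float(loss))
```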
8 code implementations • NeurIPS 2016 • Marcin Andrychowicz, Misha Denil, Sergio Gomez, Matthew W. Hoffman, David Pfau, Tom Schaul, Brendan Shillingford, Nando de Freitas
The move from hand-designed features to learned features in machine learning has been wildly successful.
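The paper applies the same shift to optimization itself: instead of a hand-designed update rule, a small recurrent network looks at the optimizee's gradients and proposes parameter updates. A minimal sketch of that loop; the coordinate-wise LSTM details, gradient preprocessing, and the meta-training objective are omitted, and the names and sizes here are illustrative.

```python
import torch
import torch.nn as nn

class LearnedOptimizer(nn.Module):
    """Tiny recurrent optimizer: maps a gradient to a parameter update."""

    def __init__(self, hidden=20):
        super().__init__()
        self.rnn = nn.LSTMCell(1, hidden)   # applied coordinate-wise
        self.out = nn.Linear(hidden, 1)

    def forward(self, grad, state):
        # Treat each parameter coordinate as an independent batch element.
        g = grad.reshape(-1, 1)
        h, c = self.rnn(g, state)
        update = self.out(h).reshape(grad.shape)
        return update, (h, c)

# Optimizee: a toy quadratic f(theta) = ||W theta - y||^2.
W, y = torch.randn(10, 5), torch.randn(10)
theta = torch.randn(5, requires_grad=True)
opt_net = LearnedOptimizer()
state = (torch.zeros(5, 20), torch.zeros(5, 20))

for step in range(20):
    loss = ((W @ theta - y) ** 2).sum()
    (grad,) = torch.autograd.grad(loss, theta)
    update, state = opt_net(grad, state)
    # Apply the proposed update; meta-training opt_net over many such
    # unrolled trajectories is what the paper's method actually learns.
    theta = (theta + update).detach().requires_grad_(True)
```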