Search Results for author: Theresa Breiner

Found 6 papers, 1 paper with code

Mining Large-Scale Low-Resource Pronunciation Data From Wikipedia

no code implementations • 27 Jan 2021 • Tania Chakraborty, Manasa Prasad, Theresa Breiner, Sandy Ritchie, Daan van Esch

Pronunciation modeling is a key task for building speech technology in new languages, and while solid grapheme-to-phoneme (G2P) mapping systems exist, language coverage can stand to be improved.

Language ID in the Wild: Unexpected Challenges on the Path to a Thousand-Language Web Text Corpus

1 code implementation • COLING 2020 • Isaac Caswell, Theresa Breiner, Daan van Esch, Ankur Bapna

Large text corpora are increasingly important for a wide variety of Natural Language Processing (NLP) tasks, and automatic language identification (LangID) is a core technology needed to collect such datasets in a multilingual context.

Language Identification

Writing Across the World's Languages: Deep Internationalization for Gboard, the Google Keyboard

no code implementations • 3 Dec 2019 • Daan van Esch, Elnaz Sarbar, Tamar Lucassen, Jeremy O'Brien, Theresa Breiner, Manasa Prasad, Evan Crew, Chieu Nguyen, Françoise Beaufays

Today, Gboard supports 900+ language varieties across 70+ writing systems, and this report describes how and why we have been adding support for hundreds of language varieties from around the globe.

Automatic Keyboard Layout Design for Low-Resource Latin-Script Languages

no code implementations • 18 Jan 2019 • Theresa Breiner, Chieu Nguyen, Daan van Esch, Jeremy O'Brien

For many speakers, one of the barriers in accessing and creating text content on the web is the absence of input tools for their language.
