Search Results for author: Kyle Gorman

Found 22 papers, 3 papers with code

NeMo Inverse Text Normalization: From Development To Production

1 code implementation 11 Apr 2021 Yang Zhang, Evelina Bakhturina, Kyle Gorman, Boris Ginsburg

Inverse text normalization (ITN) converts spoken-domain automatic speech recognition (ASR) output into written-domain text to improve the readability of the ASR output.

Automatic Speech Recognition (ASR) +1

We Need to Talk about Standard Splits

1 code implementation ACL 2019 Kyle Gorman, Steven Bedrick

It is standard practice in speech & language technology to rank systems according to their performance on a test set held out for evaluation.

Minimally Supervised Written-to-Spoken Text Normalization

no code implementations 21 Sep 2016 Ke Wu, Kyle Gorman, Richard Sproat

In speech applications such as text-to-speech (TTS) or automatic speech recognition (ASR), text normalization refers to the task of converting from a written representation into a representation of how the text is to be spoken.

Automatic Speech Recognition (ASR) +1

Minimally Supervised Number Normalization

no code implementations TACL 2016 Kyle Gorman, Richard Sproat

We propose two models for verbalizing numbers, a key component in speech recognition and synthesis systems.

Speech Recognition +2

Target word prediction and paraphasia classification in spoken discourse

no code implementations WS 2017 Joel Adams, Steven Bedrick, Gerasimos Fergadiotis, Kyle Gorman, Jan van Santen

We present a system for automatically detecting and classifying phonologically anomalous productions in the speech of individuals with aphasia.

Classification General Classification +2

What Kind of Language Is Hard to Language-Model?

no code implementations ACL 2019 Sabrina J. Mielke, Ryan Cotterell, Kyle Gorman, Brian Roark, Jason Eisner

Trying to answer the question of what features difficult languages have in common, we try and fail to reproduce our earlier (Cotterell et al., 2018) observation about morphological complexity and instead reveal far simpler statistics of the data that seem to drive complexity in a much larger sample.

Language Modelling Sentence

Massively Multilingual Pronunciation Modeling with WikiPron

no code implementations LREC 2020 Jackson L. Lee, Lucas F.E. Ashby, M. Elizabeth Garza, Yeonju Lee-Sikka, Sean Miller, Alan Wong, Arya D. McCarthy, Kyle Gorman

We introduce WikiPron, an open-source command-line tool for extracting pronunciation data from Wiktionary, a collaborative multilingual online dictionary.

The SIGMORPHON 2020 Shared Task on Multilingual Grapheme-to-Phoneme Conversion

no code implementations WS 2020 Kyle Gorman, Lucas F.E. Ashby, Aaron Goyzueta, Arya McCarthy, Shijie Wu, Daniel You

We describe the design and findings of the SIGMORPHON 2020 shared task on multilingual grapheme-to-phoneme conversion.

Neural Models of Text Normalization for Speech Applications

no code implementations CL 2019 Hao Zhang, Richard Sproat, Axel H. Ng, Felix Stahlberg, Xiaochang Peng, Kyle Gorman, Brian Roark

One problem that has been somewhat resistant to effective machine learning solutions is text normalization for speech applications such as text-to-speech synthesis (TTS).

BIG-bench Machine Learning Speech Synthesis +1

Is the Best Better? Bayesian Statistical Model Comparison for Natural Language Processing

no code implementations EMNLP 2020 Piotr Szymański, Kyle Gorman

Recent work raises concerns about the use of standard splits to compare natural language processing models.

Structured abbreviation expansion in context

no code implementations Findings (EMNLP) 2021 Kyle Gorman, Christo Kirov, Brian Roark, Richard Sproat

Ad hoc abbreviations are commonly found in informal communication channels that favor shorter messages.

Spelling Correction

Group-matching algorithms for subjects and items

no code implementations 9 Oct 2021 Géza Kiss, Kyle Gorman, Jan P. H. van Santen

We consider the problem of constructing matched groups such that the resulting groups are statistically similar with respect to their average values for multiple covariates.

A* shortest string decoding for non-idempotent semirings

no code implementations 14 Apr 2022 Kyle Gorman, Cyril Allauzen

We describe an algorithm which finds the shortest string for a weighted non-deterministic automaton over non-idempotent semirings, using the backwards shortest distance of an equivalent deterministic automaton (DFA) as a heuristic for A* search performed over a companion idempotent semiring; the search is proven to return the shortest string.

UniMorph 4.0: Universal Morphology

no code implementations LREC 2022 Khuyagbaatar Batsuren, Omer Goldman, Salam Khalifa, Nizar Habash, Witold Kieraś, Gábor Bella, Brian Leonard, Garrett Nicolai, Kyle Gorman, Yustinus Ghanggo Ate, Maria Ryskina, Sabrina J. Mielke, Elena Budianskaya, Charbel El-Khaissi, Tiago Pimentel, Michael Gasser, William Lane, Mohit Raj, Matt Coler, Jaime Rafael Montoya Samame, Delio Siticonatzi Camaiteri, Benoît Sagot, Esaú Zumaeta Rojas, Didier López Francis, Arturo Oncevay, Juan López Bautista, Gema Celeste Silva Villegas, Lucas Torroba Hennigen, Adam Ek, David Guriel, Peter Dirix, Jean-Philippe Bernardy, Andrey Scherbakov, Aziyana Bayyr-ool, Antonios Anastasopoulos, Roberto Zariquiey, Karina Sheifer, Sofya Ganieva, Hilaria Cruz, Ritván Karahóǧa, Stella Markantonatou, George Pavlidis, Matvey Plugaryov, Elena Klyachko, Ali Salehi, Candy Angulo, Jatayu Baxi, Andrew Krizhanovsky, Natalia Krizhanovskaya, Elizabeth Salesky, Clara Vania, Sardana Ivanova, Jennifer White, Rowan Hall Maudslay, Josef Valvoda, Ran Zmigrod, Paula Czarnowska, Irene Nikkarinen, Aelita Salchak, Brijesh Bhatt, Christopher Straughn, Zoey Liu, Jonathan North Washington, Yuval Pinter, Duygu Ataman, Marcin Wolinski, Totok Suhardijanto, Anna Yablonskaya, Niklas Stoehr, Hossep Dolatian, Zahroh Nuriah, Shyam Ratan, Francis M. Tyers, Edoardo M. Ponti, Grant Aiton, Aryaman Arora, Richard J. Hatcher, Ritesh Kumar, Jeremiah Young, Daria Rodionova, Anastasia Yemelina, Taras Andrushko, Igor Marchenko, Polina Mashkovtseva, Alexandra Serova, Emily Prud'hommeaux, Maria Nepomniashchaya, Fausto Giunchiglia, Eleanor Chodroff, Mans Hulden, Miikka Silfverberg, Arya D. McCarthy, David Yarowsky, Ryan Cotterell, Reut Tsarfaty, Ekaterina Vylomova

The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data in diverse languages realizing that schema.

Morphological Inflection
