Search Results for author: Arya D. McCarthy

Found 28 papers, 6 papers with code

Findings of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering

no code implementations ACL (SIGMORPHON) 2021 Adam Wiemerslage, Arya D. McCarthy, Alexander Erdmann, Garrett Nicolai, Manex Agirrezabal, Miikka Silfverberg, Mans Hulden, Katharina Kann

We describe the second SIGMORPHON shared task on unsupervised morphology: the goal of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering is to cluster word types from a raw text corpus into paradigms.

Characterizing News Portrayal of Civil Unrest in Hong Kong, 1998–2020

no code implementations ACL (CASE) 2021 James Scharf, Arya D. McCarthy, Giovanna Maria Dora Dore

We apply statistical techniques from natural language processing to a collection of Western and Hong Kong–based English-language newspaper articles spanning the years 1998–2020, studying the difference and evolution of its portrayal.

Jump-Starting Item Parameters for Adaptive Language Tests

no code implementations EMNLP 2021 Arya D. McCarthy, Kevin P. Yancey, Geoff T. LaFlair, Jesse Egbert, Manqian Liao, Burr Settles

A challenge in designing high-stakes language assessments is calibrating the test item difficulties, either a priori or from limited pilot test data.

Language Acquisition Multi-Task Learning +1

A Mixed-Methods Analysis of Western and Hong Kong–based Reporting on the 2019–2020 Protests

no code implementations EMNLP (LaTeCH-CLfL) 2021 Arya D. McCarthy, James Scharf, Giovanna Maria Dora Dore

We apply statistical techniques from natural language processing to Western and Hong Kong–based English language newspaper articles that discuss the 2019–2020 Hong Kong protests of the Anti-Extradition Law Amendment Bill Movement.

Sentiment Analysis

Measuring the Similarity of Grammatical Gender Systems by Comparing Partitions

no code implementations EMNLP 2020 Arya D. McCarthy, Adina Williams, Shijia Liu, David Yarowsky, Ryan Cotterell

Of particular interest, languages on the same branch of our phylogenetic tree are notably similar, whereas languages from separate branches are no more similar than chance.

Community Detection

AirWare: Utilizing Embedded Audio and Infrared Signals for In-Air Hand-Gesture Recognition

no code implementations 25 Jan 2021 Nibhrat Lohia, Raunak Mundada, Arya D. McCarthy, Eric C. Larson

We introduce AirWare, an in-air hand-gesture recognition system that uses the already embedded speaker and microphone in most electronic devices, together with embedded infrared proximity sensors.

Hand Gesture Recognition Human-Computer Interaction

The JHU Submission to the 2020 Duolingo Shared Task on Simultaneous Translation and Paraphrase for Language Education

no code implementations WS 2020 Huda Khayrallah, Jacob Bremerman, Arya D. McCarthy, Kenton Murray, Winston Wu, Matt Post

This paper presents the Johns Hopkins University submission to the 2020 Duolingo Shared Task on Simultaneous Translation and Paraphrase for Language Education (STAPLE).

Machine Translation Translation

Addressing Posterior Collapse with Mutual Information for Improved Variational Neural Machine Translation

no code implementations ACL 2020 Arya D. McCarthy, Xi-An Li, Jiatao Gu, Ning Dong

This paper proposes a simple and effective approach to address the problem of posterior collapse in conditional variational autoencoders (CVAEs).

Machine Translation Translation

The human unlikeness of neural language models in next-word prediction

no code implementations WS 2020 Cassandra L. Jacobs, Arya D. McCarthy

The training objective of unidirectional language models (LMs) is similar to a psycholinguistic benchmark known as the cloze task, which measures next-word predictability.

Massively Multilingual Pronunciation Modeling with WikiPron

no code implementations LREC 2020 Jackson L. Lee, Lucas F.E. Ashby, M. Elizabeth Garza, Yeonju Lee-Sikka, Sean Miller, Alan Wong, Arya D. McCarthy, Kyle Gorman

We introduce WikiPron, an open-source command-line tool for extracting pronunciation data from Wiktionary, a collaborative multilingual online dictionary.

Fine-grained Morphosyntactic Analysis and Generation Tools for More Than One Thousand Languages

no code implementations LREC 2020 Garrett Nicolai, Dylan Lewis, Arya D. McCarthy, Aaron Mueller, Winston Wu, David Yarowsky

Exploiting the broad translation of the Bible into the world's languages, we train and distribute morphosyntactic tools for approximately one thousand languages, vastly outstripping previous distributions of tools devoted to the processing of inflectional morphology.

Translation

An Analysis of Massively Multilingual Neural Machine Translation for Low-Resource Languages

no code implementations LREC 2020 Aaron Mueller, Garrett Nicolai, Arya D. McCarthy, Dylan Lewis, Winston Wu, David Yarowsky

We find that best practices in this domain are highly language-specific: adding more languages to a training set is often better, but too many harms performance; the best number depends on the source language.

Low-Resource Neural Machine Translation Translation

Predicting Declension Class from Form and Meaning

1 code implementation ACL 2020 Adina Williams, Tiago Pimentel, Arya D. McCarthy, Hagen Blix, Eleanor Chodroff, Ryan Cotterell

We find for two Indo-European languages (Czech and German) that form and meaning respectively share significant amounts of information with class (and contribute additional information above and beyond gender).

SkinAugment: Auto-Encoding Speaker Conversions for Automatic Speech Translation

1 code implementation 27 Feb 2020 Arya D. McCarthy, Liezl Puzon, Juan Pino

Our method compares favorably to SpecAugment on English→French and English→Romanian automatic speech translation (AST) tasks as well as on a low-resource English automatic speech recognition (ASR) task.

Data Augmentation Speech Recognition +1

The SIGMORPHON 2019 Shared Task: Morphological Analysis in Context and Cross-Lingual Transfer for Inflection

no code implementations WS 2019 Arya D. McCarthy, Ekaterina Vylomova, Shijie Wu, Chaitanya Malaviya, Lawrence Wolf-Sonkin, Garrett Nicolai, Christo Kirov, Miikka Silfverberg, Sabrina J. Mielke, Jeffrey Heinz, Ryan Cotterell, Mans Hulden

The SIGMORPHON 2019 shared task on cross-lingual transfer and contextual analysis in morphology examined transfer learning of inflection between 100 language pairs, as well as contextual lemmatization and morphosyntactic description in 66 languages.

Cross-Lingual Transfer Lemmatization +3

Modeling Color Terminology Across Thousands of Languages

1 code implementation IJCNLP 2019 Arya D. McCarthy, Winston Wu, Aaron Mueller, Bill Watson, David Yarowsky

There is an extensive history of scholarship into what constitutes a "basic" color term, as well as a broadly attested acquisition sequence of basic color terms across many languages, as articulated in the seminal work of Berlin and Kay (1969).

Improved Variational Neural Machine Translation by Promoting Mutual Information

no code implementations 19 Sep 2019 Arya D. McCarthy, Xi-An Li, Jiatao Gu, Ning Dong

Posterior collapse plagues VAEs for text, especially for conditional text generation with strong autoregressive decoders.

Conditional Text Generation Machine Translation +1

Meaning to Form: Measuring Systematicity as Information

1 code implementation ACL 2019 Tiago Pimentel, Arya D. McCarthy, Damián E. Blasi, Brian Roark, Ryan Cotterell

A longstanding debate in semiotics centers on the relationship between linguistic signs and their corresponding semantics: is there an arbitrary relationship between a word form and its meaning, or does some systematic phenomenon pervade?

The CoNLL–SIGMORPHON 2018 Shared Task: Universal Morphological Reinflection

no code implementations CONLL 2018 Ryan Cotterell, Christo Kirov, John Sylak-Glassman, Géraldine Walther, Ekaterina Vylomova, Arya D. McCarthy, Katharina Kann, Sabrina J. Mielke, Garrett Nicolai, Miikka Silfverberg, David Yarowsky, Jason Eisner, Mans Hulden

Apart from extending the number of languages involved in earlier supervised tasks of generating inflected forms, this year the shared task also featured a new second task which asked participants to inflect words in sentential context, similar to a cloze task.

Marrying Universal Dependencies and Universal Morphology

no code implementations WS 2018 Arya D. McCarthy, Miikka Silfverberg, Ryan Cotterell, Mans Hulden, David Yarowsky

The Universal Dependencies (UD) and Universal Morphology (UniMorph) projects each present schemata for annotating the morphosyntactic details of language.

Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation

1 code implementation WS 2018 Brian Thompson, Huda Khayrallah, Antonios Anastasopoulos, Arya D. McCarthy, Kevin Duh, Rebecca Marvin, Paul McNamee, Jeremy Gwinnup, Tim Anderson, Philipp Koehn

To better understand the effectiveness of continued training, we analyze the major components of a neural machine translation system (the encoder, decoder, and each embedding space) and consider each component's contribution to, and capacity for, domain adaptation.

Domain Adaptation Machine Translation +1
