Search Results for author: Winston Wu

Found 19 papers, 4 papers with code

On Pronunciations in Wiktionary: Extraction and Experiments on Multilingual Syllabification and Stress Prediction

no code implementations RANLP (BUCC) 2021 Winston Wu, David Yarowsky

We constructed parsers for five non-English editions of Wiktionary, which, combined with pronunciations from the English edition, comprise over 5.3 million IPA pronunciations, the largest pronunciation lexicon of its kind.

Evaluating Neural Model Robustness for Machine Comprehension

no code implementations EACL 2021 Winston Wu, Dustin Arendt, Svitlana Volkova

We evaluate neural model robustness to adversarial attacks using different types of linguistic unit perturbations (character and word), and propose a new method for strategic sentence-level perturbations.

Adversarial Attack Reading Comprehension

Wiktionary Normalization of Translations and Morphological Information

no code implementations COLING 2020 Winston Wu, David Yarowsky

We extend the Yawipa Wiktionary Parser (Wu and Yarowsky, 2020) to extract and normalize translations from etymology glosses, and morphological form-of relations, resulting in 300K unique translations and over 4 million instances of 168 annotated morphological relations.


Fine-grained Morphosyntactic Analysis and Generation Tools for More Than One Thousand Languages

no code implementations LREC 2020 Garrett Nicolai, Dylan Lewis, Arya D. McCarthy, Aaron Mueller, Winston Wu, David Yarowsky

Exploiting the broad translation of the Bible into the world's languages, we train and distribute morphosyntactic tools for approximately one thousand languages, vastly outstripping previous distributions of tools devoted to the processing of inflectional morphology.


Computational Etymology and Word Emergence

no code implementations LREC 2020 Winston Wu, David Yarowsky

We developed an extensible, comprehensive Wiktionary parser that improves over several existing parsers.

Multilingual Dictionary Based Construction of Core Vocabulary

no code implementations LREC 2020 Winston Wu, Garrett Nicolai, David Yarowsky

We propose a new functional definition and construction method for core vocabulary sets for multiple applications based on the relative coverage of a target concept in thousands of bilingual dictionaries.

Cognate Prediction Machine Translation +1

An Analysis of Massively Multilingual Neural Machine Translation for Low-Resource Languages

no code implementations LREC 2020 Aaron Mueller, Garrett Nicolai, Arya D. McCarthy, Dylan Lewis, Winston Wu, David Yarowsky

We find that best practices in this domain are highly language-specific: adding more languages to a training set is often better, but too many harms performance; the best number depends on the source language.

Low-Resource Neural Machine Translation Translation

JHUBC's Submission to LT4HALA EvaLatin 2020

no code implementations LREC 2020 Winston Wu, Garrett Nicolai

We describe the JHUBC submission to the EvaLatin Shared task on lemmatization and part-of-speech tagging for Latin.

Lemmatization Part-Of-Speech Tagging +1

Evaluating Neural Machine Comprehension Model Robustness to Noisy Inputs and Adversarial Attacks

no code implementations 1 May 2020 Winston Wu, Dustin Arendt, Svitlana Volkova

We evaluate machine comprehension models' robustness to noise and adversarial attacks by performing novel perturbations at the character, word, and sentence level.

Reading Comprehension

Modeling Color Terminology Across Thousands of Languages

1 code implementation IJCNLP 2019 Arya D. McCarthy, Winston Wu, Aaron Mueller, Bill Watson, David Yarowsky

There is an extensive history of scholarship into what constitutes a "basic" color term, as well as a broadly attested acquisition sequence of basic color terms across many languages, as articulated in the seminal work of Berlin and Kay (1969).
