Search Results for author: Mans Hulden

Found 69 papers, 8 papers with code

Detecting Annotation Errors in Morphological Data with the Transformer

no code implementations ACL 2022 Ling Liu, Mans Hulden

Annotation errors that stem from various sources are usually unavoidable when performing large-scale annotation of linguistic data.

Findings of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering

no code implementations ACL (SIGMORPHON) 2021 Adam Wiemerslage, Arya D. McCarthy, Alexander Erdmann, Garrett Nicolai, Manex Agirrezabal, Miikka Silfverberg, Mans Hulden, Katharina Kann

We describe the second SIGMORPHON shared task on unsupervised morphology: the goal of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering is to cluster word types from a raw text corpus into paradigms.

Clustering

IGT2P: From Interlinear Glossed Texts to Paradigms

no code implementations EMNLP 2020 Sarah Moeller, Ling Liu, Changbing Yang, Katharina Kann, Mans Hulden

An intermediate step in the linguistic analysis of an under-documented language is to find and organize inflected forms that are attested in natural speech.

POS

My Case, For an Adposition: Lexical Polysemy of Adpositions and Case Markers in Finnish and Latin

no code implementations LREC 2022 Daniel Chen, Mans Hulden

Adpositions and case markers contain a high degree of polysemy and participate in unique semantic role configurations.

Clustering

Backtranslation in Neural Morphological Inflection

no code implementations EMNLP (insights) 2021 Ling Liu, Mans Hulden

Backtranslation is a common technique for leveraging unlabeled data in low-resource scenarios in machine translation.

Machine Translation Morphological Inflection +1

Eeny, meeny, miny, moe. How to choose data for morphological inflection

1 code implementation 26 Oct 2022 Saliha Muradoglu, Mans Hulden

In this paper, we explore four sampling strategies for the task of morphological inflection using a Transformer model: a pair of oracle experiments where data is chosen based on whether the model already can or cannot inflect the test forms correctly, and strategies based on high/low model confidence, entropy, and random selection.

Active Learning Morphological Inflection

UniMorph 4.0: Universal Morphology

no code implementations LREC 2022 Khuyagbaatar Batsuren, Omer Goldman, Salam Khalifa, Nizar Habash, Witold Kieraś, Gábor Bella, Brian Leonard, Garrett Nicolai, Kyle Gorman, Yustinus Ghanggo Ate, Maria Ryskina, Sabrina J. Mielke, Elena Budianskaya, Charbel El-Khaissi, Tiago Pimentel, Michael Gasser, William Lane, Mohit Raj, Matt Coler, Jaime Rafael Montoya Samame, Delio Siticonatzi Camaiteri, Benoît Sagot, Esaú Zumaeta Rojas, Didier López Francis, Arturo Oncevay, Juan López Bautista, Gema Celeste Silva Villegas, Lucas Torroba Hennigen, Adam Ek, David Guriel, Peter Dirix, Jean-Philippe Bernardy, Andrey Scherbakov, Aziyana Bayyr-ool, Antonios Anastasopoulos, Roberto Zariquiey, Karina Sheifer, Sofya Ganieva, Hilaria Cruz, Ritván Karahóǧa, Stella Markantonatou, George Pavlidis, Matvey Plugaryov, Elena Klyachko, Ali Salehi, Candy Angulo, Jatayu Baxi, Andrew Krizhanovsky, Natalia Krizhanovskaya, Elizabeth Salesky, Clara Vania, Sardana Ivanova, Jennifer White, Rowan Hall Maudslay, Josef Valvoda, Ran Zmigrod, Paula Czarnowska, Irene Nikkarinen, Aelita Salchak, Brijesh Bhatt, Christopher Straughn, Zoey Liu, Jonathan North Washington, Yuval Pinter, Duygu Ataman, Marcin Wolinski, Totok Suhardijanto, Anna Yablonskaya, Niklas Stoehr, Hossep Dolatian, Zahroh Nuriah, Shyam Ratan, Francis M. Tyers, Edoardo M. Ponti, Grant Aiton, Aryaman Arora, Richard J. Hatcher, Ritesh Kumar, Jeremiah Young, Daria Rodionova, Anastasia Yemelina, Taras Andrushko, Igor Marchenko, Polina Mashkovtseva, Alexandra Serova, Emily Prud'hommeaux, Maria Nepomniashchaya, Fausto Giunchiglia, Eleanor Chodroff, Mans Hulden, Miikka Silfverberg, Arya D. McCarthy, David Yarowsky, Ryan Cotterell, Reut Tsarfaty, Ekaterina Vylomova

The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data in diverse languages realizing that schema.

Morphological Inflection

To POS Tag or Not to POS Tag: The Impact of POS Tags on Morphological Learning in Low-Resource Settings

no code implementations ACL 2021 Sarah Moeller, Ling Liu, Mans Hulden

However, the importance and usefulness of POS tags need to be examined as NLP expands to low-resource languages, because linguists who provide many annotated resources do not place priority on early identification and tagging of POS.

POS TAG

Do RNN States Encode Abstract Phonological Alternations?

no code implementations NAACL 2021 Miikka Silfverberg, Francis Tyers, Garrett Nicolai, Mans Hulden

Sequence-to-sequence models have delivered impressive results in word formation tasks such as morphological inflection, often learning to model subtle morphophonological details with limited training data.

Memorization Morphological Inflection

Do RNN States Encode Abstract Phonological Processes?

no code implementations 1 Apr 2021 Miikka Silfverberg, Francis Tyers, Garrett Nicolai, Mans Hulden

Sequence-to-sequence models have delivered impressive results in word formation tasks such as morphological inflection, often learning to model subtle morphophonological details with limited training data.

Memorization Morphological Inflection

Analogy Models for Neural Word Inflection

1 code implementation COLING 2020 Ling Liu, Mans Hulden

Analogy is assumed to be the cognitive mechanism speakers resort to in order to inflect an unknown form of a lexeme based on knowledge of other words in a language.

Hallucination LEMMA

Data Augmentation for Transformer-based G2P

no code implementations WS 2020 Zach Ryan, Mans Hulden

The Transformer model has been shown to outperform other neural seq2seq models in several character-level tasks.

Data Augmentation

Leveraging Principal Parts for Morphological Inflection

no code implementations WS 2020 Ling Liu, Mans Hulden

This paper presents the submission by the CU Ling team from the University of Colorado to SIGMORPHON 2020 shared task 0 on morphological inflection.

LEMMA Morphological Inflection

The SIGMORPHON 2020 Shared Task on Unsupervised Morphological Paradigm Completion

no code implementations WS 2020 Katharina Kann, Arya McCarthy, Garrett Nicolai, Mans Hulden

In this paper, we describe the findings of the SIGMORPHON 2020 shared task on unsupervised morphological paradigm completion (SIGMORPHON 2020 Task 2), a novel task in the field of inflectional morphology.

LEMMA Task 2

Applying the Transformer to Character-level Transduction

2 code implementations EACL 2021 Shijie Wu, Ryan Cotterell, Mans Hulden

The transformer has been shown to outperform recurrent neural network-based sequence-to-sequence models in various word-level NLP tasks.

Morphological Inflection Transliteration

The SIGMORPHON 2019 Shared Task: Morphological Analysis in Context and Cross-Lingual Transfer for Inflection

no code implementations WS 2019 Arya D. McCarthy, Ekaterina Vylomova, Shijie Wu, Chaitanya Malaviya, Lawrence Wolf-Sonkin, Garrett Nicolai, Christo Kirov, Miikka Silfverberg, Sabrina J. Mielke, Jeffrey Heinz, Ryan Cotterell, Mans Hulden

The SIGMORPHON 2019 shared task on cross-lingual transfer and contextual analysis in morphology examined transfer learning of inflection between 100 language pairs, as well as contextual lemmatization and morphosyntactic description in 66 languages.

Cross-Lingual Transfer Lemmatization +3

The CoNLL–SIGMORPHON 2018 Shared Task: Universal Morphological Reinflection

no code implementations CONLL 2018 Ryan Cotterell, Christo Kirov, John Sylak-Glassman, Géraldine Walther, Ekaterina Vylomova, Arya D. McCarthy, Katharina Kann, Sabrina J. Mielke, Garrett Nicolai, Miikka Silfverberg, David Yarowsky, Jason Eisner, Mans Hulden

Apart from extending the number of languages involved in earlier supervised tasks of generating inflected forms, this year the shared task also featured a new second task which asked participants to inflect words in sentential context, similar to a cloze task.

LEMMA Task 2

Marrying Universal Dependencies and Universal Morphology

no code implementations WS 2018 Arya D. McCarthy, Miikka Silfverberg, Ryan Cotterell, Mans Hulden, David Yarowsky

The Universal Dependencies (UD) and Universal Morphology (UniMorph) projects each present schemata for annotating the morphosyntactic details of language.

Automatic Glossing in a Low-Resource Setting for Language Documentation

no code implementations COLING 2018 Sarah Moeller, Mans Hulden

Morphological analysis of morphologically rich and low-resource languages is important to both descriptive linguistics and natural language processing.

Descriptive Morphological Analysis

A Computational Model for the Linguistic Notion of Morphological Paradigm

no code implementations COLING 2018 Miikka Silfverberg, Ling Liu, Mans Hulden

In supervised learning of morphological patterns, the strategy of generalizing inflectional tables into more abstract paradigms through alignment of the longest common subsequence found in an inflection table has been proposed as an efficient method to deduce the inflectional behavior of unseen word forms.

A Neural Morphological Analyzer for Arapaho Verbs Learned from a Finite State Transducer

no code implementations COLING 2018 Sarah Moeller, Ghazaleh Kazeminejad, Andrew Cowell, Mans Hulden

We experiment with training an encoder-decoder neural model for mimicking the behavior of an existing hand-written finite-state morphological grammar for Arapaho verbs, a polysynthetic language with a highly complex verbal inflection system.

Machine Translation Morphological Analysis +1

The Computational Complexity of Distinctive Feature Minimization in Phonology

no code implementations NAACL 2018 Hubie Chen, Mans Hulden

We find that the natural class decision problem is tractable (i.e., is in P), while the minimization problem is not; the decision version of the problem, which determines whether a natural class can be defined with $k$ features or fewer, is NP-complete.

On the Diachronic Stability of Irregularity in Inflectional Morphology

no code implementations 23 Apr 2018 Ryan Cotterell, Christo Kirov, Mans Hulden, Jason Eisner

Many languages' inflectional morphological systems are replete with irregulars, i.e., words that do not seem to follow standard inflectional rules.

Relation

A Comparison of Feature-Based and Neural Scansion of Poetry

no code implementations RANLP 2017 Manex Agirrezabal, Iñaki Alegria, Mans Hulden

Automatic analysis of poetic rhythm is a challenging task that involves linguistics, literature, and computer science.

Weakly supervised learning of allomorphy

no code implementations WS 2017 Miikka Silfverberg, Mans Hulden

Most NLP resources that offer annotations at the word segment level provide morphological annotation that includes features indicating tense, aspect, modality, gender, case, and other inflectional information.

Weakly-supervised Learning

A phoneme clustering algorithm based on the obligatory contour principle

1 code implementation CONLL 2017 Mans Hulden

This paper explores a divisive hierarchical clustering algorithm based on the well-known Obligatory Contour Principle in phonology.

Clustering

How Regular is Japanese Loanword Adaptation? A Computational Study

no code implementations COLING 2016 Lingshuang Mao, Mans Hulden

The modifications that foreign loanwords undergo when adapted into Japanese have been the subject of much study in linguistics.

Evaluating the Noisy Channel Model for the Normalization of Historical Texts: Basque, Spanish and Slovene

no code implementations LREC 2016 Izaskun Etxeberria, Iñaki Alegria, Larraitz Uria, Mans Hulden

This paper presents a method for the normalization of historical texts using a combination of weighted finite-state transducers and language models.

Morphological Analysis of Sahidic Coptic for Automatic Glossing

no code implementations LREC 2016 Daniel Smith, Mans Hulden

We report on the implementation of a morphological analyzer for the Sahidic dialect of Coptic, a now extinct Afro-Asiatic language.

Morphological Analysis Transliteration

Deriving Morphological Analyzers from Example Inflections

no code implementations LREC 2016 Markus Forsberg, Mans Hulden

This paper presents a semi-automatic method to derive morphological analyzers from a limited number of example inflections suitable for languages with alphabetic writing systems.

Computer-aided morphology expansion for Old Swedish

no code implementations LREC 2014 Yvonne Adesam, Malin Ahlberg, Peter Andersson, Gerlof Bouma, Markus Forsberg, Mans Hulden

In this paper we describe and evaluate a tool for paradigm induction and lexicon extraction that has been applied to Old Swedish.

Boosting statistical tagger accuracy with simple rule-based grammars

no code implementations LREC 2012 Mans Hulden, Jerid Francom

We report on several experiments on combining a rule-based tagger and a trigram tagger for Spanish.

Part-Of-Speech Tagging
