Search Results for author: Miikka Silfverberg

Found 45 papers, 5 papers with code

One Wug, Two Wug+s Transformer Inflection Models Hallucinate Affixes

no code implementations ComputEL (ACL) 2022 Farhan Samir, Miikka Silfverberg

Data augmentation strategies are increasingly important in NLP pipelines for low-resourced and endangered languages, and in neural morphological inflection, augmentation by so-called data hallucination is a popular technique.

Data Augmentation Morphological Inflection

Morphological Processing of Low-Resource Languages: Where We Are and What’s Next

no code implementations Findings (ACL) 2022 Adam Wiemerslage, Miikka Silfverberg, Changbing Yang, Arya McCarthy, Garrett Nicolai, Eliana Colunga, Katharina Kann

Automatic morphological processing can aid downstream natural language processing applications, especially for low-resource languages, and assist language documentation efforts for endangered languages.

Unsupervised Paradigm Clustering Using Transformation Rules

no code implementations ACL (SIGMORPHON) 2021 Changbing Yang, Garrett Nicolai, Miikka Silfverberg

Secondly, we experiment with more general rules which can apply transformations inside the input strings in addition to prefix and suffix transformations.

Findings of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering

no code implementations ACL (SIGMORPHON) 2021 Adam Wiemerslage, Arya D. McCarthy, Alexander Erdmann, Garrett Nicolai, Manex Agirrezabal, Miikka Silfverberg, Mans Hulden, Katharina Kann

We describe the second SIGMORPHON shared task on unsupervised morphology: the goal of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering is to cluster word types from a raw text corpus into paradigms.

Ensembles of Neural Morphological Inflection Models

no code implementations WS (NoDaLiDa) 2019 Ilmari Kylliäinen, Miikka Silfverberg

We investigate different ensemble learning techniques for neural morphological inflection using bidirectional LSTM encoder-decoder models with attention.

Ensemble Learning Morphological Inflection

UniMorph 4.0: Universal Morphology

no code implementations 7 May 2022 Khuyagbaatar Batsuren, Omer Goldman, Salam Khalifa, Nizar Habash, Witold Kieraś, Gábor Bella, Brian Leonard, Garrett Nicolai, Kyle Gorman, Yustinus Ghanggo Ate, Maria Ryskina, Sabrina J. Mielke, Elena Budianskaya, Charbel El-Khaissi, Tiago Pimentel, Michael Gasser, William Lane, Mohit Raj, Matt Coler, Jaime Rafael Montoya Samame, Delio Siticonatzi Camaiteri, Esaú Zumaeta Rojas, Didier López Francis, Arturo Oncevay, Juan López Bautista, Gema Celeste Silva Villegas, Lucas Torroba Hennigen, Adam Ek, David Guriel, Peter Dirix, Jean-Philippe Bernardy, Andrey Scherbakov, Aziyana Bayyr-ool, Antonios Anastasopoulos, Roberto Zariquiey, Karina Sheifer, Sofya Ganieva, Hilaria Cruz, Ritván Karahóǧa, Stella Markantonatou, George Pavlidis, Matvey Plugaryov, Elena Klyachko, Ali Salehi, Candy Angulo, Jatayu Baxi, Andrew Krizhanovsky, Natalia Krizhanovskaya, Elizabeth Salesky, Clara Vania, Sardana Ivanova, Jennifer White, Rowan Hall Maudslay, Josef Valvoda, Ran Zmigrod, Paula Czarnowska, Irene Nikkarinen, Aelita Salchak, Brijesh Bhatt, Christopher Straughn, Zoey Liu, Jonathan North Washington, Yuval Pinter, Duygu Ataman, Marcin Wolinski, Totok Suhardijanto, Anna Yablonskaya, Niklas Stoehr, Hossep Dolatian, Zahroh Nuriah, Shyam Ratan, Francis M. Tyers, Edoardo M. Ponti, Grant Aiton, Aryaman Arora, Richard J. Hatcher, Ritesh Kumar, Jeremiah Young, Daria Rodionova, Anastasia Yemelina, Taras Andrushko, Igor Marchenko, Polina Mashkovtseva, Alexandra Serova, Emily Prud'hommeaux, Maria Nepomniashchaya, Fausto Giunchiglia, Eleanor Chodroff, Mans Hulden, Miikka Silfverberg, Arya D. McCarthy, David Yarowsky, Ryan Cotterell, Reut Tsarfaty, Ekaterina Vylomova

The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data in diverse languages realizing that schema.

Morphological Inflection

Dim Wihl Gat Tun: The Case for Linguistic Expertise in NLP for Underdocumented Languages

no code implementations 17 Mar 2022 Clarissa Forbes, Farhan Samir, Bruce Harold Oliver, Changbing Yang, Edith Coates, Garrett Nicolai, Miikka Silfverberg

With this paper, we make the case that IGT data can be leveraged successfully provided that target language expertise is available.

Morphological Processing of Low-Resource Languages: Where We Are and What's Next

no code implementations 16 Mar 2022 Adam Wiemerslage, Miikka Silfverberg, Changbing Yang, Arya D. McCarthy, Garrett Nicolai, Eliana Colunga, Katharina Kann

Automatic morphological processing can aid downstream natural language processing applications, especially for low-resource languages, and assist language documentation efforts for endangered languages.

Do RNN States Encode Abstract Phonological Alternations?

no code implementations NAACL 2021 Miikka Silfverberg, Francis Tyers, Garrett Nicolai, Mans Hulden

Sequence-to-sequence models have delivered impressive results in word formation tasks such as morphological inflection, often learning to model subtle morphophonological details with limited training data.

Morphological Inflection

Do RNN States Encode Abstract Phonological Processes?

no code implementations 1 Apr 2021 Miikka Silfverberg, Francis Tyers, Garrett Nicolai, Mans Hulden

Sequence-to-sequence models have delivered impressive results in word formation tasks such as morphological inflection, often learning to model subtle morphophonological details with limited training data.

Morphological Inflection

Translating the Unseen? Yoruba-English MT in Low-Resource, Morphologically-Unmarked Settings

1 code implementation 7 Mar 2021 Ife Adebara, Muhammad Abdul-Mageed, Miikka Silfverberg

In this work, we perform fine-grained analysis on how an SMT system compares with two NMT systems (BiLSTM and Transformer) when translating bare nouns in Yorùbá into English.

Machine Translation Translation

Noise Isn't Always Negative: Countering Exposure Bias in Sequence-to-Sequence Inflection Models

no code implementations COLING 2020 Garrett Nicolai, Miikka Silfverberg

Morphological inflection, like many sequence-to-sequence tasks, sees great performance from recurrent neural architectures when data is plentiful, but performance falls off sharply in lower-data settings.

Morphological Inflection

The SIGMORPHON 2019 Shared Task: Morphological Analysis in Context and Cross-Lingual Transfer for Inflection

no code implementations WS 2019 Arya D. McCarthy, Ekaterina Vylomova, Shijie Wu, Chaitanya Malaviya, Lawrence Wolf-Sonkin, Garrett Nicolai, Christo Kirov, Miikka Silfverberg, Sabrina J. Mielke, Jeffrey Heinz, Ryan Cotterell, Mans Hulden

The SIGMORPHON 2019 shared task on cross-lingual transfer and contextual analysis in morphology examined transfer learning of inflection between 100 language pairs, as well as contextual lemmatization and morphosyntactic description in 66 languages.

Cross-Lingual Transfer Lemmatization +3

A Finnish News Corpus for Named Entity Recognition

2 code implementations 12 Aug 2019 Teemu Ruokolainen, Pekka Kauppinen, Miikka Silfverberg, Krister Lindén

We present a corpus of Finnish news articles with a manually prepared named entity annotation.

Named Entity Recognition

A Report on the Third VarDial Evaluation Campaign

no code implementations WS 2019 Marcos Zampieri, Shervin Malmasi, Yves Scherrer, Tanja Samardžić, Francis Tyers, Miikka Silfverberg, Natalia Klyueva, Tung-Le Pan, Chu-Ren Huang, Radu Tudor Ionescu, Andrei M. Butnaru, Tommi Jauhiainen

In this paper, we present the findings of the Third VarDial Evaluation Campaign organized as part of the sixth edition of the workshop on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects (VarDial), co-located with NAACL 2019.

Dialect Identification +1

The CoNLL–SIGMORPHON 2018 Shared Task: Universal Morphological Reinflection

no code implementations CONLL 2018 Ryan Cotterell, Christo Kirov, John Sylak-Glassman, Géraldine Walther, Ekaterina Vylomova, Arya D. McCarthy, Katharina Kann, Sabrina J. Mielke, Garrett Nicolai, Miikka Silfverberg, David Yarowsky, Jason Eisner, Mans Hulden

Apart from extending the number of languages involved in earlier supervised tasks of generating inflected forms, this year the shared task also featured a new second task which asked participants to inflect words in sentential context, similar to a cloze task.

Marrying Universal Dependencies and Universal Morphology

no code implementations WS 2018 Arya D. McCarthy, Miikka Silfverberg, Ryan Cotterell, Mans Hulden, David Yarowsky

The Universal Dependencies (UD) and Universal Morphology (UniMorph) projects each present schemata for annotating the morphosyntactic details of language.

A Computational Model for the Linguistic Notion of Morphological Paradigm

no code implementations COLING 2018 Miikka Silfverberg, Ling Liu, Mans Hulden

In supervised learning of morphological patterns, the strategy of generalizing inflectional tables into more abstract paradigms through alignment of the longest common subsequence found in an inflection table has been proposed as an efficient method to deduce the inflectional behavior of unseen word forms.

Sub-label dependencies for Neural Morphological Tagging – The Joint Submission of University of Colorado and University of Helsinki for VarDial 2018

no code implementations COLING 2018 Miikka Silfverberg, Senka Drobac

This paper presents the submission of the UH&CU team (Joint University of Colorado and University of Helsinki team) for the VarDial 2018 shared task on morphosyntactic tagging of Croatian, Slovenian and Serbian tweets.

Morphological Tagging Word Embeddings

Weakly supervised learning of allomorphy

no code implementations WS 2017 Miikka Silfverberg, Mans Hulden

Most NLP resources that offer annotations at the word segment level provide morphological annotation that includes features indicating tense, aspect, modality, gender, case, and other inflectional information.

Heuristic Hyper-minimization of Finite State Lexicons

no code implementations LREC 2014 Senka Drobac, Krister Lindén, Tommi Pirinen, Miikka Silfverberg

We obtained the most noticeable size reduction with a morphological transducer for Greenlandic, whose original size is on average about 15 times larger than that of the other morphologies.
