Search Results for author: Katharina Kann

Found 63 papers, 11 papers with code

IGT2P: From Interlinear Glossed Texts to Paradigms

no code implementations EMNLP 2020 Sarah Moeller, Ling Liu, Changbing Yang, Katharina Kann, Mans Hulden

An intermediate step in the linguistic analysis of an under-documented language is to find and organize inflected forms that are attested in natural speech.

POS

Paradigm Clustering with Weighted Edit Distance

no code implementations ACL (SIGMORPHON) 2021 Andrew Gerlach, Adam Wiemerslage, Katharina Kann

This paper describes our system for the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering, which asks participants to group inflected forms together according to their underlying lemma without the aid of annotated training data.

Word Embeddings
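The clustering task above (grouping inflected forms by underlying lemma with no annotated data) can be illustrated with a toy sketch. The paper's system weights the edit operations; the plain (unweighted) Levenshtein distance, the greedy single-link strategy, and the threshold below are all illustrative simplifications, not the authors' method.

```python
def edit_distance(a: str, b: str) -> int:
    """Standard Levenshtein distance via a one-row dynamic program."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # delete ca
                                     dp[j - 1] + 1,      # insert cb
                                     prev + (ca != cb))  # substitute
    return dp[-1]

def cluster(words, threshold=2):
    """Greedy single-link clustering: a word joins the first cluster
    that contains a form within `threshold` edits of it."""
    clusters = []
    for w in words:
        for c in clusters:
            if any(edit_distance(w, m) <= threshold for m in c):
                c.append(w)
                break
        else:
            clusters.append([w])
    return clusters
```

On a toy vocabulary, `cluster(["talk", "talked", "talks", "jump", "jumped"])` groups the two verbs' forms into two paradigm-like clusters.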

Findings of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering

no code implementations ACL (SIGMORPHON) 2021 Adam Wiemerslage, Arya D. McCarthy, Alexander Erdmann, Garrett Nicolai, Manex Agirrezabal, Miikka Silfverberg, Mans Hulden, Katharina Kann

We describe the second SIGMORPHON shared task on unsupervised morphology: the goal of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering is to cluster word types from a raw text corpus into paradigms.

Morphological Processing of Low-Resource Languages: Where We Are and What’s Next

no code implementations Findings (ACL) 2022 Adam Wiemerslage, Miikka Silfverberg, Changbing Yang, Arya McCarthy, Garrett Nicolai, Eliana Colunga, Katharina Kann

Automatic morphological processing can aid downstream natural language processing applications, especially for low-resource languages, and assist language documentation efforts for endangered languages.

BPE vs. Morphological Segmentation: A Case Study on Machine Translation of Four Polysynthetic Languages

no code implementations Findings (ACL) 2022 Manuel Mager, Arturo Oncevay, Elisabeth Mager, Katharina Kann, Ngoc Thang Vu

Morphologically-rich polysynthetic languages present a challenge for NLP systems due to data sparsity, and a common strategy to handle this issue is to apply subword segmentation.

Machine Translation · Translation
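The subword segmentation strategy contrasted above is typically instantiated with byte-pair encoding (BPE). A minimal sketch of BPE merge learning follows; the toy corpus and merge count are invented for illustration:

```python
from collections import Counter

def learn_bpe(words, num_merges):
    """Learn BPE merges: repeatedly fuse the most frequent adjacent
    symbol pair across a space-separated symbol vocabulary."""
    vocab = Counter(" ".join(w) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            syms = word.split()
            for a, b in zip(syms, syms[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        merged = {}
        for word, freq in vocab.items():
            syms = word.split()
            out, i = [], 0
            while i < len(syms):
                if i + 1 < len(syms) and (syms[i], syms[i + 1]) == best:
                    out.append(syms[i] + syms[i + 1])
                    i += 2
                else:
                    out.append(syms[i])
                    i += 1
            merged[" ".join(out)] = freq
        vocab = merged
    return merges
```

For example, `learn_bpe(["low", "lower", "lowest"], 2)` first merges `l`+`o`, then `lo`+`w`, yielding the subword `low`; morphological segmentation would instead split at morpheme boundaries.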

Findings of the LoResMT 2021 Shared Task on COVID and Sign Language for Low-resource Languages

no code implementations MTSummit 2021 Atul Kr. Ojha, Chao-Hong Liu, Katharina Kann, John Ortega, Sheetal Shatam, Theodorus Fransen

Maximum system performance was computed using BLEU: 36.0 for English--Irish, 34.6 for Irish--English, 24.2 for English--Marathi, and 31.3 for Marathi--English.

Machine Translation · Translation
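For reference, BLEU can be sketched at sentence level as clipped n-gram precision combined with a brevity penalty. This toy version stops at bigrams; the shared task's official numbers would come from a standard corpus-level BLEU implementation (e.g. sacreBLEU), not this sketch.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=2):
    """Sentence-level BLEU: geometric mean of clipped n-gram
    precisions, scaled by a brevity penalty for short candidates."""
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum(min(c, ref[g]) for g, c in cand.items())  # clipped counts
        total = sum(cand.values())
        if total == 0:
            return 0.0
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(log_mean)
```

A perfect match scores 1.0; a candidate that is a correct but short prefix of the reference is penalized only by the brevity penalty.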

Proceedings of the First Workshop on Weakly Supervised Learning (WeaSuL)

no code implementations · 8 Jul 2021 Michael A. Hedderich, Benjamin Roth, Katharina Kann, Barbara Plank, Alex Ratner, Dietrich Klakow

Welcome to WeaSuL 2021, the First Workshop on Weakly Supervised Learning, co-located with ICLR 2021.

Don't Rule Out Monolingual Speakers: A Method For Crowdsourcing Machine Translation Data

no code implementations ACL 2021 Rajat Bhatnagar, Ananya Ganesh, Katharina Kann

Based on the insight that humans pay specific attention to movements, we use graphics interchange formats (GIFs) as a pivot to collect parallel sentences from monolingual annotators.

Machine Translation · Translation

What Would a Teacher Do? Predicting Future Talk Moves

no code implementations Findings (ACL) 2021 Ananya Ganesh, Martha Palmer, Katharina Kann

Recent advances in natural language processing (NLP) have the ability to transform how classroom learning takes place.

Question Answering

PROST: Physical Reasoning of Objects through Space and Time

1 code implementation · 7 Jun 2021 Stéphane Aroca-Ouellette, Cory Paik, Alessandro Roncone, Katharina Kann

We present a new probing dataset named PROST: Physical Reasoning about Objects Through Space and Time.

Multiple-choice

How to Adapt Your Pretrained Multilingual Model to 1600 Languages

no code implementations ACL 2021 Abteen Ebrahimi, Katharina Kann

Pretrained multilingual models (PMMs) enable zero-shot learning via cross-lingual transfer, performing best for languages seen during pretraining.

Cross-Lingual Transfer · NER +2

CLiMP: A Benchmark for Chinese Language Model Evaluation

no code implementations EACL 2021 Beilei Xiang, Changbing Yang, Yu Li, Alex Warstadt, Katharina Kann

CLiMP consists of sets of 1,000 minimal pairs (MPs) for 16 syntactic contrasts in Mandarin, covering 9 major Mandarin linguistic phenomena.

Language Modelling

Acrostic Poem Generation

no code implementations EMNLP 2020 Rajat Agarwal, Katharina Kann

We propose a new task in the area of computational creativity: acrostic poem generation in English.

Language Modelling
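The acrostic constraint behind the proposed task is easy to state in code. A minimal checker follows; the example lines in the test are invented, not generated by the paper's system.

```python
def is_acrostic(lines, word):
    """Check that the first letters of the poem's lines spell `word`."""
    return len(lines) == len(word) and all(
        line[:1].lower() == ch.lower() for line, ch in zip(lines, word))
```

A generation system would use this as a hard constraint while optimizing fluency and topicality of the lines themselves.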

The IMS--CUBoulder System for the SIGMORPHON 2020 Shared Task on Unsupervised Morphological Paradigm Completion

no code implementations WS 2020 Manuel Mager, Katharina Kann

In this paper, we present the systems of the University of Stuttgart IMS and the University of Colorado Boulder (IMS--CUBoulder) for SIGMORPHON 2020 Task 2 on unsupervised morphological paradigm completion (Kann et al., 2020).

Frustratingly Easy Multilingual Grapheme-to-Phoneme Conversion

no code implementations WS 2020 Nikhil Prabhu, Katharina Kann

In this paper, we describe two CU-Boulder submissions to the SIGMORPHON 2020 Task 1 on multilingual grapheme-to-phoneme conversion (G2P).

The NYU-CUBoulder Systems for SIGMORPHON 2020 Task 0 and Task 2

no code implementations WS 2020 Assaf Singer, Katharina Kann

Second, as inflected forms share most characters with the lemma, we further propose a pointer-generator transformer model to allow easy copying of input characters.

Morphological Inflection
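The pointer-generator idea described above mixes a generation distribution over the output vocabulary with a copy distribution over input characters. A minimal numpy sketch of that mixture follows; the distributions, ids, and gate value are invented for illustration, and the real model computes the gate and attention from decoder states.

```python
import numpy as np

def pointer_generator_mix(vocab_dist, attention, src_ids, p_gen):
    """Final pointer-generator distribution:
    P(w) = p_gen * P_vocab(w) + (1 - p_gen) * (attention mass on
    source positions whose symbol id is w)."""
    vocab_dist = np.asarray(vocab_dist, dtype=float)
    copy_dist = np.zeros_like(vocab_dist)
    # Scatter-add: several source positions may hold the same symbol.
    np.add.at(copy_dist, np.asarray(src_ids), np.asarray(attention, dtype=float))
    return p_gen * vocab_dist + (1 - p_gen) * copy_dist
```

With a uniform vocabulary distribution over 5 symbols, attention `[0.2, 0.5, 0.3]` over source ids `[0, 3, 3]`, and `p_gen = 0.8`, symbol 3 gets extra copy mass (0.8 · 0.2 + 0.2 · 0.8 = 0.32), which is exactly why copying shared lemma characters becomes easy.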

The SIGMORPHON 2020 Shared Task on Unsupervised Morphological Paradigm Completion

no code implementations WS 2020 Katharina Kann, Arya McCarthy, Garrett Nicolai, Mans Hulden

In this paper, we describe the findings of the SIGMORPHON 2020 shared task on unsupervised morphological paradigm completion (SIGMORPHON 2020 Task 2), a novel task in the field of inflectional morphology.

Self-Training for Unsupervised Parsing with PRPN

no code implementations WS 2020 Anhad Mohananey, Katharina Kann, Samuel R. Bowman

To be able to use our model's predictions during training, we extend a recent neural UP architecture, the PRPN (Shen et al., 2018a) such that it can be trained in a semi-supervised fashion.

Language Modelling
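The recipe above, using the model's own predictions as extra training signal, is generic self-training. In the sketch below a toy nearest-centroid classifier stands in for the PRPN, so every name and number is illustrative rather than the paper's setup.

```python
def nearest_centroid_fit(X, y):
    """Toy classifier: one mean feature vector per label."""
    labels = sorted(set(y))
    return {l: [sum(x[d] for x, yy in zip(X, y) if yy == l) / y.count(l)
                for d in range(len(X[0]))]
            for l in labels}

def predict(centroids, x):
    return min(centroids, key=lambda l: sum((a - b) ** 2
                                            for a, b in zip(centroids[l], x)))

def self_train(X_lab, y_lab, X_unlab, rounds=3):
    """Self-training loop: fit, pseudo-label the unlabeled pool,
    absorb the pseudo-labels, and refit."""
    X, y = list(X_lab), list(y_lab)
    for _ in range(rounds):
        model = nearest_centroid_fit(X, y)
        pseudo = [predict(model, x) for x in X_unlab]
        X, y = list(X_lab) + list(X_unlab), list(y_lab) + pseudo
    return nearest_centroid_fit(X, y)
```

The pseudo-labels pull the decision boundary toward the unlabeled data, which is the same effect the paper exploits with the PRPN's own parses.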

English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too

no code implementations AACL 2020 Jason Phang, Iacer Calixto, Phu Mon Htut, Yada Pruksachatkun, Haokun Liu, Clara Vania, Katharina Kann, Samuel R. Bowman

Intermediate-task training (fine-tuning a pretrained model on an intermediate task before fine-tuning again on the target task) often improves model performance substantially on language understanding tasks in monolingual English settings.

Question Answering · Zero-Shot Cross-Lingual Transfer

Weakly Supervised POS Taggers Perform Poorly on Truly Low-Resource Languages

no code implementations · 28 Apr 2020 Katharina Kann, Ophélie Lacroix, Anders Søgaard

Part-of-speech (POS) taggers for low-resource languages which are exclusively based on various forms of weak supervision - e.g., cross-lingual transfer, type-level supervision, or a combination thereof - have been reported to perform almost as well as supervised ones.

Cross-Lingual Transfer · POS

Learning to Learn Morphological Inflection for Resource-Poor Languages

no code implementations · 28 Apr 2020 Katharina Kann, Samuel R. Bowman, Kyunghyun Cho

We propose to cast the task of morphological inflection - mapping a lemma to an indicated inflected form - for resource-poor languages as a meta-learning problem.

Cross-Lingual Transfer · Meta-Learning +1

Neural Unsupervised Parsing Beyond English

no code implementations WS 2019 Katharina Kann, Anhad Mohananey, Samuel R. Bowman, Kyunghyun Cho

Recently, neural network models which automatically infer syntactic structure from raw text have started to achieve promising results.

Acquisition of Inflectional Morphology in Artificial Neural Networks With Prior Knowledge

no code implementations SCiL 2020 Katharina Kann

How does knowledge of one language's morphology influence learning of inflection rules in a second one?

Towards Realistic Practices In Low-Resource Natural Language Processing: The Development Set

no code implementations IJCNLP 2019 Katharina Kann, Kyunghyun Cho, Samuel R. Bowman

Here, we aim to answer the following questions: Does using a development set for early stopping in the low-resource setting influence results as compared to a more realistic alternative, where the number of training epochs is tuned on development languages?

Transductive Auxiliary Task Self-Training for Neural Multi-Task Models

no code implementations WS 2019 Johannes Bjerva, Katharina Kann, Isabelle Augenstein

Multi-task learning and self-training are two common ways to improve a machine learning model's performance in settings with limited training data.

Multi-Task Learning

Subword-Level Language Identification for Intra-Word Code-Switching

no code implementations NAACL 2019 Manuel Mager, Özlem Çetinoğlu, Katharina Kann

Language identification for code-switching (CS), the phenomenon of alternating between two or more languages in conversations, has traditionally been approached under the assumption of a single language per token.

Language Identification

Verb Argument Structure Alternations in Word and Sentence Embeddings

no code implementations WS 2019 Katharina Kann, Alex Warstadt, Adina Williams, Samuel R. Bowman

For converging evidence, we further construct LaVA, a corresponding word-level dataset, and investigate whether the same syntactic features can be extracted from word embeddings.

Frame · Sentence Embedding +2

The CoNLL--SIGMORPHON 2018 Shared Task: Universal Morphological Reinflection

no code implementations CONLL 2018 Ryan Cotterell, Christo Kirov, John Sylak-Glassman, Géraldine Walther, Ekaterina Vylomova, Arya D. McCarthy, Katharina Kann, Sabrina J. Mielke, Garrett Nicolai, Miikka Silfverberg, David Yarowsky, Jason Eisner, Mans Hulden

Apart from extending the number of languages involved in earlier supervised tasks of generating inflected forms, this year the shared task also featured a new second task which asked participants to inflect words in sentential context, similar to a cloze task.

Sentence-Level Fluency Evaluation: References Help, But Can Be Spared!

no code implementations CONLL 2018 Katharina Kann, Sascha Rothe, Katja Filippova

Motivated by recent findings on the probabilistic modeling of acceptability judgments, we propose syntactic log-odds ratio (SLOR), a normalized language model score, as a metric for referenceless fluency evaluation of natural language generation output at the sentence level.

Language Modelling · Text Generation
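SLOR, the metric proposed above, normalizes a sentence's language-model log-probability by its unigram log-probability and its length: SLOR(S) = (log p_LM(S) - log p_u(S)) / |S|. A minimal sketch follows; the log-probabilities are placeholder inputs, not scores from an actual LM.

```python
def slor(lm_logprob, unigram_logprobs, length):
    """Syntactic log-odds ratio: the sentence's LM log-probability minus
    the sum of its tokens' unigram log-probabilities, divided by length."""
    return (lm_logprob - sum(unigram_logprobs)) / length

# Hypothetical 3-token sentence: an LM score of -10.0 against a
# unigram baseline of -12.0 yields SLOR = 2/3.
```

Higher SLOR means the sentence is more probable under the LM than its word frequencies alone predict, so rare words do not unfairly depress the fluency score.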

Neural Transductive Learning and Beyond: Morphological Generation in the Minimal-Resource Setting

no code implementations EMNLP 2018 Katharina Kann, Hinrich Schütze

Neural state-of-the-art sequence-to-sequence (seq2seq) models often do not perform well for small training sets.

Fortification of Neural Morphological Segmentation Models for Polysynthetic Minimal-Resource Languages

no code implementations NAACL 2018 Katharina Kann, Manuel Mager, Ivan Meza-Ruiz, Hinrich Schütze

Morphological segmentation for polysynthetic languages is challenging, because a word may consist of many individual morphemes and training data can be extremely scarce.

Cross-Lingual Transfer · Data Augmentation

Unlabeled Data for Morphological Generation With Character-Based Sequence-to-Sequence Models

no code implementations WS 2017 Katharina Kann, Hinrich Schütze

We present a semi-supervised way of training a character-based encoder-decoder recurrent neural network for morphological reinflection, the task of generating one inflected word form from another.

One-Shot Neural Cross-Lingual Transfer for Paradigm Completion

no code implementations ACL 2017 Katharina Kann, Ryan Cotterell, Hinrich Schütze

We present a novel cross-lingual transfer method for paradigm completion, the task of mapping a lemma to its inflected forms, using a neural encoder-decoder model, the state of the art for the monolingual task.

Cross-Lingual Transfer · One-Shot Learning

Comparative Study of CNN and RNN for Natural Language Processing

4 code implementations · 7 Feb 2017 Wenpeng Yin, Katharina Kann, Mo Yu, Hinrich Schütze

Deep neural networks (DNN) have revolutionized the field of natural language processing (NLP).

Neural Multi-Source Morphological Reinflection

no code implementations EACL 2017 Katharina Kann, Ryan Cotterell, Hinrich Schütze

We explore the task of multi-source morphological reinflection, which generalizes the standard, single-source version.

TAG

Single-Model Encoder-Decoder with Explicit Morphological Representation for Reinflection

1 code implementation ACL 2016 Katharina Kann, Hinrich Schütze

Morphological reinflection is the task of generating a target form given a source form, a source tag and a target tag.

TAG
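The task definition above (source form plus source tag and target tag, mapped to a target form) can be made concrete with a single hypothetical training instance; the UniMorph-style tags and forms below are invented for illustration, not taken from the shared task data.

```python
# One hypothetical morphological reinflection instance:
instance = {
    "source_form": "walk",
    "source_tag": "V;PRS",         # source: present tense
    "target_tag": "V;V.PTCP;PRS",  # target: present participle
    "target_form": "walking",      # what the model must generate
}
```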
