no code implementations • ACL (IWSLT) 2021 • Pavel Denisov, Manuel Mager, Ngoc Thang Vu
This paper describes the submission to the IWSLT 2021 Low-Resource Speech Translation Shared Task by the IMS team.
no code implementations • NAACL (AmericasNLP) 2021 • Manuel Mager, Arturo Oncevay, Abteen Ebrahimi, John Ortega, Annette Rios, Angela Fan, Ximena Gutierrez-Vasques, Luis Chiruzzo, Gustavo Giménez-Lugo, Ricardo Ramos, Ivan Vladimir Meza Ruiz, Rolando Coto-Solano, Alexis Palmer, Elisabeth Mager-Hois, Vishrav Chaudhary, Graham Neubig, Ngoc Thang Vu, Katharina Kann
This paper presents the results of the 2021 Shared Task on Open Machine Translation for Indigenous Languages of the Americas.
no code implementations • 11 Jun 2023 • Manuel Mager, Rajat Bhatnagar, Graham Neubig, Ngoc Thang Vu, Katharina Kann
Neural models have drastically advanced the state of the art for machine translation (MT) between high-resource languages.
no code implementations • 31 May 2023 • Manuel Mager, Elisabeth Mager, Katharina Kann, Ngoc Thang Vu
In recent years, machine translation has become very successful for high-resource language pairs.
no code implementations • 11 Oct 2022 • Marwa Gaser, Manuel Mager, Injy Hamed, Nizar Habash, Slim Abdennadher, Ngoc Thang Vu
For extreme low-resource scenarios, a combination of frequency and morphology-based segmentations is shown to perform the best.
no code implementations • Findings (ACL) 2022 • Manuel Mager, Arturo Oncevay, Elisabeth Mager, Katharina Kann, Ngoc Thang Vu
Morphologically rich polysynthetic languages present a challenge for NLP systems due to data sparsity, and a common strategy to handle this issue is to apply subword segmentation.
no code implementations • 30 Jun 2021 • Pavel Denisov, Manuel Mager, Ngoc Thang Vu
This paper describes the submission to the IWSLT 2021 Low-Resource Speech Translation Shared Task by the IMS team.
1 code implementation • ACL 2022 • Abteen Ebrahimi, Manuel Mager, Arturo Oncevay, Vishrav Chaudhary, Luis Chiruzzo, Angela Fan, John Ortega, Ricardo Ramos, Annette Rios, Ivan Meza-Ruiz, Gustavo A. Giménez-Lugo, Elisabeth Mager, Graham Neubig, Alexis Palmer, Rolando Coto-Solano, Ngoc Thang Vu, Katharina Kann
Continued pretraining offers improvements, with an average accuracy of 44.05%.
no code implementations • EMNLP 2020 • Manuel Mager, Özlem Çetinoğlu, Katharina Kann
Canonical morphological segmentation consists of dividing words into their standardized morphemes.
no code implementations • WS 2020 • Manuel Mager, Katharina Kann
In this paper, we present the systems of the University of Stuttgart IMS and the University of Colorado Boulder (IMS-CUBoulder) for SIGMORPHON 2020 Task 2 on unsupervised morphological paradigm completion (Kann et al., 2020).
no code implementations • 25 May 2020 • Manuel Mager, Katharina Kann
In this paper, we present the systems of the University of Stuttgart IMS and the University of Colorado Boulder (IMS-CUBoulder) for SIGMORPHON 2020 Task 2 on unsupervised morphological paradigm completion (Kann et al., 2020).
1 code implementation • ACL 2020 • Manuel Mager, Ramon Fernandez Astudillo, Tahira Naseem, Md. Arafat Sultan, Young-suk Lee, Radu Florian, Salim Roukos
Abstract Meaning Representations (AMRs) are broad-coverage sentence-level semantic graphs.
Ranked #10 on AMR-to-Text Generation on LDC2017T10
no code implementations • NAACL 2019 • Manuel Mager, Özlem Çetinoğlu, Katharina Kann
Language identification for code-switching (CS), the phenomenon of alternating between two or more languages in conversations, has traditionally been approached under the assumption of a single language per token.
no code implementations • COLING 2018 • Manuel Mager, Elisabeth Mager, Alfonso Medina-Urrea, Ivan Meza, Katharina Kann
Machine translation from polysynthetic to fusional languages is a challenging task, which gets further complicated by the limited amount of parallel text available.
1 code implementation • COLING 2018 • Manuel Mager, Ximena Gutierrez-Vasques, Gerardo Sierra, Ivan Meza
Indigenous languages of the American continent are highly diverse.
no code implementations • NAACL 2018 • Katharina Kann, Manuel Mager, Ivan Meza-Ruiz, Hinrich Schütze
Morphological segmentation for polysynthetic languages is challenging because a word may consist of many individual morphemes and training data can be extremely scarce.