Search Results for author: Francis Tyers

Found 40 papers, 8 papers with code

Expanding Universal Dependencies for Polysynthetic Languages: A Case of St. Lawrence Island Yupik

no code implementations NAACL (AmericasNLP) 2021 Hyunji Park, Lane Schwartz, Francis Tyers

This paper describes the development of the first Universal Dependencies (UD) treebank for St. Lawrence Island Yupik, an endangered language spoken in the Bering Strait region.

Dependency Parsing

A finite-state morphological analyser for Paraguayan Guaraní

no code implementations NAACL (AmericasNLP) 2021 Anastasia Kuznetsova, Francis Tyers

We assess the efficacy of the approach on publicly available Wikipedia and Bible corpora; the naive coverage of the analyser reaches 86% on the Wikipedia corpus and 91% on the Bible corpus.

A survey of part-of-speech tagging approaches applied to K’iche’

no code implementations NAACL (AmericasNLP) 2021 Francis Tyers, Nick Howell

We study the performance of several popular neural part-of-speech taggers from the Universal Dependencies ecosystem on Mayan languages using a small corpus of 1435 annotated K’iche’ sentences consisting of approximately 10,000 tokens, with encouraging results: F_1 scores of 93%+ on lemmatisation, part-of-speech and morphological feature assignment.

Part-Of-Speech Tagging

A corpus of K’iche’ annotated for morphosyntactic structure

no code implementations NAACL (AmericasNLP) 2021 Francis Tyers, Robert Henderson

This article describes a collection of sentences in K’iche’ annotated for morphology and syntax.

Universal Dependency Treebank for Xibe

no code implementations UDW (COLING) 2020 He Zhou, Juyeon Chung, Sandra Kübler, Francis Tyers

We present our work of constructing the first treebank for the Xibe language following the Universal Dependencies (UD) annotation scheme.

Curriculum optimization for low-resource speech recognition

no code implementations 17 Feb 2022 Anastasia Kuznetsova, Anurag Kumar, Jennifer Drexler Fox, Francis Tyers

Modern end-to-end speech recognition models show astonishing results in transcribing audio signals into written text.

Speech Recognition

Evaluating Multiway Multilingual NMT in the Turkic Languages

1 code implementation WMT (EMNLP) 2021 Jamshidbek Mirzakhalov, Anoop Babu, Aigiz Kunafin, Ahsan Wahab, Behzod Moydinboyev, Sardana Ivanova, Mokhiyakhon Uzokova, Shaxnoza Pulatova, Duygu Ataman, Julia Kreutzer, Francis Tyers, Orhan Firat, John Licato, Sriram Chellappan

Then, we train 26 bilingual baselines as well as a multi-way neural MT (MNMT) model using the corpus and perform an extensive analysis using automatic metrics as well as human evaluations.

Machine Translation

Do RNN States Encode Abstract Phonological Alternations?

no code implementations NAACL 2021 Miikka Silfverberg, Francis Tyers, Garrett Nicolai, Mans Hulden

Sequence-to-sequence models have delivered impressive results in word formation tasks such as morphological inflection, often learning to model subtle morphophonological details with limited training data.

Morphological Inflection

Do RNN States Encode Abstract Phonological Processes?

no code implementations 1 Apr 2021 Miikka Silfverberg, Francis Tyers, Garrett Nicolai, Mans Hulden

Sequence-to-sequence models have delivered impressive results in word formation tasks such as morphological inflection, often learning to model subtle morphophonological details with limited training data.

Morphological Inflection

An Unsupervised Method for Weighting Finite-state Morphological Analyzers

2 code implementations LREC 2020 Amr Keleg, Francis Tyers, Nick Howell, Tommi Pirinen

In this paper, we have developed a method for weighting a morphological analyzer built using finite state transducers in order to disambiguate its results.

Morphological Analysis

Improving the Language Model for Low-Resource ASR with Online Text Corpora

no code implementations LREC 2020 Nils Hjortnaes, Timofey Arkhangelskiy, Niko Partanen, Michael Rießler, Francis Tyers

Previous experiments showed that transfer learning using DeepSpeech can improve the accuracy of a speech recognizer for Komi, though the error rate remained very high.

Automatic Speech Recognition Transfer Learning

A Finite-State Morphological Analyser for Evenki

no code implementations LREC 2020 Anna Zueva, Anastasia Kuznetsova, Francis Tyers

Since a part of the corpora belongs to texts in Evenki dialects, a version of the analyser with relaxed rules is developed for processing dialectal features.

Morphological Analysis

Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection

no code implementations LREC 2020 Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Jan Hajič, Christopher D. Manning, Sampo Pyysalo, Sebastian Schuster, Francis Tyers, Daniel Zeman

Universal Dependencies is an open community effort to create cross-linguistically consistent treebank annotation for many languages within a dependency-based lexicalist framework.

Building a Morphological Analyser for Laz

no code implementations RANLP 2019 Esra Onal, Francis Tyers

This study is an attempt to contribute to the documentation and revitalization of the endangered Laz language, a member of the South Caucasian language family spoken mainly on the northeastern coastline of Turkey.

A New Annotation Scheme for the Sejong Part-of-speech Tagged Corpus

1 code implementation WS 2019 Jungyeul Park, Francis Tyers

In this paper we present a new annotation scheme for the Sejong part-of-speech tagged corpus based on Universal Dependencies style annotation.

Morphological Analysis Named Entity Recognition +1

A Report on the Third VarDial Evaluation Campaign

no code implementations WS 2019 Marcos Zampieri, Shervin Malmasi, Yves Scherrer, Tanja Samardžić, Francis Tyers, Miikka Silfverberg, Natalia Klyueva, Tung-Le Pan, Chu-Ren Huang, Radu Tudor Ionescu, Andrei M. Butnaru, Tommi Jauhiainen

In this paper, we present the findings of the Third VarDial Evaluation Campaign organized as part of the sixth edition of the workshop on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects (VarDial), co-located with NAACL 2019.

14 Dialect Identification +1

Multi-source synthetic treebank creation for improved cross-lingual dependency parsing

2 code implementations WS 2018 Francis Tyers, Mariya Sheyanova, Aleksandra Martynova, Pavel Stepachev, Konstantin Vinogorodskiy

This paper describes a method of creating synthetic treebanks for cross-lingual dependency parsing using a combination of machine translation (including pivot translation), annotation projection and the spanning tree algorithm.

Dependency Parsing Machine Translation +1

A prototype finite-state morphological analyser for Chukchi

no code implementations COLING 2018 Vasilisa Andriyanets, Francis Tyers

An error evaluation of 100 randomly selected tokens from the corpus that were not covered by the analyser shows that most of the morphological processes are covered and that the majority of errors are caused by a limited stem lexicon.

Universal Dependencies

no code implementations CL (ACL) 2021 Joakim Nivre, Daniel Zeman, Filip Ginter, Francis Tyers

Universal Dependencies (UD) is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages.

Universal Dependencies for Turkish

no code implementations COLING 2016 Umut Sulubacak, Memduh Gokirmak, Francis Tyers, Çağrı Çöltekin, Joakim Nivre, Gülşen Eryiğit

The Universal Dependencies (UD) project was conceived after the substantial recent interest in unifying annotation schemes across languages.

A Finite-State Morphological Analyser for Sindhi

no code implementations LREC 2016 Raveesh Motlani, Francis Tyers, Dipti Sharma

Morphological analysis is a fundamental task in natural-language processing, which is used in other NLP applications such as part-of-speech tagging, syntactic parsing, information retrieval, machine translation, etc.

Information Retrieval Machine Translation +3

A Finite-state Morphological Analyser for Tuvan

no code implementations LREC 2016 Francis Tyers, Aziyana Bayyr-ool, Aelita Salchak, Jonathan Washington

This paper describes the development of free/open-source finite-state morphological transducers for Tuvan, a Turkic language spoken in and around the Tuvan Republic in Russia.

Finite-state morphological transducers for three Kypchak languages

no code implementations LREC 2014 Jonathan Washington, Ilnar Salimzyanov, Francis Tyers

This paper describes the development of free/open-source finite-state morphological transducers for three Turkic languages (Kazakh, Tatar, and Kumyk), representing one language from each of the three sub-branches of the Kypchak branch of Turkic.

Machine Translation
