Search Results for author: Francis Tyers

Found 46 papers, 10 papers with code

Finite-state morphological transducers for three Kypchak languages

no code implementations LREC 2014 Jonathan Washington, Ilnar Salimzyanov, Francis Tyers

This paper describes the development of free/open-source finite-state morphological transducers for three Turkic languages (Kazakh, Tatar, and Kumyk), representing one language from each of the three sub-branches of the Kypchak branch of Turkic.

Machine Translation

A Finite-State Morphological Analyser for Sindhi

no code implementations LREC 2016 Raveesh Motlani, Francis Tyers, Dipti Sharma

Morphological analysis is a fundamental task in natural-language processing, used in other NLP applications such as part-of-speech tagging, syntactic parsing, information retrieval, and machine translation.

Information Retrieval LEMMA +5

A Finite-state Morphological Analyser for Tuvan

no code implementations LREC 2016 Francis Tyers, Aziyana Bayyr-ool, Aelita Salchak, Jonathan Washington

This paper describes the development of free/open-source finite-state morphological transducers for Tuvan, a Turkic language spoken in and around the Tuvan Republic in Russia.

Universal Dependencies for Turkish

no code implementations COLING 2016 Umut Sulubacak, Memduh Gokirmak, Francis Tyers, Çağrı Çöltekin, Joakim Nivre, Gülşen Eryiğit

The Universal Dependencies (UD) project was conceived after the substantial recent interest in unifying annotation schemes across languages.

Universal Dependencies

no code implementations CL (ACL) 2021 Joakim Nivre, Daniel Zeman, Filip Ginter, Francis Tyers

Universal Dependencies (UD) is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages.

A prototype finite-state morphological analyser for Chukchi

no code implementations COLING 2018 Vasilisa Andriyanets, Francis Tyers

An error evaluation of 100 tokens randomly selected from the corpus that were not covered by the analyser shows that most of the morphological processes are covered and that the majority of errors are caused by a limited stem lexicon.

Multi-source synthetic treebank creation for improved cross-lingual dependency parsing

2 code implementations WS 2018 Francis Tyers, Mariya Sheyanova, Aleksandra Martynova, Pavel Stepachev, Konstantin Vinogorodskiy

This paper describes a method of creating synthetic treebanks for cross-lingual dependency parsing using a combination of machine translation (including pivot translation), annotation projection and the spanning tree algorithm.

Dependency Parsing Machine Translation +2

A Report on the Third VarDial Evaluation Campaign

no code implementations WS 2019 Marcos Zampieri, Shervin Malmasi, Yves Scherrer, Tanja Samardžić, Francis Tyers, Miikka Silfverberg, Natalia Klyueva, Tung-Le Pan, Chu-Ren Huang, Radu Tudor Ionescu, Andrei M. Butnaru, Tommi Jauhiainen

In this paper, we present the findings of the Third VarDial Evaluation Campaign organized as part of the sixth edition of the workshop on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects (VarDial), co-located with NAACL 2019.

Dialect Identification Morphological Analysis

A New Annotation Scheme for the Sejong Part-of-speech Tagged Corpus

1 code implementation WS 2019 Jungyeul Park, Francis Tyers

In this paper we present a new annotation scheme for the Sejong part-of-speech tagged corpus based on Universal Dependencies style annotation.

Morphological Analysis named-entity-recognition +3

Building a Morphological Analyser for Laz

no code implementations RANLP 2019 Esra Onal, Francis Tyers

This study is an attempt to contribute to documentation and revitalization efforts for the endangered Laz language, a member of the South Caucasian language family spoken mainly on the northeastern coastline of Turkey.

Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection

no code implementations LREC 2020 Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Jan Hajič, Christopher D. Manning, Sampo Pyysalo, Sebastian Schuster, Francis Tyers, Daniel Zeman

Universal Dependencies is an open community effort to create cross-linguistically consistent treebank annotation for many languages within a dependency-based lexicalist framework.

An Unsupervised Method for Weighting Finite-state Morphological Analyzers

2 code implementations LREC 2020 Amr Keleg, Francis Tyers, Nick Howell, Tommi Pirinen

In this paper, we have developed a method for weighting a morphological analyzer built using finite state transducers in order to disambiguate its results.

Morphological Analysis

A Finite-State Morphological Analyser for Evenki

no code implementations LREC 2020 Anna Zueva, Anastasia Kuznetsova, Francis Tyers

Since part of the corpora consists of texts in Evenki dialects, a version of the analyser with relaxed rules is developed for processing dialectal features.

Morphological Analysis valid

Improving the Language Model for Low-Resource ASR with Online Text Corpora

no code implementations LREC 2020 Nils Hjortnaes, Timofey Arkhangelskiy, Niko Partanen, Michael Rießler, Francis Tyers

Previous experiments showed that transfer learning using DeepSpeech can improve the accuracy of a speech recognizer for Komi, though the error rate remained very high.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Do RNN States Encode Abstract Phonological Processes?

no code implementations 1 Apr 2021 Miikka Silfverberg, Francis Tyers, Garrett Nicolai, Mans Hulden

Sequence-to-sequence models have delivered impressive results in word formation tasks such as morphological inflection, often learning to model subtle morphophonological details with limited training data.

Memorization Morphological Inflection

Do RNN States Encode Abstract Phonological Alternations?

no code implementations NAACL 2021 Miikka Silfverberg, Francis Tyers, Garrett Nicolai, Mans Hulden

Sequence-to-sequence models have delivered impressive results in word formation tasks such as morphological inflection, often learning to model subtle morphophonological details with limited training data.

Memorization Morphological Inflection

Curriculum optimization for low-resource speech recognition

no code implementations 17 Feb 2022 Anastasia Kuznetsova, Anurag Kumar, Jennifer Drexler Fox, Francis Tyers

Modern end-to-end speech recognition models show astonishing results in transcribing audio signals into written text.

speech-recognition Speech Recognition

A Universal Dependencies Treebank of Ancient Hebrew

no code implementations LREC 2022 Daniel Swanson, Francis Tyers

In this paper we present the initial construction of a Universal Dependencies treebank with morphological annotations of Ancient Hebrew containing portions of the Hebrew Scriptures (1579 sentences, 27K tokens) for use in comparative study with ancient translations and for analysis of the development of Hebrew syntax.

Handling Stress in Finite-State Morphological Analyzers for Ancient Greek and Ancient Hebrew

no code implementations LT4HALA (LREC) 2022 Daniel Swanson, Francis Tyers

However, these phenomena can be modeled fairly easily if the lexicon’s internal representation is allowed to contain more information than the pure phonological form.

Morphological Analysis

A corpus of K’iche’ annotated for morphosyntactic structure

no code implementations NAACL (AmericasNLP) 2021 Francis Tyers, Robert Henderson

This article describes a collection of sentences in K’iche’ annotated for morphology and syntax.

A survey of part-of-speech tagging approaches applied to K’iche’

no code implementations NAACL (AmericasNLP) 2021 Francis Tyers, Nick Howell

We study the performance of several popular neural part-of-speech taggers from the Universal Dependencies ecosystem on Mayan languages using a small corpus of 1435 annotated K’iche’ sentences consisting of approximately 10,000 tokens, with encouraging results: F1 scores of 93%+ on lemmatisation, part-of-speech and morphological feature assignment.

Part-Of-Speech Tagging Sentence

A finite-state morphological analyser for Paraguayan Guaraní

no code implementations NAACL (AmericasNLP) 2021 Anastasia Kuznetsova, Francis Tyers

We assess the efficacy of the approach on publicly available Wikipedia and Bible corpora; the naive coverage of the analyser reaches 86% on the Wikipedia corpus and 91% on the Bible corpus.

Expanding Universal Dependencies for Polysynthetic Languages: A Case of St. Lawrence Island Yupik

no code implementations NAACL (AmericasNLP) 2021 Hyunji Park, Lane Schwartz, Francis Tyers

This paper describes the development of the first Universal Dependencies (UD) treebank for St. Lawrence Island Yupik, an endangered language spoken in the Bering Strait region.

Dependency Parsing

Universal Dependencies for Western Sierra Puebla Nahuatl

no code implementations LREC 2022 Robert Pugh, Marivel Huerta Mendez, Mitsuya Sasaki, Francis Tyers

We present a morpho-syntactically-annotated corpus of Western Sierra Puebla Nahuatl that conforms to the annotation guidelines of the Universal Dependencies project.

Universal Dependency Treebank for Xibe

no code implementations UDW (COLING) 2020 He Zhou, Juyeon Chung, Sandra Kübler, Francis Tyers

We present our work on constructing the first treebank for the Xibe language following the Universal Dependencies (UD) annotation scheme.

A Free/Open-Source Morphological Analyser and Generator for Sakha

1 code implementation LREC 2022 Sardana Ivanova, Jonathan Washington, Francis Tyers

We present, to our knowledge, the first ever published morphological analyser and generator for Sakha, a marginalised language of Siberia.

How to encode arbitrarily complex morphology in word embeddings, no corpus needed

no code implementations FieldMatters (COLING) 2022 Lane Schwartz, Coleman Haley, Francis Tyers

In this paper, we present a straightforward technique for constructing interpretable word embeddings from morphologically analyzed examples (such as interlinear glosses) for all of the world’s languages.

Word Embeddings
