Search Results for author: Constantine Lignos

Found 15 papers, 6 papers with code

If You Build Your Own NER Scorer, Non-replicable Results Will Come

no code implementations EMNLP (insights) 2020 Constantine Lignos, Marjan Kamyab

We propose best practices to increase the replicability of NER evaluations by increasing transparency regarding the handling of improper label sequences.

Named Entity Recognition, NER

Effective Architectures for Low Resource Multilingual Named Entity Transliteration

no code implementations loresmt (AACL) 2020 Molly Moran, Constantine Lignos

In this paper, we evaluate LSTM, biLSTM, GRU, and Transformer architectures for the task of name transliteration in a many-to-one multilingual paradigm, transliterating from 590 languages to English.

Transliteration

Detecting Unassimilated Borrowings in Spanish: An Annotated Corpus and Approaches to Modeling

no code implementations ACL 2022 Elena Álvarez-Mellado, Constantine Lignos

This work presents a new resource for borrowing identification and analyzes the performance and errors of several models on this task.

Word Embeddings

ParaNames: A Massively Multilingual Entity Name Corpus

1 code implementation 28 Feb 2022 Jonne Sälevä, Constantine Lignos

This preprint describes work in progress on ParaNames, a multilingual parallel name resource consisting of names for approximately 14 million entities.

Named Entity Recognition, Translation, +1

Toward More Meaningful Resources for Lower-resourced Languages

no code implementations Findings (ACL) 2022 Constantine Lignos, Nolan Holley, Chester Palen-Michel, Jonne Sälevä

We then discuss the importance of creating annotation for lower-resourced languages in a thoughtful and ethical way that includes the languages' speakers as part of the development process.

Multilingual Open Text 1.0: Public Domain News in 44 Languages

2 code implementations 14 Jan 2022 Chester Palen-Michel, June Kim, Constantine Lignos

We present a new multilingual corpus containing text in 44 languages, many of which have relatively few existing resources for natural language processing.

SeqScore: Addressing Barriers to Reproducible Named Entity Recognition Evaluation

1 code implementation EMNLP (Eval4NLP) 2021 Chester Palen-Michel, Nolan Holley, Constantine Lignos

To address a looming crisis of unreproducible evaluation for named entity recognition, we propose guidelines and introduce SeqScore, a software package to improve reproducibility.

Named Entity Recognition, NER

Macro-Average: Rare Types Are Important Too

1 code implementation NAACL 2021 Thamme Gowda, Weiqiu You, Constantine Lignos, Jonathan May

While traditional corpus-level evaluation metrics for machine translation (MT) correlate well with fluency, they struggle to reflect adequacy.

Information Retrieval, Machine Translation, +1

Mining Wikidata for Name Resources for African Languages

1 code implementation 1 Apr 2021 Jonne Sälevä, Constantine Lignos

This work supports further development of language technology for the languages of Africa by providing a Wikidata-derived resource of name lists corresponding to common entity types (person, location, and organization).

TMR: Evaluating NER Recall on Tough Mentions

no code implementations EACL 2021 Jingxuan Tu, Constantine Lignos

We propose the Tough Mentions Recall (TMR) metrics to supplement traditional named entity recognition (NER) evaluation by examining recall on specific subsets of "tough" mentions: unseen mentions, those whose tokens or token/type combination were not observed in training, and type-confusable mentions, token sequences with multiple entity types in the test data.

Named Entity Recognition, NER

SARAL: A Low-Resource Cross-Lingual Domain-Focused Information Retrieval System for Effective Rapid Document Triage

no code implementations ACL 2019 Elizabeth Boschee, Joel Barry, Jayadev Billa, Marjorie Freedman, Thamme Gowda, Constantine Lignos, Chester Palen-Michel, Michael Pust, Banriskhem Kayang Khonglah, Srikanth Madikeri, Jonathan May, Scott Miller

In this paper we present an end-to-end cross-lingual information retrieval (CLIR) and summarization system for low-resource languages that 1) enables English speakers to search foreign language repositories of text and audio using English queries, 2) summarizes the retrieved documents in English with respect to a particular information need, and 3) provides complete transcriptions and translations as needed.

Information Retrieval, Machine Translation, +1
