Search Results for author: Stephen Mayhew

Found 21 papers, 6 papers with code

Building Low-Resource NER Models Using Non-Speaker Annotations

no code implementations • NAACL (DaSH) 2021 • Tatiana Tsygankova, Francesca Marini, Stephen Mayhew, Dan Roth

In low-resource natural language processing (NLP), the key problems are a lack of target language training data, and a lack of native speakers to create it.

Low Resource Named Entity Recognition named-entity-recognition +2

Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark

1 code implementation • arXiv 2023 • Stephen Mayhew, Terra Blevins, Shuheng Liu, Marek Šuppa, Hila Gonen, Joseph Marvin Imperial, Börje F. Karlsson, Peiqin Lin, Nikola Ljubešić, LJ Miranda, Barbara Plank, Arij Riabi, Yuval Pinter

We introduce Universal NER (UNER), an open, community-driven project to develop gold-standard NER benchmarks in many languages.

Ranked #1 on Named Entity Recognition (NER) on UNER v1 (Danish)

Cross-Lingual NER Multilingual Named Entity Recognition +3

MasakhaNER: Named Entity Recognition for African Languages

2 code implementations • 22 Mar 2021 • David Ifeoluwa Adelani, Jade Abbott, Graham Neubig, Daniel D'souza, Julia Kreutzer, Constantine Lignos, Chester Palen-Michel, Happy Buzaaba, Shruti Rijhwani, Sebastian Ruder, Stephen Mayhew, Israel Abebe Azime, Shamsuddeen Muhammad, Chris Chinenye Emezue, Joyce Nakatumba-Nabende, Perez Ogayo, Anuoluwapo Aremu, Catherine Gitau, Derguene Mbaye, Jesujoba Alabi, Seid Muhie Yimam, Tajuddeen Gwadabe, Ignatius Ezeani, Rubungo Andre Niyongabo, Jonathan Mukiibi, Verrah Otiende, Iroro Orife, Davis David, Samba Ngom, Tosin Adewumi, Paul Rayson, Mofetoluwa Adeyemi, Gerald Muriuki, Emmanuel Anebi, Chiamaka Chukwuneke, Nkiruka Odu, Eric Peter Wairagala, Samuel Oyerinde, Clemencia Siro, Tobius Saul Bateesa, Temilola Oloyede, Yvonne Wambui, Victor Akinode, Deborah Nabagereka, Maurice Katusiime, Ayodele Awokoya, Mouhamadane MBOUP, Dibora Gebreyohannes, Henok Tilaye, Kelechi Nwaike, Degaga Wolde, Abdoulaye Faye, Blessing Sibanda, Orevaoghene Ahia, Bonaventure F. P. Dossou, Kelechi Ogueji, Thierno Ibrahima DIOP, Abdoulaye Diallo, Adewale Akinfaderin, Tendai Marengereke, Salomey Osei

We take a step towards addressing the under-representation of the African continent in NLP research by creating the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages, bringing together a variety of stakeholders.

named-entity-recognition Named Entity Recognition +2

Simultaneous Translation and Paraphrase for Language Education

1 code implementation • WS 2020 • Stephen Mayhew, Klinton Bicknell, Chris Brust, Bill McDowell, Will Monroe, Burr Settles

We present the task of Simultaneous Translation and Paraphrasing for Language Education (STAPLE).

Machine Translation Multilingual NLP +2

Building Low-Resource NER Models Using Non-Speaker Annotation

no code implementations • 17 Jun 2020 • Tatiana Tsygankova, Francesca Marini, Stephen Mayhew, Dan Roth

In low-resource natural language processing (NLP), the key problems are a lack of target language training data, and a lack of native speakers to create it.

Low Resource Named Entity Recognition named-entity-recognition +2

Extending Multilingual BERT to Low-Resource Languages

no code implementations • Findings of the Association for Computational Linguistics 2020 • Zihan Wang, Karthikeyan K, Stephen Mayhew, Dan Roth

Multilingual BERT (M-BERT) has been a huge success in both supervised and zero-shot cross-lingual transfer learning.

named-entity-recognition Named Entity Recognition +3

Cross-Lingual Ability of Multilingual BERT: An Empirical Study

no code implementations • ICLR 2020 • Karthikeyan K, Zihan Wang, Stephen Mayhew, Dan Roth

Recent work has exhibited the surprising cross-lingual abilities of multilingual BERT (M-BERT) -- surprising since it is trained without any cross-lingual objective and with no aligned data.

named-entity-recognition Named Entity Recognition +2

Robust Named Entity Recognition with Truecasing Pretraining

no code implementations • 15 Dec 2019 • Stephen Mayhew, Nitish Gupta, Dan Roth

Although modern named entity recognition (NER) systems show impressive performance on standard datasets, they perform poorly when presented with noisy data.

Ranked #10 on Named Entity Recognition (NER) on WNUT 2017

named-entity-recognition Named Entity Recognition +1

Named Entity Recognition with Partially Annotated Training Data

no code implementations • CONLL 2019 • Stephen Mayhew, Snigdha Chaturvedi, Chen-Tse Tsai, Dan Roth

Supervised machine learning assumes the availability of fully-labeled data, but in many cases, such as low-resource languages, the only data available is partially annotated.

named-entity-recognition Named Entity Recognition +1

BSNLP2019 Shared Task Submission: Multisource Neural NER Transfer

no code implementations • WS 2019 • Tatiana Tsygankova, Stephen Mayhew, Dan Roth

This paper describes the Cognitive Computation (CogComp) Group's submissions to the multilingual named entity recognition shared task at the Balto-Slavic Natural Language Processing (BSNLP) Workshop.

Multilingual Named Entity Recognition named-entity-recognition +2

Legal Linking: Citation Resolution and Suggestion in Constitutional Law

1 code implementation • WS 2019 • Robert Shaffer, Stephen Mayhew

This paper describes a dataset and baseline systems for linking paragraphs from court cases to clauses or amendments in the US Constitution.

ner and pos when nothing is capitalized

no code implementations • IJCNLP 2019 • Stephen Mayhew, Tatiana Tsygankova, Dan Roth

While prior work and first impressions might suggest training a caseless model, or using a truecaser at test time, we show that the most effective strategy is a concatenation of cased and lowercased training data, producing a single model with high performance on both cased and uncased text.

Machine Translation named-entity-recognition +6

On the Strength of Character Language Models for Multilingual Named Entity Recognition

no code implementations • EMNLP 2018 • Xiaodong Yu, Stephen Mayhew, Mark Sammons, Dan Roth

Character-level patterns have been widely used as features in English Named Entity Recognition (NER) systems.

Multilingual Named Entity Recognition named-entity-recognition +2

TALEN: Tool for Annotation of Low-resource ENtities

1 code implementation • ACL 2018 • Stephen Mayhew, Dan Roth

We present a new web-based interface, TALEN, designed for named entity annotation in low-resource settings where the annotators do not speak the language.

Named Entity Recognition (NER)

112

Simple Features for Strong Performance on Named Entity Recognition in Code-Switched Twitter Data

no code implementations • WS 2018 • Devanshu Jain, Maria Kustikova, Mayank Darbari, Rishabh Gupta, Stephen Mayhew

In this work, we address the problem of Named Entity Recognition (NER) in code-switched tweets as a part of the Workshop on Computational Approaches to Linguistic Code-switching (CALCS) at ACL'18.

Language Identification named-entity-recognition +5

CogCompNLP: Your Swiss Army Knife for NLP

1 code implementation • LREC 2018 • Daniel Khashabi, Mark Sammons, Ben Zhou, Tom Redman, Christos Christodoulopoulos, Vivek Srikumar, Nicholas Rizzolo, Lev Ratinov, Guanheng Luo, Quang Do, Chen-Tse Tsai, Subhro Roy, Stephen Mayhew, Zhili Feng, John Wieting, Xiaodong Yu, Yangqiu Song, Shashank Gupta, Shyam Upadhyay, Naveen Arivazhagan, Qiang Ning, Shaoshi Ling, Dan Roth

Semantic Role Labeling

469

Cheap Translation for Cross-Lingual Named Entity Recognition

no code implementations • EMNLP 2017 • Stephen Mayhew, Chen-Tse Tsai, Dan Roth

Recent work in NLP has attempted to deal with low-resource languages but still assumed a resource level that is not present for most languages, e.g., the availability of Wikipedia in the target language.

Cross-Lingual NER named-entity-recognition +3

Cross-lingual Dataless Classification for Languages with Small Wikipedia Presence

no code implementations • 13 Nov 2016 • Yangqiu Song, Stephen Mayhew, Dan Roth

We use a word-level dictionary to convert documents in a small-Wikipedia language (SWL) to a large-Wikipedia language (LWL), and then perform cross-lingual dataless document classification (CLDDC) based on the LWL's Wikipedia.

Classification Document Classification +4