Search Results for author: Stephen Mayhew

Found 21 papers, 6 papers with code

Building Low-Resource NER Models Using Non-Speaker Annotations

no code implementations NAACL (DaSH) 2021 Tatiana Tsygankova, Francesca Marini, Stephen Mayhew, Dan Roth

In low-resource natural language processing (NLP), the key problems are a lack of target language training data, and a lack of native speakers to create it.

Low Resource Named Entity Recognition named-entity-recognition +2

MasakhaNER: Named Entity Recognition for African Languages

2 code implementations 22 Mar 2021 David Ifeoluwa Adelani, Jade Abbott, Graham Neubig, Daniel D'souza, Julia Kreutzer, Constantine Lignos, Chester Palen-Michel, Happy Buzaaba, Shruti Rijhwani, Sebastian Ruder, Stephen Mayhew, Israel Abebe Azime, Shamsuddeen Muhammad, Chris Chinenye Emezue, Joyce Nakatumba-Nabende, Perez Ogayo, Anuoluwapo Aremu, Catherine Gitau, Derguene Mbaye, Jesujoba Alabi, Seid Muhie Yimam, Tajuddeen Gwadabe, Ignatius Ezeani, Rubungo Andre Niyongabo, Jonathan Mukiibi, Verrah Otiende, Iroro Orife, Davis David, Samba Ngom, Tosin Adewumi, Paul Rayson, Mofetoluwa Adeyemi, Gerald Muriuki, Emmanuel Anebi, Chiamaka Chukwuneke, Nkiruka Odu, Eric Peter Wairagala, Samuel Oyerinde, Clemencia Siro, Tobius Saul Bateesa, Temilola Oloyede, Yvonne Wambui, Victor Akinode, Deborah Nabagereka, Maurice Katusiime, Ayodele Awokoya, Mouhamadane MBOUP, Dibora Gebreyohannes, Henok Tilaye, Kelechi Nwaike, Degaga Wolde, Abdoulaye Faye, Blessing Sibanda, Orevaoghene Ahia, Bonaventure F. P. Dossou, Kelechi Ogueji, Thierno Ibrahima DIOP, Abdoulaye Diallo, Adewale Akinfaderin, Tendai Marengereke, Salomey Osei

We take a step towards addressing the under-representation of the African continent in NLP research by creating the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages, bringing together a variety of stakeholders.

named-entity-recognition Named Entity Recognition +2

Building Low-Resource NER Models Using Non-Speaker Annotation

no code implementations 17 Jun 2020 Tatiana Tsygankova, Francesca Marini, Stephen Mayhew, Dan Roth

In low-resource natural language processing (NLP), the key problems are a lack of target language training data, and a lack of native speakers to create it.

Low Resource Named Entity Recognition named-entity-recognition +2

Cross-Lingual Ability of Multilingual BERT: An Empirical Study

no code implementations ICLR 2020 Karthikeyan K, Zihan Wang, Stephen Mayhew, Dan Roth

Recent work has exhibited the surprising cross-lingual abilities of multilingual BERT (M-BERT) -- surprising since it is trained without any cross-lingual objective and with no aligned data.

named-entity-recognition Named Entity Recognition +2

Robust Named Entity Recognition with Truecasing Pretraining

no code implementations 15 Dec 2019 Stephen Mayhew, Nitish Gupta, Dan Roth

Although modern named entity recognition (NER) systems show impressive performance on standard datasets, they perform poorly when presented with noisy data.

named-entity-recognition Named Entity Recognition +1

Named Entity Recognition with Partially Annotated Training Data

no code implementations CONLL 2019 Stephen Mayhew, Snigdha Chaturvedi, Chen-Tse Tsai, Dan Roth

Supervised machine learning assumes the availability of fully-labeled data, but in many cases, such as low-resource languages, the only data available is partially annotated.

named-entity-recognition Named Entity Recognition +1

BSNLP2019 Shared Task Submission: Multisource Neural NER Transfer

no code implementations WS 2019 Tatiana Tsygankova, Stephen Mayhew, Dan Roth

This paper describes the Cognitive Computation (CogComp) Group's submissions to the multilingual named entity recognition shared task at the Balto-Slavic Natural Language Processing (BSNLP) Workshop.

Multilingual Named Entity Recognition named-entity-recognition +2

Legal Linking: Citation Resolution and Suggestion in Constitutional Law

1 code implementation WS 2019 Robert Shaffer, Stephen Mayhew

This paper describes a dataset and baseline systems for linking paragraphs from court cases to clauses or amendments in the US Constitution.

NER and POS When Nothing Is Capitalized

no code implementations IJCNLP 2019 Stephen Mayhew, Tatiana Tsygankova, Dan Roth

While prior work and first impressions might suggest training a caseless model, or using a truecaser at test time, we show that the most effective strategy is a concatenation of cased and lowercased training data, producing a single model with high performance on both cased and uncased text.

Machine Translation named-entity-recognition +6

TALEN: Tool for Annotation of Low-resource ENtities

1 code implementation ACL 2018 Stephen Mayhew, Dan Roth

We present a new web-based interface, TALEN, designed for named entity annotation in low-resource settings where the annotators do not speak the language.

Named Entity Recognition (NER)

Simple Features for Strong Performance on Named Entity Recognition in Code-Switched Twitter Data

no code implementations WS 2018 Devanshu Jain, Maria Kustikova, Mayank Darbari, Rishabh Gupta, Stephen Mayhew

In this work, we address the problem of Named Entity Recognition (NER) in code-switched tweets as a part of the Workshop on Computational Approaches to Linguistic Code-switching (CALCS) at ACL'18.

Language Identification named-entity-recognition +4

Cheap Translation for Cross-Lingual Named Entity Recognition

no code implementations EMNLP 2017 Stephen Mayhew, Chen-Tse Tsai, Dan Roth

Recent work in NLP has attempted to deal with low-resource languages but still assumed a resource level that is not present for most languages, e.g., the availability of Wikipedia in the target language.

Cross-Lingual NER named-entity-recognition +3

Cross-lingual Dataless Classification for Languages with Small Wikipedia Presence

no code implementations 13 Nov 2016 Yangqiu Song, Stephen Mayhew, Dan Roth

We use a word-level dictionary to convert documents in a small-Wikipedia language (SWL) to a large-Wikipedia language (LWL), and then perform cross-lingual dataless document classification (CLDDC) based on the LWL's Wikipedia.

Classification Document Classification +4

Transliteration in Any Language with Surrogate Languages

no code implementations 14 Sep 2016 Stephen Mayhew, Christos Christodoulopoulos, Dan Roth

We introduce a method for transliteration generation that can produce transliterations in every language.


ILLINOISCLOUDNLP: Text Analytics Services in the Cloud

no code implementations LREC 2014 Hao Wu, Zhiye Fei, Aaron Dai, Mark Sammons, Dan Roth, Stephen Mayhew

Natural Language Processing (NLP) continues to grow in popularity in a range of research and commercial applications.

Knowledge Base Population
