Search Results for author: Paul McNamee

Found 25 papers, 3 papers with code

The Multilingual Microblog Translation Corpus: Improving and Evaluating Translation of User-Generated Text

no code implementations LREC 2022 Paul McNamee, Kevin Duh

Translation of the noisy, informal language found in social media has been an understudied problem, with a principal factor being the limited availability of translation corpora in many languages.

Machine Translation NMT +1

Findings of the IWSLT 2022 Evaluation Campaign

no code implementations IWSLT (ACL) 2022 Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Věra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nădejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe

The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.

Speech-to-Speech Translation Translation

Extending Translate-Train for ColBERT-X to African Language CLIR

no code implementations 11 Apr 2024 Eugene Yang, Dawn J. Lawrie, Paul McNamee, James Mayfield

This paper describes the submission runs from the HLTCOE team at the CIRAL CLIR tasks for African languages at FIRE 2023.

Machine Translation Retrieval +1

Overview of the TREC 2023 NeuCLIR Track

no code implementations 11 Apr 2024 Dawn Lawrie, Sean MacAvaney, James Mayfield, Paul McNamee, Douglas W. Oard, Luca Soldaini, Eugene Yang

The principal tasks are ranked retrieval of news in one of the three languages, using English topics.

Information Retrieval Retrieval

Overview of the TREC 2022 NeuCLIR Track

no code implementations 24 Apr 2023 Dawn Lawrie, Sean MacAvaney, James Mayfield, Paul McNamee, Douglas W. Oard, Luca Soldaini, Eugene Yang

This is the first year of the TREC Neural CLIR (NeuCLIR) track, which aims to study the impact of neural approaches to cross-language information retrieval.

Information Retrieval Retrieval

Dragonfly: Advances in Non-Speaker Annotation for Low Resource Languages

no code implementations LREC 2020 Cash Costello, Shelby Anderson, Caitlyn Bishop, James Mayfield, Paul McNamee

Dragonfly is an open source software tool that supports annotation of text in a low resource language by non-speakers of the language.

Benchmarking Neural and Statistical Machine Translation on Low-Resource African Languages

no code implementations LREC 2020 Kevin Duh, Paul McNamee, Matt Post, Brian Thompson

In this study, we benchmark state of the art statistical and neural machine translation systems on two African languages which do not have large amounts of resources: Somali and Swahili.

Benchmarking Machine Translation +2

Tagging Location Phrases in Text

no code implementations LREC 2020 Paul McNamee, James Mayfield, Cash Costello, Caitlyn Bishop, Shelby Anderson

Throughout this time the majority of such work has focused on detection and classification of entities into coarse-grained types like: PERSON, ORGANIZATION, and LOCATION.

Humanitarian

JHU System Description for the MADAR Arabic Dialect Identification Shared Task

no code implementations WS 2019 Tom Lippincott, Pamela Shapiro, Kevin Duh, Paul McNamee

Our submission to the MADAR shared task on Arabic dialect identification employed a language modeling technique called Prediction by Partial Matching, an ensemble of neural architectures, and sources of additional data for training word embeddings and auxiliary language models.

Dialect Identification Language Modelling +1

Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation

1 code implementation WS 2018 Brian Thompson, Huda Khayrallah, Antonios Anastasopoulos, Arya D. McCarthy, Kevin Duh, Rebecca Marvin, Paul McNamee, Jeremy Gwinnup, Tim Anderson, Philipp Koehn

To better understand the effectiveness of continued training, we analyze the major components of a neural machine translation system (the encoder, decoder, and each embedding space) and consider each component's contribution to, and capacity for, domain adaptation.

Decoder Domain Adaptation +2

Platforms for Non-speakers Annotating Names in Any Language

no code implementations ACL 2018 Ying Lin, Cash Costello, Boliang Zhang, Di Lu, Heng Ji, James Mayfield, Paul McNamee

We demonstrate two annotation platforms that allow an English speaker to annotate names for any language without knowing the language.

Using of heterogeneous corpora for training of an ASR system

no code implementations 1 Jun 2017 Jan Trmal, Gaurav Kumar, Vimal Manohar, Sanjeev Khudanpur, Matt Post, Paul McNamee

The paper summarizes the development of the LVCSR system built as a part of the Pashto speech-translation system at the SCALE (Summer Camp for Applied Language Exploration) 2015 workshop on "Speech-to-text-translation for low-resource languages".

Speech Recognition +2

Language-Independent Named Entity Analysis Using Parallel Projection and Rule-Based Disambiguation

no code implementations WS 2017 James Mayfield, Paul McNamee, Cash Costello

The 2017 shared task at the Balto-Slavic NLP workshop requires identifying coarse-grained named entities in seven languages, identifying each entity's base form, and clustering name mentions across the multilingual set of documents.

Clustering named-entity-recognition +2

Language and Dialect Discrimination Using Compression-Inspired Language Models

no code implementations WS 2016 Paul McNamee

The DSL 2016 shared task continued previous evaluations from 2014 and 2015 that facilitated the study of automated language and dialect identification.

Authorship Attribution Dialect Identification +6

Interactive Knowledge Base Population

no code implementations 31 May 2015 Travis Wolfe, Mark Dredze, James Mayfield, Paul McNamee, Craig Harman, Tim Finin, Benjamin Van Durme

Most work on building knowledge bases has focused on collecting entities and facts from as large a collection of documents as possible.

Knowledge Base Population

Creating and Curating a Cross-Language Person-Entity Linking Collection

no code implementations LREC 2012 Dawn Lawrie, James Mayfield, Paul McNamee, Douglas Oard

To stimulate research in cross-language entity linking, we present a new test collection for evaluating the accuracy of cross-language entity linking in twenty-one languages.

Entity Linking Knowledge Base Population +1
