no code implementations • LREC 2022 • Paul McNamee, Kevin Duh
Translation of the noisy, informal language found in social media has been an understudied problem, with a principal factor being the limited availability of translation corpora in many languages.
no code implementations • IWSLT (ACL) 2022 • Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe
The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.
no code implementations • 24 Apr 2023 • Dawn Lawrie, Sean MacAvaney, James Mayfield, Paul McNamee, Douglas W. Oard, Luca Soldaini, Eugene Yang
This is the first year of the TREC Neural CLIR (NeuCLIR) track, which aims to study the impact of neural approaches to cross-language information retrieval.
1 code implementation • 20 Jan 2022 • Suraj Nair, Eugene Yang, Dawn Lawrie, Kevin Duh, Paul McNamee, Kenton Murray, James Mayfield, Douglas W. Oard
These models have improved the effectiveness of retrieval systems well beyond that of lexical term-matching models such as BM25.
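For reference, the BM25 baseline mentioned above is a purely lexical ranking function. A minimal sketch (standard BM25 with the common default parameters k1=1.5, b=0.75; the tokenized toy documents are illustrative only):

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each document (a list of tokens) against the query
    with the classic BM25 ranking function."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # document frequency of each query term
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue  # unseen terms contribute nothing
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            num = tf[t] * (k1 + 1)
            den = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * num / den
        scores.append(s)
    return scores
```

Because the score depends only on exact term overlap, semantically related documents with no shared vocabulary score zero, which is precisely the gap neural retrieval models address.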
no code implementations • LREC 2020 • Cash Costello, Shelby Anderson, Caitlyn Bishop, James Mayfield, Paul McNamee
Dragonfly is an open source software tool that supports annotation of text in a low resource language by non-speakers of the language.
no code implementations • LREC 2020 • Kevin Duh, Paul McNamee, Matt Post, Brian Thompson
In this study, we benchmark state-of-the-art statistical and neural machine translation systems on two African languages which do not have large amounts of resources: Somali and Swahili.
no code implementations • LREC 2020 • Paul McNamee, James Mayfield, Cash Costello, Caitlyn Bishop, Shelby Anderson
Throughout this time the majority of such work has focused on detection and classification of entities into coarse-grained types such as PERSON, ORGANIZATION, and LOCATION.
no code implementations • WS 2019 • Tom Lippincott, Pamela Shapiro, Kevin Duh, Paul McNamee
Our submission to the MADAR shared task on Arabic dialect identification employed a language modeling technique called Prediction by Partial Matching, an ensemble of neural architectures, and sources of additional data for training word embeddings and auxiliary language models.
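Prediction by Partial Matching is a compression-based technique: each dialect gets its own adaptive character model, and a text is assigned to the dialect whose model "compresses" it best. The toy below is a simplified stand-in, not PPM proper (it uses fixed-order character n-grams with add-one smoothing rather than PPM's adaptive escape probabilities), and all names are illustrative:

```python
import math
from collections import Counter

class CharNgramLM:
    """Simplified character n-gram model with add-one smoothing.
    Captures only the compression-as-classification idea behind PPM."""
    def __init__(self, order=3):
        self.order = order
        self.ngrams = Counter()
        self.contexts = Counter()
        self.vocab = set()

    def train(self, text):
        text = "^" * (self.order - 1) + text  # pad the start
        for i in range(len(text) - self.order + 1):
            gram = text[i:i + self.order]
            self.ngrams[gram] += 1
            self.contexts[gram[:-1]] += 1
            self.vocab.add(gram[-1])

    def cross_entropy(self, text):
        """Average bits per character under this model."""
        text = "^" * (self.order - 1) + text
        V = len(self.vocab) + 1
        total, n = 0.0, 0
        for i in range(len(text) - self.order + 1):
            gram = text[i:i + self.order]
            p = (self.ngrams[gram] + 1) / (self.contexts[gram[:-1]] + V)
            total -= math.log2(p)
            n += 1
        return total / n

def identify(text, models):
    # the dialect whose model compresses the text best (lowest
    # cross-entropy) wins
    return min(models, key=lambda d: models[d].cross_entropy(text))
```

In the full PPM scheme the model backs off through variable-length contexts at encoding time, which makes it robust for short, noisy dialectal text.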
no code implementations • NAACL 2019 • Xuan Zhang, Pamela Shapiro, Gaurav Kumar, Paul McNamee, Marine Carpuat, Kevin Duh
We introduce a curriculum learning approach to adapt generic neural machine translation models to a specific domain.
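The core mechanic of such a curriculum is to rank training examples by how relevant they are to the target domain and then admit them to training in stages. A hypothetical sketch (the `score` function standing in for whatever domain-relevance measure is used, e.g. a language-model-based similarity score; the cumulative-phase scheme is one simple design, not necessarily the paper's exact schedule):

```python
def curriculum_phases(corpus, score, num_phases=3):
    """Sort examples by domain-relevance `score` (higher = more
    in-domain) and yield cumulative training pools: phase k holds the
    top k/num_phases fraction, so training starts on the most
    in-domain data and gradually admits the rest."""
    ranked = sorted(corpus, key=score, reverse=True)
    for k in range(1, num_phases + 1):
        cutoff = round(len(ranked) * k / num_phases)
        yield ranked[:cutoff]
```

The fine-tuning loop then runs a few epochs on each successive pool, so the generic model sees the most in-domain sentence pairs first instead of a randomly shuffled mix.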
1 code implementation • 2 Nov 2018 • Xuan Zhang, Gaurav Kumar, Huda Khayrallah, Kenton Murray, Jeremy Gwinnup, Marianna J. Martindale, Paul McNamee, Kevin Duh, Marine Carpuat
Machine translation systems based on deep neural networks are expensive to train.
1 code implementation • WS 2018 • Brian Thompson, Huda Khayrallah, Antonios Anastasopoulos, Arya D. McCarthy, Kevin Duh, Rebecca Marvin, Paul McNamee, Jeremy Gwinnup, Tim Anderson, Philipp Koehn
To better understand the effectiveness of continued training, we analyze the major components of a neural machine translation system (the encoder, decoder, and each embedding space) and consider each component's contribution to, and capacity for, domain adaptation.
no code implementations • ACL 2018 • Ying Lin, Cash Costello, Boliang Zhang, Di Lu, Heng Ji, James Mayfield, Paul McNamee
We demonstrate two annotation platforms that allow an English speaker to annotate names for any language without knowing the language.
no code implementations • 1 Jun 2017 • Jan Trmal, Gaurav Kumar, Vimal Manohar, Sanjeev Khudanpur, Matt Post, Paul McNamee
The paper summarizes the development of the LVCSR system built as a part of the Pashto speech-translation system at the SCALE (Summer Camp for Applied Language Exploration) 2015 workshop on "Speech-to-text-translation for low-resource languages".
no code implementations • WS 2017 • James Mayfield, Paul McNamee, Cash Costello
The 2017 shared task at the Balto-Slavic NLP workshop requires identifying coarse-grained named entities in seven languages, identifying each entity's base form, and clustering name mentions across the multilingual set of documents.
no code implementations • WS 2016 • Paul McNamee
The DSL 2016 shared task continued previous evaluations from 2014 and 2015 that facilitated the study of automated language and dialect identification.
no code implementations • 31 May 2015 • Travis Wolfe, Mark Dredze, James Mayfield, Paul McNamee, Craig Harman, Tim Finin, Benjamin Van Durme
Most work on building knowledge bases has focused on collecting entities and facts from as large a collection of documents as possible.
no code implementations • LREC 2012 • Dawn Lawrie, James Mayfield, Paul McNamee, Douglas Oard
To stimulate research in cross-language entity linking, we present a new test collection for evaluating the accuracy of cross-language entity linking in twenty-one languages.