no code implementations • 20 Dec 2022 • Eugene Yang, Suraj Nair, Dawn Lawrie, James Mayfield, Douglas W. Oard
By stacking language-specific adapters pretrained on language tasks with task-specific adapters, prior work has shown that adapter-enhanced models outperform fine-tuning the entire model when transferring across languages on various NLP tasks.
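The bottleneck adapter design this line of work builds on can be sketched generically: a small down-projection, a nonlinearity, an up-projection, and a residual connection, so the output keeps the model dimension and adapters can be stacked. This NumPy sketch reflects the standard adapter architecture, not this paper's code; the names `adapter`, `W_down`, and `W_up` are illustrative.

```python
import numpy as np

def adapter(h, W_down, W_up):
    """Bottleneck adapter: down-project, ReLU, up-project, residual add."""
    z = np.maximum(0.0, h @ W_down)  # nonlinearity in the low-rank bottleneck
    return h + z @ W_up              # residual keeps the d_model shape

rng = np.random.default_rng(0)
d_model, d_bottleneck = 16, 4
# Near-zero initialization: a fresh adapter starts close to the identity map
W_down = rng.normal(scale=0.02, size=(d_model, d_bottleneck))
W_up = rng.normal(scale=0.02, size=(d_bottleneck, d_model))

h = rng.normal(size=(3, d_model))  # three token representations
out = adapter(h, W_down, W_up)
print(out.shape)  # (3, 16): same shape as the input
```

Because the output shape matches the input, a language adapter and a task adapter can simply be applied one after the other inside each transformer layer, which is what makes cross-lingual stacking possible.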
no code implementations • 3 Sep 2022 • Dawn Lawrie, Eugene Yang, Douglas W. Oard, James Mayfield
Providing access to information across languages has been a goal of Information Retrieval (IR) for decades.
no code implementations • 25 Apr 2022 • Eugene Yang, Suraj Nair, Ramraj Chandradevan, Rebecca Iglesias-Flores, Douglas W. Oard
Pretrained language models have improved effectiveness on numerous tasks, including ad-hoc retrieval.
no code implementations • 20 Jan 2022 • Suraj Nair, Eugene Yang, Dawn Lawrie, Kevin Duh, Paul McNamee, Kenton Murray, James Mayfield, Douglas W. Oard
Pretrained language models have improved the effectiveness of retrieval systems well beyond that of lexical term matching models such as BM25.
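The lexical baseline named here, Okapi BM25, scores a document by summing, over query terms, an inverse-document-frequency weight times a saturating term-frequency component with length normalization. A minimal in-memory sketch (the corpus, query, and function names are illustrative, not from the paper):

```python
import math
from collections import Counter

def bm25_score(query_terms, doc, corpus, k1=1.2, b=0.75):
    """Okapi BM25 score of one tokenized doc against a tokenized query."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    tf = Counter(doc)
    score = 0.0
    for t in query_terms:
        df = sum(1 for d in corpus if t in d)  # document frequency of t
        if df == 0:
            continue
        idf = math.log(1 + (N - df + 0.5) / (df + 0.5))
        f = tf[t]
        # Term-frequency saturation with document-length normalization
        score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(doc) / avgdl))
    return score

corpus = [
    "the quick brown fox".split(),
    "information retrieval with bm25".split(),
    "neural retrieval models".split(),
]
q = ["retrieval", "bm25"]
scores = [bm25_score(q, d, corpus) for d in corpus]
best = max(range(len(corpus)), key=lambda i: scores[i])
print(best)  # the document matching both query terms ranks first
```

Because BM25 only rewards exact term overlap, it cannot credit a document that expresses the query's meaning in different words, which is the gap neural retrieval models address.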
no code implementations • 20 Nov 2021 • Behrooz Mansouri, Douglas W. Oard, Anurag Agarwal, Richard Zanibbi
There are now several test collections for the formula retrieval task, in which a system's goal is to identify useful mathematical formulae to show in response to a query posed as a formula.
no code implementations • 10 Nov 2021 • Petra Galuščáková, Douglas W. Oard, Suraj Nair
Two key assumptions shape the usual view of ranked retrieval: (1) that the searcher can choose words for their query that might appear in the documents that they wish to see, and (2) that ranking retrieved documents will suffice because the searcher will be able to recognize those which they wished to find.
no code implementations • ACL 2021 • Yanda Chen, Chris Kedzie, Suraj Nair, Petra Galuščáková, Rui Zhang, Douglas W. Oard, Kathleen McKeown
This paper proposes an approach to cross-language sentence selection in a low-resource setting.
no code implementations • 2 Feb 2021 • Elizabeth Salesky, Matthew Wiesner, Jacob Bremerman, Roldano Cattoni, Matteo Negri, Marco Turchi, Douglas W. Oard, Matt Post
We present the Multilingual TEDx corpus, built to support speech recognition (ASR) and speech translation (ST) research across many non-English source languages.
no code implementations • 14 Nov 2020 • Jason R. Baron, Mahmoud F. Sayed, Douglas W. Oard
At present, the review process for material that is exempt from disclosure under the Freedom of Information Act (FOIA) in the United States of America, and under many similar government transparency regimes worldwide, is entirely manual.