Search Results for author: Douglas W. Oard

Found 16 papers, 4 papers with code

Neural Approaches to Multilingual Information Retrieval

1 code implementation • 3 Sep 2022 • Dawn Lawrie, Eugene Yang, Douglas W. Oard, James Mayfield

Providing access to information across languages has been a goal of Information Retrieval (IR) for decades.

Document Translation, Information Retrieval +3

Translate-Distill: Learning Cross-Language Dense Retrieval by Translation and Distillation

1 code implementation • 9 Jan 2024 • Eugene Yang, Dawn Lawrie, James Mayfield, Douglas W. Oard, Scott Miller

Applying a similar knowledge distillation approach to training an efficient dual-encoder model for Cross-Language Information Retrieval (CLIR), where queries and documents are in different languages, is challenging due to the lack of a sufficiently large training collection when the query and document languages differ.

Information Retrieval, Knowledge Distillation +2

Known by the Company it Keeps: Proximity-Based Indexing for Physical Content in Archival Repositories

1 code implementation • 30 May 2023 • Douglas W. Oard

Despite the plethora of born-digital content, vast troves of important content remain accessible only on physical media such as paper or microfilm.

Providing More Efficient Access To Government Records: A Use Case Involving Application of Machine Learning to Improve FOIA Review for the Deliberative Process Privilege

no code implementations • 14 Nov 2020 • Jason R. Baron, Mahmoud F. Sayed, Douglas W. Oard

At present, the review process for material that is exempt from disclosure under the Freedom of Information Act (FOIA) in the United States of America, and under many similar government transparency regimes worldwide, is entirely manual.

Text Classification

The Multilingual TEDx Corpus for Speech Recognition and Translation

no code implementations • 2 Feb 2021 • Elizabeth Salesky, Matthew Wiesner, Jacob Bremerman, Roldano Cattoni, Matteo Negri, Marco Turchi, Douglas W. Oard, Matt Post

We present the Multilingual TEDx corpus, built to support speech recognition (ASR) and speech translation (ST) research across many non-English source languages.

Speech Recognition +1

Cross-language Information Retrieval

no code implementations • 10 Nov 2021 • Petra Galuščáková, Douglas W. Oard, Suraj Nair

Two key assumptions shape the usual view of ranked retrieval: (1) that the searcher can choose words for their query that might appear in the documents that they wish to see, and (2) that ranking retrieved documents will suffice because the searcher will be able to recognize those which they wished to find.

Information Retrieval, Retrieval

Effects of context, complexity, and clustering on evaluation for math formula retrieval

no code implementations • 20 Nov 2021 • Behrooz Mansouri, Douglas W. Oard, Anurag Agarwal, Richard Zanibbi

There are now several test collections for the formula retrieval task, in which a system's goal is to identify useful mathematical formulae to show in response to a query posed as a formula.

Clustering, Math +1

Parameter-efficient Zero-shot Transfer for Cross-Language Dense Retrieval with Adapters

no code implementations • 20 Dec 2022 • Eugene Yang, Suraj Nair, Dawn Lawrie, James Mayfield, Douglas W. Oard

By adding adapters pretrained on language tasks for a specific language with task-specific adapters, prior work has shown that the adapter-enhanced models perform better than fine-tuning the entire model when transferring across languages in various NLP tasks.

Information Retrieval, Language Modelling +1

Overview of the TREC 2022 NeuCLIR Track

no code implementations • 24 Apr 2023 • Dawn Lawrie, Sean MacAvaney, James Mayfield, Paul McNamee, Douglas W. Oard, Luca Soldaini, Eugene Yang

This is the first year of the TREC Neural CLIR (NeuCLIR) track, which aims to study the impact of neural approaches to cross-language information retrieval.

Information Retrieval, Retrieval

Overview of the TREC 2023 NeuCLIR Track

no code implementations • 11 Apr 2024 • Dawn Lawrie, Sean MacAvaney, James Mayfield, Paul McNamee, Douglas W. Oard, Luca Soldaini, Eugene Yang

The principal tasks are ranked retrieval of news in one of the three languages, using English topics.

Information Retrieval, Retrieval
