Search Results for author: Eugene Yang

Found 22 papers, 8 papers with code

HLTCOE at TREC 2023 NeuCLIR Track

no code implementations • 11 Apr 2024 • Eugene Yang, Dawn Lawrie, James Mayfield

Translate-train (TT) trains a ColBERT model on English queries paired with MS-MARCO v1 passages that have been automatically translated into the document language.

Document Translation

Extending Translate-Train for ColBERT-X to African Language CLIR

no code implementations • 11 Apr 2024 • Eugene Yang, Dawn J. Lawrie, Paul McNamee, James Mayfield

This paper describes the submission runs from the HLTCOE team at the CIRAL CLIR tasks for African languages at FIRE 2023.

Machine Translation, Retrieval, +1

Overview of the TREC 2023 NeuCLIR Track

no code implementations • 11 Apr 2024 • Dawn Lawrie, Sean MacAvaney, James Mayfield, Paul McNamee, Douglas W. Oard, Luca Soldaini, Eugene Yang

The principal tasks are ranked retrieval of news in one of the three languages, using English topics.

Information Retrieval, Retrieval

Translate-Distill: Learning Cross-Language Dense Retrieval by Translation and Distillation

1 code implementation • 9 Jan 2024 • Eugene Yang, Dawn Lawrie, James Mayfield, Douglas W. Oard, Scott Miller

Applying a similar knowledge distillation approach to training an efficient dual-encoder model for Cross-Language Information Retrieval (CLIR), where queries and documents are in different languages, is challenging due to the lack of a sufficiently large training collection when the query and document languages differ.

Information Retrieval, Knowledge Distillation, +2

Synthetic Cross-language Information Retrieval Training Data

no code implementations • 29 Apr 2023 • James Mayfield, Eugene Yang, Dawn Lawrie, Samuel Barham, Orion Weller, Marc Mason, Suraj Nair, Scott Miller

By repeating this process, collections of arbitrary size can be created in the style of MS MARCO but using naturally-occurring documents in any desired genre and domain of discourse.

Information Retrieval, Language Modelling, +4

Overview of the TREC 2022 NeuCLIR Track

no code implementations • 24 Apr 2023 • Dawn Lawrie, Sean MacAvaney, James Mayfield, Paul McNamee, Douglas W. Oard, Luca Soldaini, Eugene Yang

This is the first year of the TREC Neural CLIR (NeuCLIR) track, which aims to study the impact of neural approaches to cross-language information retrieval.

Information Retrieval, Retrieval

Parameter-efficient Zero-shot Transfer for Cross-Language Dense Retrieval with Adapters

no code implementations • 20 Dec 2022 • Eugene Yang, Suraj Nair, Dawn Lawrie, James Mayfield, Douglas W. Oard

By adding adapters pretrained on language tasks for a specific language with task-specific adapters, prior work has shown that the adapter-enhanced models perform better than fine-tuning the entire model when transferring across languages in various NLP tasks.

Information Retrieval, Language Modelling, +1

Neural Approaches to Multilingual Information Retrieval

1 code implementation • 3 Sep 2022 • Dawn Lawrie, Eugene Yang, Douglas W. Oard, James Mayfield

Providing access to information across languages has been a goal of Information Retrieval (IR) for decades.

Document Translation, Information Retrieval, +3

TARexp: A Python Framework for Technology-Assisted Review Experiments

1 code implementation • 23 Feb 2022 • Eugene Yang, David D. Lewis

Technology-assisted review (TAR) is an important industrial application of information retrieval (IR) and machine learning (ML).

Retrieval, TAR

HC4: A New Suite of Test Collections for Ad Hoc CLIR

1 code implementation • 24 Jan 2022 • Dawn Lawrie, James Mayfield, Douglas Oard, Eugene Yang

HC4 is a new suite of test collections for ad hoc Cross-Language Information Retrieval (CLIR), with Common Crawl News documents in Chinese, Persian, and Russian, topics in English and in the document languages, and graded relevance judgments.

Active Learning, Information Retrieval, +1

Patapasco: A Python Framework for Cross-Language Information Retrieval Experiments

1 code implementation • 24 Jan 2022 • Cash Costello, Eugene Yang, Dawn Lawrie, James Mayfield

While there are high-quality software frameworks for information retrieval experimentation, they do not explicitly support cross-language information retrieval (CLIR).

Information Retrieval, Retrieval

Certifying One-Phase Technology-Assisted Reviews

no code implementations • 29 Aug 2021 • David D. Lewis, Eugene Yang, Ophir Frieder

Technology-assisted review (TAR) workflows based on iterative active learning are widely used in document review applications.

Active Learning, TAR, +1

TAR on Social Media: A Framework for Online Content Moderation

1 code implementation • 29 Aug 2021 • Eugene Yang, David D. Lewis, Ophir Frieder

Content moderation (removing or limiting the distribution of posts based on their contents) is one tool social networks use to fight problems such as harassment and disinformation.

Active Learning, Retrieval, +1

Heuristic Stopping Rules For Technology-Assisted Review

no code implementations • 18 Jun 2021 • Eugene Yang, David D. Lewis, Ophir Frieder

Technology-assisted review (TAR) refers to human-in-the-loop active learning workflows for finding relevant documents in large collections.

Active Learning, TAR

On Minimizing Cost in Legal Document Review Workflows

1 code implementation • 18 Jun 2021 • Eugene Yang, David D. Lewis, Ophir Frieder

Technology-assisted review (TAR) refers to human-in-the-loop machine learning workflows for document review in legal discovery and other high-recall review tasks.

Active Learning, TAR

Goldilocks: Just-Right Tuning of BERT for Technology-Assisted Review

no code implementations • 3 May 2021 • Eugene Yang, Sean MacAvaney, David D. Lewis, Ophir Frieder

We indeed find that the pre-trained BERT model reduces review cost by 10% to 15% in TAR workflows simulated on the RCV1-v2 newswire collection.

Active Learning, Language Modelling, +4

ToxCCIn: Toxic Content Classification with Interpretability

no code implementations • EACL (WASSA) 2021 • Tong Xiang, Sean MacAvaney, Eugene Yang, Nazli Goharian

Despite the recent successes of transformer-based models in terms of effectiveness on a variety of tasks, their decisions often remain opaque to humans.

Classification, General Classification
