Search Results for author: Sean MacAvaney

Found 49 papers, 24 papers with code

Community-level Research on Suicidality Prediction in a Secure Environment: Overview of the CLPsych 2021 Shared Task

no code implementations NAACL (CLPsych) 2021 Sean MacAvaney, Anjali Mittu, Glen Coppersmith, Jeff Leintz, Philip Resnik

Progress on NLP for mental health — indeed, for healthcare in general — is hampered by obstacles to shared, community-level access to relevant data.

TBD3: A Thresholding-Based Dynamic Depression Detection from Social Media for Low-Resource Users

1 code implementation LREC 2022 Hrishikesh Kulkarni, Sean MacAvaney, Nazli Goharian, Ophir Frieder

To complement this evaluation, we propose a dynamic thresholding technique that adjusts the classifier’s sensitivity as a function of the number of posts a user has.

Depression Detection

A Reproducibility Study of PLAID

no code implementations 23 Apr 2024 Sean MacAvaney, Nicola Tonellotto

The PLAID (Performance-optimized Late Interaction Driver) algorithm for ColBERTv2 uses clustered term representations to retrieve and progressively prune documents for final (exact) document scoring.

Re-Ranking Retrieval

Overview of the TREC 2023 NeuCLIR Track

no code implementations 11 Apr 2024 Dawn Lawrie, Sean MacAvaney, James Mayfield, Paul McNamee, Douglas W. Oard, Luca Soldaini, Eugene Yang

The principal tasks are ranked retrieval of news in one of the three languages, using English topics.

Information Retrieval Retrieval

Shallow Cross-Encoders for Low-Latency Retrieval

1 code implementation 29 Mar 2024 Aleksandr V. Petrov, Sean MacAvaney, Craig Macdonald

However, Cross-Encoders based on large transformer models (such as BERT or T5) are computationally expensive and allow for scoring only a small number of documents within a reasonably small latency window.

Passage Ranking Retrieval +1

FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions

2 code implementations 22 Mar 2024 Orion Weller, Benjamin Chang, Sean MacAvaney, Kyle Lo, Arman Cohan, Benjamin Van Durme, Dawn Lawrie, Luca Soldaini

We introduce our dataset FollowIR, which contains a rigorous instruction evaluation benchmark as well as a training set for helping IR models learn to better follow real-world instructions.

Information Retrieval Retrieval

Analyzing Adversarial Attacks on Sequence-to-Sequence Relevance Models

1 code implementation 12 Mar 2024 Andrew Parry, Maik Fröbe, Sean MacAvaney, Martin Potthast, Matthias Hagen

Modern sequence-to-sequence relevance models like monoT5 can effectively capture complex textual interactions between queries and documents through cross-encoding.

Retrieval

Evaluating the Explainability of Neural Rankers

no code implementations 4 Mar 2024 Saran Pandian, Debasis Ganguly, Sean MacAvaney

While the increasing complexity of search models has been able to demonstrate improvements in effectiveness (measured in terms of relevance of top-retrieved results), a question worthy of a thorough inspection is: "how explainable are these models?"

Information Retrieval Sentence

A Deep Learning Approach for Selective Relevance Feedback

no code implementations 20 Jan 2024 Suchana Datta, Debasis Ganguly, Sean MacAvaney, Derek Greene

Additionally, to further improve retrieval effectiveness with this selective PRF approach, we make use of the model's confidence estimates to combine the information from the original and expanded queries.

Retrieval

Generative Query Reformulation for Effective Adhoc Search

no code implementations 1 Aug 2023 Xiao Wang, Sean MacAvaney, Craig Macdonald, Iadh Ounis

GenQR directly reformulates the user's input query, while GenPRF provides additional context for the query by making use of pseudo-relevance feedback information.

Information Retrieval Retrieval

On the Effects of Regional Spelling Conventions in Retrieval Models

1 code implementation 1 Aug 2023 Andreas Chari, Sean MacAvaney, Iadh Ounis

One advantage of neural ranking models is that they are meant to generalise well in situations of synonymity, i.e., where two words have similar or identical meanings.

Retrieval

Lexically-Accelerated Dense Retrieval

no code implementations 31 Jul 2023 Hrishikesh Kulkarni, Sean MacAvaney, Nazli Goharian, Ophir Frieder

We introduce 'LADR' (Lexically-Accelerated Dense Retrieval), a simple-yet-effective approach that improves the efficiency of existing dense retrieval models without compromising on retrieval effectiveness.

Retrieval

Adaptive Latent Entity Expansion for Document Retrieval

no code implementations 29 Jun 2023 Iain Mackie, Shubham Chatterjee, Sean MacAvaney, Jeffrey Dalton

First, we demonstrate that applying a strong neural re-ranker before sparse or dense PRF can improve the retrieval effectiveness by 5-8%.

Re-Ranking Retrieval

Online Distillation for Pseudo-Relevance Feedback

no code implementations 16 Jun 2023 Sean MacAvaney, Xi Wang

Model distillation has emerged as a prominent technique to improve neural search models.

Re-Ranking Retrieval

The Information Retrieval Experiment Platform

1 code implementation 30 May 2023 Maik Fröbe, Jan Heinrich Reimer, Sean MacAvaney, Niklas Deckers, Simon Reich, Janek Bevendorff, Benno Stein, Matthias Hagen, Martin Potthast

Standardization is achieved when a retrieval approach implements PyTerrier's interfaces and the input and output of an experiment are compatible with ir_datasets and ir_measures.

Information Retrieval Retrieval

Adapting Learned Sparse Retrieval for Long Documents

1 code implementation 29 May 2023 Thong Nguyen, Sean MacAvaney, Andrew Yates

We investigate existing aggregation approaches for adapting LSR to longer documents and find that proximal scoring is crucial for LSR to handle long documents.

Language Modelling Masked Language Modeling +1

Overview of the TREC 2022 NeuCLIR Track

no code implementations 24 Apr 2023 Dawn Lawrie, Sean MacAvaney, James Mayfield, Paul McNamee, Douglas W. Oard, Luca Soldaini, Eugene Yang

This is the first year of the TREC Neural CLIR (NeuCLIR) track, which aims to study the impact of neural approaches to cross-language information retrieval.

Information Retrieval Retrieval

A Unified Framework for Learned Sparse Retrieval

1 code implementation 23 Mar 2023 Thong Nguyen, Sean MacAvaney, Andrew Yates

We then reproduce all prominent methods using a common codebase and re-train them in the same environment, which allows us to quantify how components of the framework affect effectiveness and efficiency.

Retrieval

One-Shot Labeling for Automatic Relevance Estimation

1 code implementation 22 Feb 2023 Sean MacAvaney, Luca Soldaini

We then explore various approaches for predicting the relevance of unjudged documents with respect to a query and the known relevant document, including nearest neighbor, supervised, and prompting techniques.

Retrieval

Doc2Query--: When Less is More

1 code implementation 9 Jan 2023 Mitko Gospodinov, Sean MacAvaney, Craig Macdonald

Doc2Query -- the process of expanding the content of a document before indexing using a sequence-to-sequence model -- has emerged as a prominent technique for improving the first-stage retrieval effectiveness of search engines.

Hallucination Retrieval

Adaptive Re-Ranking with a Corpus Graph

1 code implementation 18 Aug 2022 Sean MacAvaney, Nicola Tonellotto, Craig Macdonald

Search systems often employ a re-ranking pipeline, wherein documents (or passages) from an initial pool of candidates are assigned new ranking scores.

Passage Ranking Re-Ranking +1

CODEC: Complex Document and Entity Collection

2 code implementations 9 May 2022 Iain Mackie, Paul Owoicho, Carlos Gemmell, Sophie Fischer, Sean MacAvaney, Jeffrey Dalton

We also show that the manual query reformulations significantly improve document ranking and entity ranking performance.

Document Ranking Re-Ranking +1

On Survivorship Bias in MS MARCO

1 code implementation 27 Apr 2022 Prashansa Gupta, Sean MacAvaney

We observe that this bias could be present in the popular MS MARCO dataset, given that annotators could not find answers to 38-45% of the queries, leading to these queries being discarded in training and evaluation processes.

valid

Reproducing Personalised Session Search over the AOL Query Log

no code implementations 21 Jan 2022 Sean MacAvaney, Craig Macdonald, Iadh Ounis

Given that web documents are prone to change over time, we study the differences present between a version of the corpus containing documents as they appeared in 2017 (which has been used by several recent works) and a new version we construct that includes documents close to as they appeared at the time the query log was produced (2006).

Session Search

Streamlining Evaluation with ir-measures

no code implementations 26 Nov 2021 Sean MacAvaney, Craig Macdonald, Iadh Ounis

We present ir-measures, a new tool that makes it convenient to calculate a diverse set of evaluation measures used in information retrieval.

Information Retrieval Retrieval

Max-Utility Based Arm Selection Strategy For Sequential Query Recommendations

no code implementations 31 Aug 2021 Shameem A. Puthiya Parambath, Christos Anagnostopoulos, Roderick Murray-Smith, Sean MacAvaney, Evangelos Zervas

We show that such a selection strategy often results in higher cumulative regret and to this end, we propose a selection strategy based on the maximum utility of the arms.

Multi-Armed Bandits

IntenT5: Search Result Diversification using Causal Language Models

no code implementations 9 Aug 2021 Sean MacAvaney, Craig Macdonald, Roderick Murray-Smith, Iadh Ounis

Existing approaches often rely on massive query logs and interaction data to generate a variety of possible query intents, which then can be used to re-rank documents.

Causal Language Modeling Language Modelling +1

Goldilocks: Just-Right Tuning of BERT for Technology-Assisted Review

no code implementations 3 May 2021 Eugene Yang, Sean MacAvaney, David D. Lewis, Ophir Frieder

We indeed find that the pre-trained BERT model reduces review cost by 10% to 15% in TAR workflows simulated on the RCV1-v2 newswire collection.

Active Learning Language Modelling +4

ToxCCIn: Toxic Content Classification with Interpretability

no code implementations EACL (WASSA) 2021 Tong Xiang, Sean MacAvaney, Eugene Yang, Nazli Goharian

Despite the recent successes of transformer-based models in terms of effectiveness on a variety of tasks, their decisions often remain opaque to humans.

Classification General Classification

ABNIRML: Analyzing the Behavior of Neural IR Models

2 code implementations 2 Nov 2020 Sean MacAvaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan

Pretrained contextualized language models such as BERT and T5 have established a new state-of-the-art for ad-hoc search.

Language Modelling Sentence

SLEDGE-Z: A Zero-Shot Baseline for COVID-19 Literature Search

no code implementations EMNLP 2020 Sean MacAvaney, Arman Cohan, Nazli Goharian

With worldwide concerns surrounding the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), there is a rapidly growing body of scientific literature on the virus.

Re-Ranking

PARADE: Passage Representation Aggregation for Document Reranking

1 code implementation 20 Aug 2020 Canjia Li, Andrew Yates, Sean MacAvaney, Ben He, Yingfei Sun

In this work, we explore strategies for aggregating relevance signals from a document's passages into a final ranking score.

Document Ranking Knowledge Distillation

Interaction Matching for Long-Tail Multi-Label Classification

no code implementations 18 May 2020 Sean MacAvaney, Franck Dernoncourt, Walter Chang, Nazli Goharian, Ophir Frieder

We present an elegant and effective approach for addressing limitations in existing multi-label classification models by incorporating interaction matching, a concept shown to be useful for ad-hoc search result ranking.

Classification General Classification +1

SLEDGE: A Simple Yet Effective Baseline for COVID-19 Scientific Knowledge Search

1 code implementation 5 May 2020 Sean MacAvaney, Arman Cohan, Nazli Goharian

In this work, we present a search system called SLEDGE, which utilizes SciBERT to effectively re-rank articles.

Training Curricula for Open Domain Answer Re-Ranking

1 code implementation 29 Apr 2020 Sean MacAvaney, Franco Maria Nardini, Raffaele Perego, Nicola Tonellotto, Nazli Goharian, Ophir Frieder

We show that the proposed heuristics can be used to build a training curriculum that down-weights difficult samples early in the training process.

Re-Ranking

Expansion via Prediction of Importance with Contextualization

1 code implementation 29 Apr 2020 Sean MacAvaney, Franco Maria Nardini, Raffaele Perego, Nicola Tonellotto, Nazli Goharian, Ophir Frieder

We also observe that the performance is additive with the current leading first-stage retrieval methods, further narrowing the gap between inexpensive and cost-prohibitive passage ranking approaches.

Language Modelling Passage Ranking +2

Ranking Significant Discrepancies in Clinical Reports

no code implementations 18 Jan 2020 Sean MacAvaney, Arman Cohan, Nazli Goharian, Ross Filice

This allows medical practitioners to easily identify and learn from the reports in which their interpretation most substantially differed from that of the attending physician (who finalized the report).

Teaching a New Dog Old Tricks: Resurrecting Multilingual Retrieval Using Zero-shot Learning

1 code implementation 30 Dec 2019 Sean MacAvaney, Luca Soldaini, Nazli Goharian

While billions of non-English speaking users rely on search engines every day, the problem of ad-hoc information retrieval is rarely studied for non-English languages.

Ad-Hoc Information Retrieval Information Retrieval +2

Ontology-Aware Clinical Abstractive Summarization

no code implementations 14 May 2019 Sean MacAvaney, Sajad Sotudeh, Arman Cohan, Nazli Goharian, Ish Talati, Ross W. Filice

Automatically generating accurate summaries from clinical reports could save a clinician's time, improve summary coverage, and reduce errors.

Abstractive Text Summarization

A Deeper Look into Dependency-Based Word Embeddings

no code implementations NAACL 2018 Sean MacAvaney, Amir Zeldes

We investigate the effect of various dependency-based word embeddings on distinguishing between functional and domain similarity, word similarity rankings, and two downstream tasks in English.

Word Embeddings Word Similarity

Content-Based Weak Supervision for Ad-Hoc Re-Ranking

1 code implementation 1 Jul 2017 Sean MacAvaney, Andrew Yates, Kai Hui, Ophir Frieder

One challenge with neural ranking is the need for a large amount of manually-labeled relevance judgments for training.

Information Retrieval Re-Ranking
