Search Results for author: Guido Zuccon

Found 53 papers, 30 papers with code

Understanding and Mitigating the Threat of Vec2Text to Dense Retrieval Systems

1 code implementation • 20 Feb 2024 • Shengyao Zhuang, Bevan Koopman, Xiaoran Chu, Guido Zuccon

In this paper, we investigate various aspects of embedding models that could influence the recoverability of text using Vec2Text.

Quantization, Retrieval
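
For context, Vec2Text reconstructs text from a dense embedding by iteratively proposing a hypothesis, re-embedding it, and letting a trained corrector model move the hypothesis closer to the target embedding. The sketch below only illustrates that loop; `embed` and `correct` are hypothetical stand-ins (a toy embedder and an identity corrector), not the authors' models.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Toy stand-in for a dense retriever's text encoder."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(768)

def correct(hypothesis: str, target_emb: np.ndarray, hyp_emb: np.ndarray) -> str:
    """Hypothetical corrector: a real Vec2Text corrector is a seq2seq model that
    rewrites the hypothesis so it embeds closer to the target embedding."""
    return hypothesis  # placeholder: no actual correction is performed here

def invert_embedding(target_emb: np.ndarray, steps: int = 5) -> str:
    """Iterative refinement loop used by Vec2Text-style embedding inversion (sketch)."""
    hypothesis = ""  # start from an empty (or generated) hypothesis
    for _ in range(steps):
        hyp_emb = embed(hypothesis)
        cosine = np.dot(hyp_emb, target_emb) / (
            np.linalg.norm(hyp_emb) * np.linalg.norm(target_emb) + 1e-9)
        if cosine > 0.99:  # stop once the hypothesis embeds close to the target
            break
        hypothesis = correct(hypothesis, target_emb, hyp_emb)
    return hypothesis
```

In the actual attack, the corrector is trained on (embedding, text) pairs for the specific embedding model being inverted, which is why the paper studies which properties of the embedding model affect recoverability.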

FeB4RAG: Evaluating Federated Search in the Context of Retrieval Augmented Generation

no code implementations • 19 Feb 2024 • Shuai Wang, Ekaterina Khramtsova, Shengyao Zhuang, Guido Zuccon

Federated search systems aggregate results from multiple search engines, selecting appropriate sources to enhance result quality and align with user intent.

Benchmarking, Chatbot +3
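
As background, a federated search pipeline has three steps: resource selection (choose which engines to query), retrieval (forward the query to the selected engines), and result merging. The sketch below is a generic illustration of those steps with min-max score normalisation; the engine and selection functions are assumptions for illustration, not part of FeB4RAG.

```python
def normalise(results):
    """Min-max normalise one engine's result scores so engines are comparable."""
    scores = [s for _, s in results]
    lo, hi = min(scores), max(scores)
    return [(doc, (s - lo) / (hi - lo) if hi > lo else 0.0) for doc, s in results]

def federated_search(query, engines, select_fn, select_k=3, merge_k=10):
    """engines: dict mapping a resource name to a search function returning a
    ranked list of (doc_id, score); select_fn scores how promising a resource
    is for this query (the resource-selection step)."""
    # 1) Resource selection: keep the most promising engines.
    selected = sorted(engines, key=lambda name: select_fn(query, name), reverse=True)[:select_k]
    # 2) Retrieval: forward the query to each selected engine and normalise its scores.
    per_engine = [normalise(engines[name](query)) for name in selected]
    # 3) Result merging: keep each document's best normalised score, return the global top-k.
    merged = {}
    for results in per_engine:
        for doc, score in results:
            merged[doc] = max(merged.get(doc, 0.0), score)
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)[:merge_k]
```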

Large Language Models for Stemming: Promises, Pitfalls and Failures

no code implementations • 19 Feb 2024 • Shuai Wang, Shengyao Zhuang, Guido Zuccon

In this respect, we identify three avenues, each characterised by different trade-offs in terms of computational cost, effectiveness and robustness: (1) use LLMs to stem the vocabulary for a collection, i.e., the set of unique words that appear in the collection (vocabulary stemming), (2) use LLMs to stem each document separately (contextual stemming), and (3) use LLMs to extract from each document entities that should not be stemmed, then use vocabulary stemming to stem the rest of the terms (entity-based contextual stemming).
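
A rough sketch of the first avenue (vocabulary stemming) follows: extract the collection vocabulary once, ask an LLM to map batches of words to their stems, then stem documents by dictionary lookup. The prompt and the `call_llm` helper are illustrative assumptions, not the paper's exact setup.

```python
import json
import re

def call_llm(prompt: str) -> str:
    """Hypothetical LLM completion call (e.g. wrapping an API or a local model)."""
    raise NotImplementedError

def vocabulary_stemming(docs, batch_size=100):
    # Build the collection vocabulary: the set of unique words in the collection.
    vocab = sorted({w for d in docs for w in re.findall(r"[a-z]+", d.lower())})
    stem_map = {}
    for i in range(0, len(vocab), batch_size):
        batch = vocab[i:i + batch_size]
        prompt = ("Map each word to its stem. Answer as a JSON object "
                  "{word: stem}.\nWords: " + ", ".join(batch))
        stem_map.update(json.loads(call_llm(prompt)))
    # Stem each document by dictionary lookup (done once per collection, so the
    # number of LLM calls scales with vocabulary size, not collection size).
    return [" ".join(stem_map.get(w, w) for w in re.findall(r"[a-z]+", d.lower()))
            for d in docs]
```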

Leveraging LLMs for Unsupervised Dense Retriever Ranking

no code implementations • 7 Feb 2024 • Ekaterina Khramtsova, Shengyao Zhuang, Mahsa Baktashmotlagh, Guido Zuccon

Existing methodologies for ranking dense retrievers fall short in addressing these domain shift scenarios.

ReSLLM: Large Language Models are Strong Resource Selectors for Federated Search

no code implementations • 31 Jan 2024 • Shuai Wang, Shengyao Zhuang, Bevan Koopman, Guido Zuccon

Our ReSLLM method exploits LLMs to drive the selection of resources in federated search in a zero-shot setting.
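
A minimal sketch of prompting an LLM to pick resources zero-shot is shown below; the prompt wording and the `call_llm` helper are assumptions for illustration, not the ReSLLM implementation.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical LLM completion call."""
    raise NotImplementedError

def select_resources(query: str, resources: dict, k: int = 3) -> list:
    """Zero-shot resource selection: ask the LLM which search engines (resources)
    are most likely to hold relevant results, given a short description of each."""
    listing = "\n".join(f"- {name}: {desc}" for name, desc in resources.items())
    prompt = (f"Query: {query}\n"
              f"Available search resources:\n{listing}\n"
              f"List the {k} resource names most likely to contain relevant results, "
              f"one per line, most relevant first.")
    answer = call_llm(prompt)
    # Keep only names that actually exist, preserving the LLM's ordering.
    ranked = [line.strip("- ").strip() for line in answer.splitlines()]
    return [name for name in ranked if name in resources][:k]
```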

TPRF: A Transformer-based Pseudo-Relevance Feedback Model for Efficient and Effective Retrieval

no code implementations • 24 Jan 2024 • Chuting Yu, Hang Li, Ahmed Mourad, Bevan Koopman, Guido Zuccon

This paper considers Pseudo-Relevance Feedback (PRF) methods for dense retrievers in a resource-constrained environment, such as cheap cloud instances or embedded systems (e.g., smartphones and smartwatches), where memory and CPU are limited and GPUs are not present.

Retrieval
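
For context, the simplest dense-retriever PRF baseline in this line of work (vector-based PRF) mixes the query embedding with the average embedding of the top-ranked feedback passages; TPRF instead learns this combination with a small Transformer. The sketch below shows only the baseline averaging, assuming precomputed embeddings.

```python
import numpy as np

def vector_prf(query_emb: np.ndarray, feedback_embs: list, alpha: float = 0.5) -> np.ndarray:
    """Rocchio-style pseudo-relevance feedback in embedding space: mix the original
    query vector with the mean of the top-k feedback passage vectors."""
    feedback_mean = np.mean(feedback_embs, axis=0)
    return alpha * query_emb + (1.0 - alpha) * feedback_mean

# Usage: embed the query, retrieve top-k passages with a dense retriever,
# then re-run retrieval with the updated query vector.
q = np.random.rand(768)
top_k = [np.random.rand(768) for _ in range(3)]
q_new = vector_prf(q, top_k)
```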

How to Forget Clients in Federated Online Learning to Rank?

1 code implementation • 24 Jan 2024 • Shuyi Wang, Bing Liu, Guido Zuccon

In a Federated Online Learning to Rank (FOLTR) system, a ranker is learned by aggregating local updates to the global ranking model.

Learning-To-Rank
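
The aggregation mentioned above is typically federated averaging: each client trains the ranker locally on its own interactions and sends back a model update, which the server averages (weighted by client data size) into the global ranker. A minimal sketch under that assumption:

```python
import numpy as np

def federated_averaging(global_weights: np.ndarray, client_updates: list,
                        client_sizes: list) -> np.ndarray:
    """Aggregate local updates (deltas w.r.t. the global ranker) into the global
    model, weighting each client by the amount of local interaction data it holds."""
    total = sum(client_sizes)
    weighted = sum((n / total) * u for u, n in zip(client_updates, client_sizes))
    return global_weights + weighted

# Each round: broadcast global_weights, clients train locally on their clicks,
# send back (local_weights - global_weights), and the server aggregates.
```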

A Reproducibility Study of Goldilocks: Just-Right Tuning of BERT for TAR

1 code implementation • 16 Jan 2024 • Xinyu Mao, Bevan Koopman, Guido Zuccon

In this context, we show that there is no need for further pre-training if a domain-specific BERT backbone is used within the active learning pipeline.

Active Learning, TAR +2

Zero-shot Generative Large Language Models for Systematic Review Screening Automation

no code implementations • 12 Jan 2024 • Shuai Wang, Harrisen Scells, Shengyao Zhuang, Martin Potthast, Bevan Koopman, Guido Zuccon

Systematic reviews are crucial for evidence-based medicine as they comprehensively analyse published research findings on specific questions.

Open-source Large Language Models are Strong Zero-shot Query Likelihood Models for Document Ranking

1 code implementation • 20 Oct 2023 • Shengyao Zhuang, Bing Liu, Bevan Koopman, Guido Zuccon

In the field of information retrieval, Query Likelihood Models (QLMs) rank documents based on the probability of generating the query given the content of a document.

Document Ranking, Information Retrieval +3
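
The query likelihood score above can be computed with any open-source causal LLM by summing the log-probabilities of the query tokens conditioned on the document. The sketch below uses GPT-2 purely as a small stand-in, and the prompt wording is an illustrative assumption rather than the paper's exact prompt.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Small stand-in model; the paper uses much larger open-source LLMs.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def query_likelihood(document: str, query: str) -> float:
    """Sum of log P(query token | document prompt, previous query tokens)."""
    prompt = f"Passage: {document}\nPlease write a question based on this passage.\nQuestion:"
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    query_ids = tokenizer(" " + query, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, query_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids=input_ids).logits
    log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    score = 0.0
    for pos in range(prompt_ids.size(1), input_ids.size(1)):
        # The token at position `pos` is predicted by the logits at `pos - 1`.
        score += log_probs[0, pos - 1, input_ids[0, pos]].item()
    return score

docs = ["The Great Barrier Reef is the world's largest coral reef system.",
        "Python is a popular programming language."]
query = "where is the largest coral reef"
print(sorted(docs, key=lambda d: query_likelihood(d, query), reverse=True)[0])
```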

A Setwise Approach for Effective and Highly Efficient Zero-shot Ranking with Large Language Models

1 code implementation • 14 Oct 2023 • Shengyao Zhuang, Honglei Zhuang, Bevan Koopman, Guido Zuccon

Our approach reduces the number of LLM inferences and the amount of prompt token consumption during the ranking procedure, significantly improving the efficiency of LLM-based zero-shot ranking.

Document Ranking
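
In a setwise prompt the LLM sees several candidate passages at once and names the most relevant one, so a top-k ordering needs far fewer inference calls than pairwise comparisons. The sketch below is a simplified selection-style variant rather than the sorting procedures used in the paper; `pick_best` stands in for the LLM call and its prompt.

```python
def pick_best(query: str, candidates: list) -> int:
    """Hypothetical setwise LLM call: returns the index of the passage judged
    most relevant to the query among the given candidates."""
    raise NotImplementedError

def setwise_top_k(query: str, docs: list, k: int = 10, set_size: int = 4) -> list:
    """Selection-style top-k: each round compares `set_size` passages per LLM call
    instead of 2, reducing the number of inferences and prompt tokens."""
    remaining = list(docs)
    ranked = []
    while remaining and len(ranked) < k:
        best, idx = 0, 1
        while idx < len(remaining):
            # Compare the current best against the next (set_size - 1) candidates.
            group_idx = [best] + list(range(idx, min(idx + set_size - 1, len(remaining))))
            group = [remaining[i] for i in group_idx]
            best = group_idx[pick_best(query, group)]
            idx += set_size - 1
        ranked.append(remaining.pop(best))
    return ranked
```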

Selecting which Dense Retriever to use for Zero-Shot Search

no code implementations • 18 Sep 2023 • Ekaterina Khramtsova, Shengyao Zhuang, Mahsa Baktashmotlagh, Xi Wang, Guido Zuccon

We propose the new problem of choosing which dense retrieval model to use when searching on a new collection for which no labels are available, i.e., in a zero-shot setting.

Information Retrieval, Retrieval

ChatGPT Hallucinates when Attributing Answers

no code implementations • 17 Sep 2023 • Guido Zuccon, Bevan Koopman, Razia Shaik

We find that ChatGPT provides correct or partially correct answers in about half of the cases (50.6% of the time), but its suggested references only exist 14% of the time.

Attribute

Annotating Data for Fine-Tuning a Neural Ranker? Current Active Learning Strategies are not Better than Random Selection

no code implementations • 12 Sep 2023 • Sophia Althammer, Guido Zuccon, Sebastian Hofstätter, Suzan Verberne, Allan Hanbury

We further find that the gains provided by AL strategies come at the expense of more assessments (thus higher annotation costs), and that AL strategies underperform random selection when comparing effectiveness at a fixed annotation cost.

Active Learning, Domain Adaptation

Generating Natural Language Queries for More Effective Systematic Review Screening Prioritisation

1 code implementation • 11 Sep 2023 • Shuai Wang, Harrisen Scells, Martin Potthast, Bevan Koopman, Guido Zuccon

Our best approach is not only viable based on the information available at the time of screening, but also has similar effectiveness to the final title.

Natural Language Queries

An Analysis of Untargeted Poisoning Attack and Defense Methods for Federated Online Learning to Rank Systems

no code implementations • 4 Jul 2023 • Shuyi Wang, Guido Zuccon

For this, FOLTR trains learning to rank models in an online manner -- i.e., by exploiting users' interactions with the search system (queries, clicks) rather than labels -- and federatively -- i.e., by not aggregating interaction data in a central server for training purposes, but by training an instance of the model on each user device on its own private data, and then sharing the model updates, not the data, across the set of users that form the federation.

Federated Learning, Learning-To-Rank +1

Outcome-based Evaluation of Systematic Review Automation

no code implementations • 30 Jun 2023 • Wojciech Kusa, Guido Zuccon, Petr Knoth, Allan Hanbury

We find that accounting for the difference in review outcomes leads to a different assessment of the quality of a system than if traditional evaluation measures were used.

TAR

Exploring the Representation Power of SPLADE Models

1 code implementation • 29 Jun 2023 • Joel Mackenzie, Shengyao Zhuang, Guido Zuccon

The SPLADE (SParse Lexical AnD Expansion) model is a highly effective approach to learned sparse retrieval, where documents are represented by term impact scores derived from large language models.

Retrieval
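
Concretely, the term impact scores come from a masked-language-model head: SPLADE applies log(1 + ReLU(logit)) to the MLM logits and max-pools over token positions, yielding a sparse vector over the vocabulary. The sketch below illustrates that transformation with a plain bert-base-uncased checkpoint (no SPLADE fine-tuning), so the resulting weights are only illustrative.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Plain BERT MLM head as a stand-in; a real SPLADE checkpoint is fine-tuned for retrieval.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def splade_vector(text: str) -> dict:
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)
    # SPLADE activation: log(1 + ReLU(logit)), max-pooled over the token positions.
    weights = torch.log1p(torch.relu(logits)).max(dim=1).values.squeeze(0)
    nonzero = weights.nonzero().squeeze(1)
    return {tokenizer.convert_ids_to_tokens(i.item()): weights[i].item() for i in nonzero}

vec = splade_vector("SPLADE represents documents by term impact scores.")
print(sorted(vec.items(), key=lambda kv: kv[1], reverse=True)[:10])
```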

Beyond CO2 Emissions: The Overlooked Impact of Water Consumption of Information Retrieval Models

1 code implementation • 29 Jun 2023 • Guido Zuccon, Harrisen Scells, Shengyao Zhuang

As in other fields of artificial intelligence, the information retrieval community has grown interested in investigating the power consumption associated with neural models, particularly models of search.

Information Retrieval, Retrieval

Augmenting Passage Representations with Query Generation for Enhanced Cross-Lingual Dense Retrieval

1 code implementation • 6 May 2023 • Shengyao Zhuang, Linjun Shou, Guido Zuccon

Effective cross-lingual dense retrieval methods that rely on multilingual pre-trained language models (PLMs) need to be trained to encompass both the relevance matching task and the cross-language alignment task.

Cross-Lingual Information Retrieval, Retrieval

Typos-aware Bottlenecked Pre-Training for Robust Dense Retrieval

1 code implementation • 17 Apr 2023 • Shengyao Zhuang, Linjun Shou, Jian Pei, Ming Gong, Houxing Ren, Guido Zuccon, Daxin Jiang

To address this challenge, we propose ToRoDer (TypOs-aware bottlenecked pre-training for RObust DEnse Retrieval), a novel re-training strategy for DRs that increases their robustness to misspelled queries while preserving their effectiveness in downstream retrieval tasks.

Language Modelling, Retrieval

Dr ChatGPT, tell me what I want to hear: How prompt knowledge impacts health answer correctness

no code implementations • 23 Feb 2023 • Guido Zuccon, Bevan Koopman

Aside from measuring the effectiveness of ChatGPT in this context, we show that the knowledge passed in the prompt can overturn the knowledge encoded in the model, and that this is, in our experiments, to the detriment of answer correctness.

Question Answering

Can ChatGPT Write a Good Boolean Query for Systematic Review Literature Search?

no code implementations • 3 Feb 2023 • Shuai Wang, Harrisen Scells, Bevan Koopman, Guido Zuccon

The ability of ChatGPT to follow complex instructions and generate queries with high precision makes it a valuable tool for researchers conducting systematic reviews, particularly for rapid reviews, where time is a constraint and trading off higher precision for lower recall is often acceptable.

AgAsk: An Agent to Help Answer Farmer's Questions From Scientific Documents

1 code implementation • 21 Dec 2022 • Bevan Koopman, Ahmed Mourad, Hang Li, Anton van der Vegt, Shengyao Zhuang, Simon Gibson, Yash Dang, David Lawrence, Guido Zuccon

On the basis of these needs we release an information retrieval test collection comprising real questions, a large collection of scientific documents split into passages, and ground truth relevance assessments indicating which passages are relevant to each question.

Information Retrieval, Retrieval

MeSH Suggester: A Library and System for MeSH Term Suggestion for Systematic Review Boolean Query Construction

1 code implementation • 18 Dec 2022 • Shuai Wang, Hang Li, Guido Zuccon

One challenge to creating an effective systematic review Boolean query is the selection of effective MeSH Terms to include in the query.

Guiding Neural Entity Alignment with Compatibility

1 code implementation • 29 Nov 2022 • Bing Liu, Harrisen Scells, Wen Hua, Guido Zuccon, Genghong Zhao, Xia Zhang

Making compatible predictions should thus be one of the goals of training an EA model, along with fitting the labelled data; this aspect, however, is neglected in current methods.

Entity Alignment, Knowledge Graphs

Dependency-aware Self-training for Entity Alignment

1 code implementation • 29 Nov 2022 • Bing Liu, Tiancheng Lan, Wen Hua, Guido Zuccon

Entity Alignment (EA), which aims to detect entity mappings (i.e., equivalent entity pairs) in different Knowledge Graphs (KGs), is critical for KG fusion.

Entity Alignment, Knowledge Graphs

Automated MeSH Term Suggestion for Effective Query Formulation in Systematic Reviews Literature Search

1 code implementation • 19 Sep 2022 • Shuai Wang, Harrisen Scells, Bevan Koopman, Guido Zuccon

However, identifying the correct MeSH terms to include in a query is difficult: information experts are often unfamiliar with the MeSH database and unsure about the appropriateness of MeSH terms for a query.

High-quality Task Division for Large-scale Entity Alignment

1 code implementation • 22 Aug 2022 • Bing Liu, Wen Hua, Guido Zuccon, Genghong Zhao, Xia Zhang

To include in the EA subtasks a high proportion of the potential mappings originally present in the large EA task, we devise a counterpart discovery method that exploits the locality principle of the EA task and the power of trained EA models.

Entity Alignment, Informativeness +1

Rethinking Persistent Homology for Visual Recognition

no code implementations • 9 Jul 2022 • Ekaterina Khramtsova, Guido Zuccon, Xi Wang, Mahsa Baktashmotlagh

This paper performs a detailed analysis of the effectiveness of topological properties for image classification in various training scenarios, defined by: the number of training samples, the complexity of the training data and the complexity of the backbone network.

Image Classification

Bridging the Gap Between Indexing and Retrieval for Differentiable Search Index with Query Generation

1 code implementation • 21 Jun 2022 • Shengyao Zhuang, Houxing Ren, Linjun Shou, Jian Pei, Ming Gong, Guido Zuccon, Daxin Jiang

This problem is further exacerbated when using DSI for cross-lingual retrieval, where document text and query text are in different languages.

Passage Retrieval, Retrieval

How does Feedback Signal Quality Impact Effectiveness of Pseudo Relevance Feedback for Passage Retrieval?

no code implementations • 12 May 2022 • Hang Li, Ahmed Mourad, Bevan Koopman, Guido Zuccon

Pseudo-Relevance Feedback (PRF) assumes that the top results retrieved by a first-stage ranker are relevant to the original query and uses them to improve the query representation for a second round of retrieval.

Passage Retrieval, Retrieval

To Interpolate or not to Interpolate: PRF, Dense and Sparse Retrievers

no code implementations • 30 Apr 2022 • Hang Li, Shuai Wang, Shengyao Zhuang, Ahmed Mourad, Xueguang Ma, Jimmy Lin, Guido Zuccon

In this paper we consider the problem of combining the relevance signals from sparse and dense retrievers in the context of Pseudo Relevance Feedback (PRF).

Information Retrieval, Language Modelling +1
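
A common way to combine the two signals is linear interpolation of normalised scores, score(d) = alpha * sparse(d) + (1 - alpha) * dense(d). The sketch below shows that baseline fusion; the exact normalisation and how the PRF signal is folded in are choices the paper studies, not fixed here.

```python
def minmax(scores: dict) -> dict:
    """Min-max normalise a run's scores so sparse and dense scores are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    return {d: (s - lo) / (hi - lo) if hi > lo else 0.0 for d, s in scores.items()}

def interpolate(sparse: dict, dense: dict, alpha: float = 0.5, k: int = 10):
    """Fuse BM25-style (sparse) and dense retriever scores for the same query."""
    sparse_n, dense_n = minmax(sparse), minmax(dense)
    docs = set(sparse_n) | set(dense_n)
    fused = {d: alpha * sparse_n.get(d, 0.0) + (1 - alpha) * dense_n.get(d, 0.0)
             for d in docs}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)[:k]
```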

Is Non-IID Data a Threat in Federated Online Learning to Rank?

1 code implementation • 20 Apr 2022 • Shuyi Wang, Guido Zuccon

A well-known factor that affects the performance of federated learning systems, and that poses serious challenges to these approaches, is that there may be some type of bias in the way data is distributed across clients.

Federated Learning, Information Retrieval +2

CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos

1 code implementation • 1 Apr 2022 • Shengyao Zhuang, Guido Zuccon

We then demonstrate that the root cause of this resides in the input tokenization strategy employed by BERT.

Passage Retrieval, Retrieval

Implicit Feedback for Dense Passage Retrieval: A Counterfactual Approach

1 code implementation • 1 Apr 2022 • Shengyao Zhuang, Hang Li, Guido Zuccon

We then exploit such historic implicit interactions to improve the effectiveness of a DR. A key challenge that we study is the effect that biases in the click signal, such as position bias, have on the DRs.

Counterfactual, Passage Retrieval +2
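
A standard counterfactual correction for position bias is inverse propensity scoring: each click is re-weighted by the inverse of the probability that the user examined that rank, so clicks at low ranks are not unfairly discounted. The sketch below shows that re-weighting in isolation; the propensity model and how the weights enter DR training are assumptions, not the paper's exact recipe.

```python
def propensity(rank: int, eta: float = 1.0) -> float:
    """Simple position-bias model: probability the user examines a result at `rank`."""
    return (1.0 / rank) ** eta

def ips_weighted_clicks(clicked_ranks: list) -> list:
    """Inverse propensity scoring: weight each click by 1 / P(examined at its rank),
    yielding an (in expectation) unbiased relevance signal for training."""
    return [1.0 / propensity(r) for r in clicked_ranks]

# e.g. clicks observed at ranks 1, 3 and 8 contribute weights 1.0, 3.0 and 8.0
print(ips_weighted_clicks([1, 3, 8]))
```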

Asyncval: A Toolkit for Asynchronously Validating Dense Retriever Checkpoints during Training

1 code implementation • 25 Feb 2022 • Shengyao Zhuang, Guido Zuccon

A simple and efficient strategy to validate deep learning checkpoints is the addition of validation loops to execute during training.

Natural Questions, Passage Retrieval +1

Reinforcement Online Learning to Rank with Unbiased Reward Shaping

1 code implementation • 5 Jan 2022 • Shengyao Zhuang, Zhihao Qiao, Guido Zuccon

Online learning to rank (OLTR) aims to learn a ranker directly from implicit feedback derived from users' interactions, such as clicks.

Learning-To-Rank, Position

Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback: A Reproducibility Study

1 code implementation • 13 Dec 2021 • Hang Li, Shengyao Zhuang, Ahmed Mourad, Xueguang Ma, Jimmy Lin, Guido Zuccon

Finally, we contribute a study of the generalisability of the ANCE-PRF method when dense retrievers other than ANCE are used for the first round of retrieval and for encoding the PRF signal.

Retrieval

Seed-driven Document Ranking for Systematic Reviews: A Reproducibility Study

1 code implementation • 8 Dec 2021 • Shuai Wang, Harrisen Scells, Ahmed Mourad, Guido Zuccon

Our results also indicate that our reproduced screening prioritisation method (1) is generalisable across datasets of similar and different topicality compared to the original implementation, (2) increases in effectiveness when multiple seed studies are used, thanks to our techniques that enable this, and (3) produces more stable rankings with multiple seed studies than with a single seed study.

Document Ranking

ActiveEA: Active Learning for Neural Entity Alignment

1 code implementation • EMNLP 2021 • Bing Liu, Harrisen Scells, Guido Zuccon, Wen Hua, Genghong Zhao

Entity Alignment (EA) aims to match equivalent entities across different Knowledge Graphs (KGs) and is an essential step of KG fusion.

Active Learning, Entity Alignment +1

Dealing with Typos for BERT-based Passage Retrieval and Ranking

2 code implementations • EMNLP 2021 • Shengyao Zhuang, Guido Zuccon

Our experimental results on the MS MARCO passage ranking dataset show that, with our proposed typos-aware training, DR and BERT re-ranker can become robust to typos in queries, resulting in significantly improved effectiveness compared to models trained without appropriately accounting for typos.

Language Modelling, Open-Domain Question Answering +5

Pseudo Relevance Feedback with Deep Language Models and Dense Retrievers: Successes and Pitfalls

1 code implementation • 25 Aug 2021 • Hang Li, Ahmed Mourad, Shengyao Zhuang, Bevan Koopman, Guido Zuccon

Text-based PRF results show that the use of PRF had a mixed effect on deep rerankers across different datasets.

Retrieval

Fast Passage Re-ranking with Contextualized Exact Term Matching and Efficient Passage Expansion

1 code implementation • 19 Aug 2021 • Shengyao Zhuang, Guido Zuccon

BERT-based information retrieval models are expensive, in both time (query latency) and computational resources (energy, hardware cost), making many of these models impractical especially under resource constraints.

Information Retrieval, Passage Re-Ranking +2

The Benefits of Word Embeddings Features for Active Learning in Clinical Information Extraction

no code implementations • ALTA 2016 • Mahnoosh Kholghi, Lance De Vine, Laurianne Sitbon, Guido Zuccon, Anthony Nguyen

This study investigates the use of unsupervised word embeddings and sequence features for sample representation in an active learning framework built to extract clinical concepts from clinical free text.

Active Learning, Informativeness +1

Building Evaluation Datasets for Consumer-Oriented Information Retrieval

no code implementations • LREC 2016 • Lorraine Goeuriot, Liadh Kelly, Guido Zuccon, Joao Palotti

In this paper we present the datasets created by the CLEF eHealth Lab from 2013 to 2015 for the evaluation of search solutions that support laypeople in finding health information online.

Information Retrieval, Retrieval
