Search Results for author: Guido Zuccon

Found 40 papers, 23 papers with code

Selecting which Dense Retriever to use for Zero-Shot Search

no code implementations • 18 Sep 2023 • Ekaterina Khramtsova, Shengyao Zhuang, Mahsa Baktashmotlagh, Xi Wang, Guido Zuccon

We propose the new problem of choosing which dense retrieval model to use when searching on a new collection for which no labels are available, i.e. in a zero-shot setting.

Information Retrieval Retrieval

ChatGPT Hallucinates when Attributing Answers

no code implementations • 17 Sep 2023 • Guido Zuccon, Bevan Koopman, Razia Shaik

We find that ChatGPT provides correct or partially correct answers in about half of the cases (50.6% of the times), but its suggested references only exist 14% of the times.

Annotating Data for Fine-Tuning a Neural Ranker? Current Active Learning Strategies are not Better than Random Selection

no code implementations • 12 Sep 2023 • Sophia Althammer, Guido Zuccon, Sebastian Hofstätter, Suzan Verberne, Allan Hanbury

We further find that gains provided by AL strategies come at the expense of more assessments (thus higher annotation costs) and AL strategies underperform random selection when comparing effectiveness given a fixed annotation cost.

Active Learning Domain Adaptation

Generating Natural Language Queries for More Effective Systematic Review Screening Prioritisation

1 code implementation • 11 Sep 2023 • Shuai Wang, Harrisen Scells, Martin Potthast, Bevan Koopman, Guido Zuccon

Screening prioritisation in medical systematic reviews aims to rank the set of documents retrieved by complex Boolean queries.

Natural Language Queries

An Analysis of Untargeted Poisoning Attack and Defense Methods for Federated Online Learning to Rank Systems

no code implementations • 4 Jul 2023 • Shuyi Wang, Guido Zuccon

For this, FOLTR trains learning to rank models in an online manner -- i.e. by exploiting users' interactions with the search systems (queries, clicks), rather than labels -- and federatively -- i.e. by not aggregating interaction data in a central server for training purposes, but by training instances of a model on each user device on their own private data, and then sharing the model updates, not the data, across a set of users that have formed the federation.

Federated Learning Learning-To-Rank +1

Outcome-based Evaluation of Systematic Review Automation

no code implementations • 30 Jun 2023 • Wojciech Kusa, Guido Zuccon, Petr Knoth, Allan Hanbury

We find that accounting for the difference in review outcomes leads to a different assessment of the quality of a system than if traditional evaluation measures were used.

Beyond CO2 Emissions: The Overlooked Impact of Water Consumption of Information Retrieval Models

1 code implementation • 29 Jun 2023 • Guido Zuccon, Harrisen Scells, Shengyao Zhuang

As in other fields of artificial intelligence, the information retrieval community has grown interested in investigating the power consumption associated with neural models, particularly models of search.

Information Retrieval Retrieval

Exploring the Representation Power of SPLADE Models

1 code implementation • 29 Jun 2023 • Joel Mackenzie, Shengyao Zhuang, Guido Zuccon

The SPLADE (SParse Lexical AnD Expansion) model is a highly effective approach to learned sparse retrieval, where documents are represented by term impact scores derived from large language models.


Augmenting Passage Representations with Query Generation for Enhanced Cross-Lingual Dense Retrieval

1 code implementation • 6 May 2023 • Shengyao Zhuang, Linjun Shou, Guido Zuccon

Effective cross-lingual dense retrieval methods that rely on multilingual pre-trained language models (PLMs) need to be trained to encompass both the relevance matching task and the cross-language alignment task.

Cross-Lingual Information Retrieval Retrieval

Typos-aware Bottlenecked Pre-Training for Robust Dense Retrieval

no code implementations • 17 Apr 2023 • Shengyao Zhuang, Linjun Shou, Jian Pei, Ming Gong, Houxing Ren, Guido Zuccon, Daxin Jiang

To address this challenge, we propose ToRoDer (TypOs-aware bottlenecked pre-training for RObust DEnse Retrieval), a novel pre-training strategy for DRs that increases their robustness to misspelled queries while preserving their effectiveness in downstream retrieval tasks.

Language Modelling Retrieval

Dr ChatGPT, tell me what I want to hear: How prompt knowledge impacts health answer correctness

no code implementations • 23 Feb 2023 • Guido Zuccon, Bevan Koopman

Aside from measuring the effectiveness of ChatGPT in this context, we show that the knowledge passed in the prompt can overturn the knowledge encoded in the model and this is, in our experiments, to the detriment of answer correctness.

Question Answering

Can ChatGPT Write a Good Boolean Query for Systematic Review Literature Search?

no code implementations • 3 Feb 2023 • Shuai Wang, Harrisen Scells, Bevan Koopman, Guido Zuccon

The ability of ChatGPT to follow complex instructions and generate queries with high precision makes it a valuable tool for researchers conducting systematic reviews, particularly for rapid reviews where time is a constraint and often trading-off higher precision for lower recall is acceptable.

AgAsk: An Agent to Help Answer Farmer's Questions From Scientific Documents

1 code implementation • 21 Dec 2022 • Bevan Koopman, Ahmed Mourad, Hang Li, Anton van der Vegt, Shengyao Zhuang, Simon Gibson, Yash Dang, David Lawrence, Guido Zuccon

On the basis of these needs we release an information retrieval test collection comprising real questions, a large collection of scientific documents split in passages, and ground truth relevance assessments indicating which passages are relevant to each question.

Information Retrieval Retrieval

MeSH Suggester: A Library and System for MeSH Term Suggestion for Systematic Review Boolean Query Construction

1 code implementation • 18 Dec 2022 • Shuai Wang, Hang Li, Guido Zuccon

One challenge to creating an effective systematic review Boolean query is the selection of effective MeSH Terms to include in the query.

Dependency-aware Self-training for Entity Alignment

1 code implementation • 29 Nov 2022 • Bing Liu, Tiancheng Lan, Wen Hua, Guido Zuccon

Entity Alignment (EA), which aims to detect entity mappings (i.e. equivalent entity pairs) in different Knowledge Graphs (KGs), is critical for KG fusion.

Entity Alignment Knowledge Graphs

Guiding Neural Entity Alignment with Compatibility

1 code implementation • 29 Nov 2022 • Bing Liu, Harrisen Scells, Wen Hua, Guido Zuccon, Genghong Zhao, Xia Zhang

Making compatible predictions thus should be one of the goals of training an EA model along with fitting the labelled data: this aspect however is neglected in current methods.

Entity Alignment Knowledge Graphs

Automated MeSH Term Suggestion for Effective Query Formulation in Systematic Reviews Literature Search

1 code implementation • 19 Sep 2022 • Shuai Wang, Harrisen Scells, Bevan Koopman, Guido Zuccon

However, identifying the correct MeSH terms to include in a query is difficult: information experts are often unfamiliar with the MeSH database and unsure about the appropriateness of MeSH terms for a query.

High-quality Task Division for Large-scale Entity Alignment

1 code implementation • 22 Aug 2022 • Bing Liu, Wen Hua, Guido Zuccon, Genghong Zhao, Xia Zhang

To include in the EA subtasks a high proportion of the potential mappings originally present in the large EA task, we devise a counterpart discovery method that exploits the locality principle of the EA task and the power of trained EA models.

Entity Alignment Informativeness +1

Rethinking Persistent Homology for Visual Recognition

no code implementations • 9 Jul 2022 • Ekaterina Khramtsova, Guido Zuccon, Xi Wang, Mahsa Baktashmotlagh

This paper performs a detailed analysis of the effectiveness of topological properties for image classification in various training scenarios, defined by: the number of training samples, the complexity of the training data and the complexity of the backbone network.

Image Classification

Bridging the Gap Between Indexing and Retrieval for Differentiable Search Index with Query Generation

1 code implementation • 21 Jun 2022 • Shengyao Zhuang, Houxing Ren, Linjun Shou, Jian Pei, Ming Gong, Guido Zuccon, Daxin Jiang

This problem is further exacerbated when using DSI for cross-lingual retrieval, where document text and query text are in different languages.

Passage Retrieval Retrieval

How does Feedback Signal Quality Impact Effectiveness of Pseudo Relevance Feedback for Passage Retrieval?

no code implementations • 12 May 2022 • Hang Li, Ahmed Mourad, Bevan Koopman, Guido Zuccon

Pseudo-Relevance Feedback (PRF) assumes that the top results retrieved by a first-stage ranker are relevant to the original query and uses them to improve the query representation for a second round of retrieval.

Passage Retrieval Retrieval

To Interpolate or not to Interpolate: PRF, Dense and Sparse Retrievers

no code implementations • 30 Apr 2022 • Hang Li, Shuai Wang, Shengyao Zhuang, Ahmed Mourad, Xueguang Ma, Jimmy Lin, Guido Zuccon

In this paper we consider the problem of combining the relevance signals from sparse and dense retrievers in the context of Pseudo Relevance Feedback (PRF).

Information Retrieval Language Modelling +1

Is Non-IID Data a Threat in Federated Online Learning to Rank?

1 code implementation • 20 Apr 2022 • Shuyi Wang, Guido Zuccon

A well-known factor that affects the performance of federated learning systems, and that poses serious challenges to these approaches, is that there may be some type of bias in the way data is distributed across clients.

Federated Learning Information Retrieval +2

CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos

1 code implementation • 1 Apr 2022 • Shengyao Zhuang, Guido Zuccon

We then demonstrate that the root cause of this resides in the input tokenization strategy employed by BERT.

Passage Retrieval Retrieval

Implicit Feedback for Dense Passage Retrieval: A Counterfactual Approach

1 code implementation • 1 Apr 2022 • Shengyao Zhuang, Hang Li, Guido Zuccon

We then exploit such historic implicit interactions to improve the effectiveness of a DR. A key challenge that we study is the effect that biases in the click signal, such as position bias, have on the DRs.

Passage Retrieval Retrieval

Asyncval: A Toolkit for Asynchronously Validating Dense Retriever Checkpoints during Training

1 code implementation • 25 Feb 2022 • Shengyao Zhuang, Guido Zuccon

A simple and efficient strategy to validate deep learning checkpoints is the addition of validation loops to execute during training.

Natural Questions Passage Retrieval +1

Reinforcement Online Learning to Rank with Unbiased Reward Shaping

1 code implementation • 5 Jan 2022 • Shengyao Zhuang, Zhihao Qiao, Guido Zuccon

Online learning to rank (OLTR) aims to learn a ranker directly from implicit feedback derived from users' interactions, such as clicks.


Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback: A Reproducibility Study

1 code implementation • 13 Dec 2021 • Hang Li, Shengyao Zhuang, Ahmed Mourad, Xueguang Ma, Jimmy Lin, Guido Zuccon

Finally, we contribute a study of the generalisability of the ANCE-PRF method when dense retrievers other than ANCE are used for the first round of retrieval and for encoding the PRF signal.


Seed-driven Document Ranking for Systematic Reviews: A Reproducibility Study

1 code implementation • 8 Dec 2021 • Shuai Wang, Harrisen Scells, Ahmed Mourad, Guido Zuccon

Our results also indicate that our reproduced screening prioritisation method (1) is generalisable across datasets of similar and different topicality compared to the original implementation, (2) becomes more effective when multiple seed studies are used with our techniques to enable this, and (3) produces more stable rankings with multiple seed studies than with single seed studies.

Document Ranking

ActiveEA: Active Learning for Neural Entity Alignment

1 code implementation • EMNLP 2021 • Bing Liu, Harrisen Scells, Guido Zuccon, Wen Hua, Genghong Zhao

Entity Alignment (EA) aims to match equivalent entities across different Knowledge Graphs (KGs) and is an essential step of KG fusion.

Active Learning Entity Alignment +1

Dealing with Typos for BERT-based Passage Retrieval and Ranking

2 code implementations • EMNLP 2021 • Shengyao Zhuang, Guido Zuccon

Our experimental results on the MS MARCO passage ranking dataset show that, with our proposed typos-aware training, DR and BERT re-ranker can become robust to typos in queries, resulting in significantly improved effectiveness compared to models trained without appropriately accounting for typos.

Language Modelling Open-Domain Question Answering +5

Pseudo Relevance Feedback with Deep Language Models and Dense Retrievers: Successes and Pitfalls

1 code implementation • 25 Aug 2021 • Hang Li, Ahmed Mourad, Shengyao Zhuang, Bevan Koopman, Guido Zuccon

Text-based PRF results show that the use of PRF had a mixed effect on deep rerankers across different datasets.


Fast Passage Re-ranking with Contextualized Exact Term Matching and Efficient Passage Expansion

1 code implementation • 19 Aug 2021 • Shengyao Zhuang, Guido Zuccon

BERT-based information retrieval models are expensive, in both time (query latency) and computational resources (energy, hardware cost), making many of these models impractical especially under resource constraints.

Information Retrieval Passage Re-Ranking +2

The Benefits of Word Embeddings Features for Active Learning in Clinical Information Extraction

no code implementations • ALTA 2016 • Mahnoosh Kholghi, Lance De Vine, Laurianne Sitbon, Guido Zuccon, Anthony Nguyen

This study investigates the use of unsupervised word embeddings and sequence features for sample representation in an active learning framework built to extract clinical concepts from clinical free text.

Active Learning Informativeness +1

Building Evaluation Datasets for Consumer-Oriented Information Retrieval

no code implementations • LREC 2016 • Lorraine Goeuriot, Liadh Kelly, Guido Zuccon, Joao Palotti

In this paper we present the datasets created by CLEF eHealth Lab from 2013-2015 for evaluation of search solutions to support common people finding health information online.

Information Retrieval Retrieval
