no code implementations • 25 Nov 2022 • Ophir Frieder, Ida Mele, Cristina Ioana Muntean, Franco Maria Nardini, Raffaele Perego, Nicola Tonellotto
The high cache hit rates we achieve significantly improve the responsiveness of conversational systems while also reducing the number of queries handled by the search back-end.
1 code implementation • 18 Aug 2022 • Sean MacAvaney, Nicola Tonellotto, Craig Macdonald
Search systems often employ a re-ranking pipeline, wherein documents (or passages) from an initial pool of candidates are assigned new ranking scores.
no code implementations • 27 Jul 2022 • Nicola Tonellotto
These lecture notes focus on the recent advancements in neural information retrieval, with particular emphasis on the systems and models exploiting transformer networks.
1 code implementation • 24 Apr 2022 • Antonio Mallia, Joel Mackenzie, Torsten Suel, Nicola Tonellotto
Neural information retrieval architectures based on transformers such as BERT are able to significantly improve system effectiveness over traditional sparse models such as BM25.
1 code implementation • 25 Aug 2021 • Craig Macdonald, Nicola Tonellotto
In this work, we investigate the use of ANN scores for ranking the candidate documents, in order to decrease the number of candidate documents being fully scored.
1 code implementation • 23 Aug 2021 • Nicola Tonellotto, Craig Macdonald
Recent advances in dense retrieval techniques have offered the promise of being able not just to re-rank documents using contextualised language models such as BERT, but also to use such models to identify documents from the collection in the first place.
1 code implementation • 13 Aug 2021 • Craig Macdonald, Nicola Tonellotto, Iadh Ounis
The advent of contextualised language models has brought gains in search effectiveness, not just when applied to re-ranking the output of classical weighting models such as BM25, but also when used directly for passage indexing and retrieval, a technique called dense retrieval.
3 code implementations • 21 Jun 2021 • Xiao Wang, Craig Macdonald, Nicola Tonellotto, Iadh Ounis
In particular, based on the pseudo-relevant set of documents identified using a first-pass dense retrieval, we extract representative feedback embeddings (using KMeans clustering) -- while ensuring that these embeddings discriminate among passages (based on IDF) -- which are then added to the query representation.
Ranked #1 on TREC 2019 Passage Ranking on MSMARCO
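As a rough illustration of the feedback step described in this abstract, the sketch below clusters pseudo-relevant embeddings with a small k-means, keeps the centroids with the highest IDF, and appends them to the query representation. It is a minimal sketch under stated assumptions, not the authors' implementation: `idf_of` is a hypothetical callback that returns an IDF value for a centroid, and `k`, `n_keep`, `beta`, and the deterministic initialisation are illustrative choices.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def kmeans(points: Sequence[Sequence[float]], k: int, iters: int = 10) -> List[Vector]:
    """Tiny k-means; centroids start at the first k points (an assumption made for determinism)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters: List[List[Sequence[float]]] = [[] for _ in range(k)]
        for p in points:
            # Assign each point to the centroid with the smallest squared distance.
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids


def expand_query(
    query_embs: Sequence[Vector],
    feedback_embs: Sequence[Vector],
    idf_of: Callable[[Vector], float],  # hypothetical: IDF associated with a centroid
    k: int,
    n_keep: int,
    beta: float,
) -> List[Tuple[Vector, float]]:
    """Return (embedding, weight) pairs: original query embeddings plus selected centroids."""
    centroids = kmeans(feedback_embs, k)
    # Keep the most discriminative centroids, as judged by the IDF callback.
    selected = sorted(centroids, key=idf_of, reverse=True)[:n_keep]
    expanded = [(list(e), 1.0) for e in query_embs]
    expanded += [(c, beta * idf_of(c)) for c in selected]
    return expanded
```

How a centroid is tied to an IDF value is left open here; any callable that maps a vector to a number can be passed as `idf_of`.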
1 code implementation • 24 Apr 2021 • Antonio Mallia, Omar Khattab, Nicola Tonellotto, Torsten Suel
Neural information retrieval systems typically use a cascading pipeline, in which a first-stage model retrieves a candidate set of documents and one or more subsequent stages re-rank this set using contextualized language models such as BERT.
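The cascading pipeline described here can be sketched as a sequence of (scorer, cutoff) stages, where each stage re-scores only the survivors of the previous one. This is a toy illustration: `term_overlap` stands in for a cheap first-stage model and `toy_reranker` for an expensive contextualized model; both are hypothetical placeholders, not real retrieval functions.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Scorer = Callable[[str, str], float]


def top_k(query: str, doc_ids: Sequence[str], docs: Dict[str, str],
          scorer: Scorer, k: int) -> List[str]:
    """Score the given documents and keep the k highest-scoring ids."""
    ranked = sorted(doc_ids, key=lambda d: scorer(query, docs[d]), reverse=True)
    return ranked[:k]


def cascade(query: str, docs: Dict[str, str],
            stages: Sequence[Tuple[Scorer, int]]) -> List[str]:
    """Run the stages in order; the first stage sees the whole collection."""
    candidates = list(docs)
    for scorer, k in stages:
        candidates = top_k(query, candidates, docs, scorer, k)
    return candidates


def term_overlap(query: str, text: str) -> float:
    """Cheap first-stage stand-in: number of distinct shared terms."""
    return float(len(set(query.lower().split()) & set(text.lower().split())))


def toy_reranker(query: str, text: str) -> float:
    """Placeholder for an expensive model: overlap normalised by document length."""
    words = text.lower().split()
    return term_overlap(query, text) / len(words) if words else 0.0
```

Because later stages only score the documents that survive earlier cutoffs, the expensive scorer is called on a small candidate set rather than the whole collection.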
8 code implementations • 28 Jul 2020 • Craig Macdonald, Nicola Tonellotto
The advent of deep machine learning platforms such as TensorFlow and PyTorch, developed in expressive high-level languages such as Python, has allowed more expressive representations of deep neural network architectures.
1 code implementation • 29 Apr 2020 • Sean MacAvaney, Franco Maria Nardini, Raffaele Perego, Nicola Tonellotto, Nazli Goharian, Ophir Frieder
We also observe that the performance is additive with the current leading first-stage retrieval methods, further narrowing the gap between inexpensive and cost-prohibitive passage ranking approaches.
1 code implementation • 29 Apr 2020 • Sean MacAvaney, Franco Maria Nardini, Raffaele Perego, Nicola Tonellotto, Nazli Goharian, Ophir Frieder
Deep pretrained transformer networks are effective at various ranking tasks, such as question answering and ad-hoc document ranking.
1 code implementation • 29 Apr 2020 • Sean MacAvaney, Franco Maria Nardini, Raffaele Perego, Nicola Tonellotto, Nazli Goharian, Ophir Frieder
We show that the proposed heuristics can be used to build a training curriculum that down-weights difficult samples early in the training process.
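One way to picture a curriculum that down-weights difficult samples early in training is a per-sample loss weight that starts low for hard samples and rises to 1 by the end of a warm-up period. The linear schedule, the `difficulty` scale in [0, 1], and the `warmup_epochs` parameter below are assumptions made for illustration; the abstract does not specify the exact weighting function.

```python
from typing import Sequence


def curriculum_weight(difficulty: float, epoch: int, warmup_epochs: int) -> float:
    """Loss weight for one sample.

    difficulty: heuristic score in [0, 1], where 1 is hardest (assumed scale).
    During warm-up the weight moves linearly from (1 - difficulty) to 1.
    """
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty must be in [0, 1]")
    if epoch >= warmup_epochs:
        return 1.0  # curriculum finished: all samples weighted equally
    progress = epoch / warmup_epochs
    easiness = 1.0 - difficulty
    return easiness + progress * (1.0 - easiness)


def weighted_mean_loss(losses: Sequence[float], difficulties: Sequence[float],
                       epoch: int, warmup_epochs: int) -> float:
    """Average of per-sample losses after applying the curriculum weights."""
    weights = [curriculum_weight(d, epoch, warmup_epochs) for d in difficulties]
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)
```

With this schedule, the easiest samples always contribute fully, while the hardest ones contribute nothing at epoch 0 and reach full weight once the warm-up ends.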
no code implementations • 9 Jan 2020 • Ida Mele, Nicola Tonellotto, Ophir Frieder, Raffaele Perego
The results of queries characterized by a topic are kept in the fraction of the cache dedicated to it.
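The idea of dedicating a fraction of the cache to each topic can be sketched as a set of independent sections with their own capacities. The class below is a minimal sketch: the `TopicCache` name, the per-topic entry counts, the LRU eviction inside each section, and the choice to skip queries with an unknown topic are all assumptions, since the abstract only states that results are kept in the section dedicated to the query's topic.

```python
from collections import OrderedDict
from typing import Dict, List, Optional


class TopicCache:
    """Result cache split into per-topic sections, each evicting independently."""

    def __init__(self, capacity_per_topic: Dict[str, int]) -> None:
        self._capacity = dict(capacity_per_topic)
        self._sections: Dict[str, "OrderedDict[str, List[str]]"] = {
            topic: OrderedDict() for topic in capacity_per_topic
        }

    def get(self, topic: str, query: str) -> Optional[List[str]]:
        """Return cached results, or None on a miss."""
        section = self._sections.get(topic)
        if section is None or query not in section:
            return None
        section.move_to_end(query)  # mark as most recently used
        return section[query]

    def put(self, topic: str, query: str, results: List[str]) -> None:
        """Store results in the section for this topic, evicting the LRU entry if full."""
        section = self._sections.get(topic)
        if section is None:
            return  # queries without a known topic are not cached in this sketch
        section[query] = results
        section.move_to_end(query)
        if len(section) > self._capacity[topic]:
            section.popitem(last=False)  # drop the least recently used entry
```

Because each section evicts on its own, a burst of queries on one topic cannot push out results cached for another.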