no code implementations • 22 Apr 2024 • Thibault Formal, Stéphane Clinchant, Hervé Déjean, Carlos Lassance
The late interaction paradigm introduced with ColBERT stands out in the neural Information Retrieval space, offering a compelling effectiveness-efficiency trade-off across many benchmarks.
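As an illustrative sketch (not the paper's implementation), the MaxSim operator at the core of ColBERT-style late interaction scores a document by matching each query token embedding against its best document token embedding and summing:

```python
import numpy as np

def maxsim_score(query_embs, doc_embs):
    """Late-interaction relevance: for each query token embedding, take
    the maximum similarity over all document token embeddings, then sum
    over query tokens (the ColBERT MaxSim operator)."""
    # similarity matrix: (num_query_tokens, num_doc_tokens)
    sims = query_embs @ doc_embs.T
    return sims.max(axis=1).sum()

# toy example: 2 query tokens and 3 document tokens in 4 dimensions
q = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
d = np.array([[0.9, 0.1, 0.0, 0.0],
              [0.0, 0.8, 0.2, 0.0],
              [0.1, 0.1, 0.9, 0.0]])
score = maxsim_score(q, d)  # 0.9 + 0.8 = 1.7
```

Because documents are encoded into per-token embeddings offline, only the cheap MaxSim aggregation happens at query time, which is the source of the effectiveness-efficiency trade-off mentioned above.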
no code implementations • 20 Apr 2024 • Carlos Lassance, Hervé Déjean, Stéphane Clinchant, Nicola Tonellotto
Learned sparse models such as SPLADE have successfully shown how to incorporate the benefits of state-of-the-art neural information retrieval models into the classical inverted index data structure.
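A minimal sketch of how learned term weights plug into a classical inverted index (the weights here are hard-coded for illustration; in a SPLADE-style model they come from a neural network):

```python
from collections import defaultdict

# Hypothetical learned term weights; a real model predicts these.
doc_term_weights = {
    "d1": {"neural": 1.2, "retrieval": 0.8},
    "d2": {"inverted": 1.5, "index": 1.1, "retrieval": 0.3},
}

# Build the inverted index: term -> list of (doc_id, weight) postings.
index = defaultdict(list)
for doc_id, terms in doc_term_weights.items():
    for term, w in terms.items():
        index[term].append((doc_id, w))

def score(query_weights):
    """Dot product between sparse query and document vectors, computed
    by traversing only the postings lists of the query terms."""
    scores = defaultdict(float)
    for term, qw in query_weights.items():
        for doc_id, dw in index.get(term, []):
            scores[doc_id] += qw * dw
    return dict(scores)

results = score({"retrieval": 1.0, "index": 0.5})
# d1: 1.0*0.8 = 0.8 ; d2: 1.0*0.3 + 0.5*1.1 = 0.85
```

The point of the data structure is that scoring touches only postings for terms the query actually activates, so sparser representations mean faster retrieval.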
no code implementations • 11 Mar 2024 • Carlos Lassance, Hervé Déjean, Thibault Formal, Stéphane Clinchant
A companion to the release of the latest version of the SPLADE library.
no code implementations • 30 Nov 2023 • Haonan Chen, Carlos Lassance, Jimmy Lin
The bi-encoder architecture provides a framework for understanding machine-learned retrieval models based on dense and sparse vector representations.
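The defining property of a bi-encoder is that queries and documents are encoded independently, so document vectors can be precomputed offline. A toy sketch with a stand-in bag-of-words encoder (a real bi-encoder uses a neural network; the hashing scheme here is purely illustrative):

```python
import numpy as np

vocab = {}

def encode(text, dim=16):
    """Stand-in encoder: maps tokens to fixed slots of an L2-normalized
    vector. Only the independence of query/doc encoding matters here."""
    vec = np.zeros(dim)
    for tok in text.lower().split():
        vec[vocab.setdefault(tok, len(vocab)) % dim] += 1.0
    n = np.linalg.norm(vec)
    return vec / n if n > 0 else vec

# Documents are encoded once, offline, independently of any query.
docs = ["sparse retrieval with inverted indexes",
        "dense retrieval with nearest neighbour search"]
doc_matrix = np.stack([encode(d) for d in docs])

def search(query):
    # Relevance is the inner product between query and document vectors.
    return doc_matrix @ encode(query)

scores = search("dense retrieval")  # second document matches best
```

Swapping the vector type (dense real-valued vs. sparse vocabulary-sized) changes the index structure but not the scoring framework, which is the unifying view the snippet above refers to.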
2 code implementations • 13 Jun 2023 • Ehsan Kamalloo, Nandan Thakur, Carlos Lassance, Xueguang Ma, Jheng-Hong Yang, Jimmy Lin
BEIR is a benchmark dataset for zero-shot evaluation of information retrieval models across 18 different domain/task combinations.
1 code implementation • 5 Jun 2023 • Hervé Déjean, Stéphane Clinchant, Carlos Lassance, Simon Lupart, Thibault Formal
We compare both dense and sparse approaches under various finetuning protocols and middle training on different collections (MS MARCO, Wikipedia or Tripclick).
no code implementations • 25 Apr 2023 • Carlos Lassance, Simon Lupart, Hervé Déjean, Stéphane Clinchant, Nicola Tonellotto
Sparse neural retrievers, such as DeepImpact, uniCOIL and SPLADE, have been introduced recently as an efficient and effective way to perform retrieval with inverted indexes.
no code implementations • 25 Apr 2023 • Carlos Lassance, Stéphane Clinchant
This paper therefore reports on the importance of this issue, so that researchers are made aware of the problem and can report their results appropriately.
2 code implementations • 4 Apr 2023 • Jheng-Hong Yang, Carlos Lassance, Rafael Sampaio de Rezende, Krishna Srinivasan, Miriam Redi, Stéphane Clinchant, Jimmy Lin
This paper presents the AToMiC (Authoring Tools for Multimedia Content) dataset, designed to advance research in image/text cross-modal retrieval.
no code implementations • 3 Apr 2023 • Jimmy Lin, David Alfonso-Hermelo, Vitor Jeronymo, Ehsan Kamalloo, Carlos Lassance, Rodrigo Nogueira, Odunayo Ogundepo, Mehdi Rezagholizadeh, Nandan Thakur, Jheng-Hong Yang, Xinyu Zhang
The advent of multilingual language models has generated a resurgence of interest in cross-lingual information retrieval (CLIR), which is the task of searching documents in one language with queries from another.
1 code implementation • 23 Mar 2023 • Vaishali Pal, Carlos Lassance, Hervé Déjean, Stéphane Clinchant
While previous studies have only experimented with dense retrievers or in a cross-lingual retrieval scenario, in this paper we aim to complete the picture on the use of adapters in IR.
no code implementations • 10 Mar 2023 • Carlos Lassance, Stéphane Clinchant
This paper describes our participation in the 2022 TREC NeuCLIR challenge.
no code implementations • 28 Feb 2023 • Carlos Lassance
This paper describes our participation in the 2023 WSDM CUP - MIRACL challenge.
no code implementations • 24 Feb 2023 • Carlos Lassance, Stéphane Clinchant
This paper describes our participation in the 2022 TREC Deep Learning challenge.
no code implementations • 25 Jan 2023 • Carlos Lassance, Hervé Déjean, Stéphane Clinchant
In this paper, we study the impact of the pretraining collection on the final IR effectiveness.
1 code implementation • 8 Jul 2022 • Carlos Lassance, Stéphane Clinchant
SPLADE efficiency can be controlled via a regularization factor, but controlling this regularization alone has been shown to be insufficient to reach the desired efficiency.
1 code implementation • 10 May 2022 • Thibault Formal, Carlos Lassance, Benjamin Piwowarski, Stéphane Clinchant
Neural retrievers based on dense representations combined with Approximate Nearest Neighbors search have recently received a lot of attention, owing their success to distillation and/or better sampling of training examples, while still relying on the same backbone architecture.
no code implementations • 14 Apr 2022 • Carlos Lassance, Thibault Formal, Stéphane Clinchant
Second, CCSA can be used as a binary quantization method and we propose to combine it with the recent graph based ANN techniques.
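A hedged sketch of the binary-quantization idea: in the paper CCSA learns the representation, whereas here a simple sign function stands in for it, and Hamming distance on the resulting bits serves as the cheap proxy distance:

```python
import numpy as np

def binarize(vectors):
    """Sign-based binary quantization: keep one bit per dimension.
    An illustrative stand-in for a learned binary embedding."""
    return (np.asarray(vectors) > 0).astype(np.uint8)

def hamming(a, b):
    # Number of differing bits; a cheap proxy for the original distance,
    # and the distance typically used inside graph-based ANN search.
    return int(np.count_nonzero(a != b))

x = np.array([0.5, -1.2, 0.3, -0.1])
y = np.array([0.4, -0.9, -0.2, 0.2])
d = hamming(binarize(x), binarize(y))  # bits differ in dims 2 and 3
```

Binary codes shrink the index (one bit per dimension instead of a 32-bit float) and Hamming distance reduces to XOR plus popcount, which is why combining quantization with graph-based ANN is attractive.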
no code implementations • 13 Dec 2021 • Carlos Lassance, Maroua Maachou, Joohee Park, Stéphane Clinchant
Our experiments show that ColBERT indexes can be pruned up to 30% on the MS MARCO passage collection without a significant drop in performance.
1 code implementation • 18 Oct 2021 • Yannis Kalantidis, Carlos Lassance, Jon Almazan, Diane Larlus
Dimensionality reduction methods are unsupervised approaches which learn low-dimensional spaces where some properties of the initial space, typically the notion of "neighborhood", are preserved.
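One simple way to make "preserving the notion of neighborhood" concrete is to measure how many k-nearest-neighbor relations survive the projection. A small sketch (the metric and the random linear projection are illustrative choices, not the paper's method):

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbours of each point (excluding itself)."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    return np.argsort(dists, axis=1)[:, :k]

def neighborhood_preservation(high, low, k=2):
    """Fraction of k-NN relations shared between the original space and
    the low-dimensional space; 1.0 means neighbourhoods are fully kept."""
    nn_high = knn_indices(high, k)
    nn_low = knn_indices(low, k)
    overlap = [len(set(a) & set(b)) for a, b in zip(nn_high, nn_low)]
    return sum(overlap) / (k * len(high))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 10))
P = rng.normal(size=(10, 3)) / np.sqrt(10)  # random linear projection
score = neighborhood_preservation(X, X @ P)  # value between 0 and 1
```

A dimensionality-reduction method can then be seen as maximizing such a preservation criterion (or a differentiable surrogate of it) while shrinking the number of dimensions.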
no code implementations • 8 Oct 2021 • Carlos Lassance, Myriam Bontonou, Mounia Hamidouche, Bastien Pasdeloup, Lucas Drumetz, Vincent Gripon
This chapter is composed of four main parts: tools for visualizing intermediate layers in a DNN, denoising data representations, optimizing graph objective functions and regularizing the learning process.
1 code implementation • 21 Sep 2021 • Thibault Formal, Carlos Lassance, Benjamin Piwowarski, Stéphane Clinchant
Meanwhile, there has been a growing interest in learning sparse representations for documents and queries, that could inherit from the desirable properties of bag-of-words models such as the exact matching of terms and the efficiency of inverted indexes.
no code implementations • 12 Jan 2021 • Mounia Hamidouche, Carlos Lassance, Yuqing Hu, Lucas Drumetz, Bastien Pasdeloup, Vincent Gripon
In machine learning, classifiers are typically susceptible to noise in the training data.
no code implementations • 14 Dec 2020 • Carlos Lassance
In recent years, Deep Learning methods have achieved state-of-the-art performance in a vast range of machine learning tasks, including image classification and multilingual automatic text translation.
no code implementations • 2 Dec 2020 • Vincent Gripon, Carlos Lassance, Ghouthi Boukli Hacene
Learning deep representations to solve complex machine learning tasks has become the prominent trend in the past few years.
1 code implementation • 25 Nov 2020 • Carlos Lassance, Louis Béthune, Myriam Bontonou, Mounia Hamidouche, Vincent Gripon
Measuring the generalization performance of a Deep Neural Network (DNN) without relying on a validation set is a difficult task.
no code implementations • 14 Nov 2020 • Carlos Lassance, Vincent Gripon, Antonio Ortega
However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibits relations (what we call a geometry) on which desired properties can be sought.
1 code implementation • 16 Jul 2020 • Carlos Lassance, Vincent Gripon, Gonzalo Mateos
Graphs are nowadays ubiquitous in the fields of signal processing and machine learning.
1 code implementation • 8 Nov 2019 • Carlos Lassance, Myriam Bontonou, Ghouthi Boukli Hacene, Vincent Gripon, Jian Tang, Antonio Ortega
Specifically, we introduce a graph-based RKD method, in which graphs are used to capture the geometry of latent spaces.
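An illustrative sketch of the graph-based relational idea (not the paper's exact formulation): instead of matching the teacher's features directly, the student matches the pairwise-similarity graph the teacher induces over a batch:

```python
import numpy as np

def pairwise_similarity_graph(features):
    """Adjacency of the batch graph: cosine similarity between every
    pair of examples' latent representations."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    return normed @ normed.T

def graph_rkd_loss(teacher_feats, student_feats):
    """Relational KD objective: penalize discrepancies between the
    teacher's and the student's batch graphs. Note the two latent
    spaces may have different dimensions."""
    gt = pairwise_similarity_graph(teacher_feats)
    gs = pairwise_similarity_graph(student_feats)
    return float(np.mean((gt - gs) ** 2))

rng = np.random.default_rng(0)
teacher = rng.normal(size=(4, 16))   # teacher latent space (16-d)
student = rng.normal(size=(4, 8))    # student latent space (8-d)
loss = graph_rkd_loss(teacher, student)
```

Because only relations between examples are compared, the student's latent space need not have the same dimensionality as the teacher's, which is the main appeal of relational over feature-matching distillation.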
no code implementations • 7 Nov 2019 • Carlos Lassance, Yasir Latif, Ravi Garg, Vincent Gripon, Ian Reid
One solution to this problem is to learn a deep neural network to infer the pose of a query image after learning on a dataset of images with known poses.
no code implementations • 11 Sep 2019 • Carlos Lassance, Vincent Gripon, Jian Tang, Antonio Ortega
Deep Networks have been shown to provide state-of-the-art performance in many machine learning challenges.
no code implementations • 19 Aug 2019 • Myriam Bontonou, Carlos Lassance, Vincent Gripon, Nicolas Farrugia
Predicting the future of Graph-supported Time Series (GTS) is a key challenge in many domains, such as climate monitoring, finance or neuroimaging.
1 code implementation • 29 May 2019 • Ghouthi Boukli Hacene, Carlos Lassance, Vincent Gripon, Matthieu Courbariaux, Yoshua Bengio
In many application domains such as computer vision, Convolutional Layers (CLs) are key to the accuracy of deep learning methods.
no code implementations • 1 May 2019 • Myriam Bontonou, Carlos Lassance, Ghouthi Boukli Hacene, Vincent Gripon, Jian Tang, Antonio Ortega
We introduce a novel loss function for training deep learning architectures to perform classification.
no code implementations • 1 May 2019 • Myriam Bontonou, Carlos Lassance, Jean-Charles Vialatte, Vincent Gripon
Convolutional Neural Networks are very efficient at processing signals defined on a discrete Euclidean space (such as images).