no code implementations • JEP/TALN/RECITAL 2022 • Antoine Chaffin, Vincent Claveau, Ewa Kijak, Sylvain Lamprier, Benjamin Piwowarski, Thomas Scialom, Jacopo Staiano
We evaluate their advantages and drawbacks, exploring their respective accuracy on classification tasks, as well as their impact on cooperative generation and their computational cost, within a state-of-the-art decoding strategy based on Monte Carlo Tree Search (MCTS).
no code implementations • 27 Jun 2024 • Mathias Vast, Basile Van Cooten, Laure Soulier, Benjamin Piwowarski
With the recent addition of Retrieval-Augmented Generation (RAG), the scope and importance of Information Retrieval (IR) have expanded.
1 code implementation • 24 Apr 2024 • Folco Bertini Baldassini, Mustafa Shukor, Matthieu Cord, Laure Soulier, Benjamin Piwowarski
Large Language Models have demonstrated remarkable performance across various tasks, exhibiting the capacity to swiftly acquire new skills, such as through In-Context Learning (ICL) with minimal demonstration examples.
no code implementations • 19 Feb 2024 • Raphaël Mouravieff, Benjamin Piwowarski, Sylvain Lamprier
Table Question-Answering involves both understanding the natural language query and grounding it in the context of the input table to extract the relevant information.
no code implementations • 21 Jan 2024 • Mathias Vast, Yuxuan Zong, Basile Van Cooten, Benjamin Piwowarski, Laure Soulier
In Information Retrieval, and more generally in Natural Language Processing, adapting models to specific domains is conducted through fine-tuning.
1 code implementation • 20 Feb 2023 • Guglielmo Faggioli, Thibault Formal, Stefano Marchesin, Stéphane Clinchant, Nicola Ferro, Benjamin Piwowarski
On top of that, in lexical-oriented scenarios, QPPs fail to predict performance for neural IR systems on those queries where they differ from traditional approaches the most.
2 code implementations • 26 Jan 2023 • Laura Nguyen, Thomas Scialom, Benjamin Piwowarski, Jacopo Staiano
Text Summarization is a popular task and an active area of research for the Natural Language Processing community.
1 code implementation • 11 Jan 2023 • Nam Le Hai, Thomas Gerald, Thibault Formal, Jian-Yun Nie, Benjamin Piwowarski, Laure Soulier
Conversational search is a difficult task as it aims at retrieving documents based not only on the current user query but also on the full conversation history.
1 code implementation • 10 May 2022 • Thibault Formal, Carlos Lassance, Benjamin Piwowarski, Stéphane Clinchant
Neural retrievers based on dense representations combined with Approximate Nearest Neighbors search have recently received a lot of attention, owing their success to distillation and/or better sampling of examples for training -- while still relying on the same backbone architecture.
1 code implementation • 25 Apr 2022 • Antoine Chaffin, Thomas Scialom, Sylvain Lamprier, Jacopo Staiano, Benjamin Piwowarski, Ewa Kijak, Vincent Claveau
Language models generate texts by successively predicting probability distributions for next tokens given past ones.
no code implementations • 28 Jan 2022 • Sylvain Lamprier, Thomas Scialom, Antoine Chaffin, Vincent Claveau, Ewa Kijak, Jacopo Staiano, Benjamin Piwowarski
Generative Adversarial Networks (GANs) have had tremendous success in many continuous generation tasks, especially in the field of image generation.
no code implementations • 10 Dec 2021 • Thibault Formal, Benjamin Piwowarski, Stéphane Clinchant
Neural Information Retrieval models hold the promise to replace lexical matching models, e.g., BM25, in modern search engines.
1 code implementation • 21 Sep 2021 • Thibault Formal, Carlos Lassance, Benjamin Piwowarski, Stéphane Clinchant
Meanwhile, there has been a growing interest in learning sparse representations for documents and queries, that could inherit from the desirable properties of bag-of-words models such as the exact matching of terms and the efficiency of inverted indexes.
Ranked #5 on Zero-shot Text Search on BEIR
1 code implementation • Findings (EMNLP) 2021 • Laura Nguyen, Thomas Scialom, Jacopo Staiano, Benjamin Piwowarski
Motivated by human reading strategies, this paper presents Skim-Attention, a new attention mechanism that takes advantage of the structure of the document and its layout.
1 code implementation • 12 Jul 2021 • Thibault Formal, Benjamin Piwowarski, Stéphane Clinchant
In neural Information Retrieval, ongoing research is directed towards improving the first retriever in ranking pipelines.
no code implementations • NeurIPS 2021 • Thomas Scialom, Paul-Alexis Dray, Sylvain Lamprier, Benjamin Piwowarski, Jacopo Staiano
Due to the discrete nature of words, language GANs must be optimized from rewards provided by discriminator networks, via reinforcement learning methods.
2 code implementations • EMNLP 2021 • Clément Rebuffel, Thomas Scialom, Laure Soulier, Benjamin Piwowarski, Sylvain Lamprier, Jacopo Staiano, Geoffrey Scoutheeten, Patrick Gallinari
QuestEval is a reference-less metric for text-to-text tasks that compares generated summaries directly to the source text by automatically asking and answering questions.
1 code implementation • EMNLP 2021 • Thomas Scialom, Paul-Alexis Dray, Patrick Gallinari, Sylvain Lamprier, Benjamin Piwowarski, Jacopo Staiano, Alex Wang
Summarization evaluation remains an open research problem: current metrics such as ROUGE are known to be limited and to correlate poorly with human judgments.
no code implementations • 17 Dec 2020 • Thibault Formal, Benjamin Piwowarski, Stéphane Clinchant
Transformer-based models are nowadays state-of-the-art in ad-hoc Information Retrieval, but their behavior is far from being understood.
no code implementations • 13 Nov 2020 • Patrick Bordes, Eloi Zablocki, Benjamin Piwowarski, Patrick Gallinari
We show the efficiency of our Cross-Modal CycleGAN model (CM-GAN) on the ImageNet T-ZSL task where we obtain state-of-the-art results.
no code implementations • NeurIPS 2020 • Thomas Scialom, Paul-Alexis Dray, Sylvain Lamprier, Benjamin Piwowarski, Jacopo Staiano
Training regimes based on Maximum Likelihood Estimation (MLE) suffer from known limitations, often leading to poorly generated text sequences.
no code implementations • EMNLP 2020 • Thomas Scialom, Paul-Alexis Dray, Sylvain Lamprier, Benjamin Piwowarski, Jacopo Staiano
We present MLSUM, the first large-scale MultiLingual SUMmarization dataset.
1 code implementation • ICML 2020 • Thomas Scialom, Paul-Alexis Dray, Sylvain Lamprier, Benjamin Piwowarski, Jacopo Staiano
We introduce a novel approach for sequence decoding, Discriminative Adversarial Search (DAS), which has the desirable properties of alleviating the effects of exposure bias without requiring external metrics.
no code implementations • IJCNLP 2019 • Patrick Bordes, Eloi Zablocki, Laure Soulier, Benjamin Piwowarski, Patrick Gallinari
To overcome this limitation, we propose to transfer visual information to textual representations by learning an intermediate representation space: the grounded space.
2 code implementations • IJCNLP 2019 • Thomas Scialom, Sylvain Lamprier, Benjamin Piwowarski, Jacopo Staiano
Abstractive summarization approaches based on Reinforcement Learning (RL) have recently been proposed to overcome classical likelihood maximization.
no code implementations • ACL 2019 • Thomas Scialom, Benjamin Piwowarski, Jacopo Staiano
Neural architectures based on self-attention, such as Transformers, have recently attracted interest from the research community and obtained significant improvements over the state of the art in several tasks.
no code implementations • ACL 2019 • Étienne Simon, Vincent Guigue, Benjamin Piwowarski
Unsupervised relation extraction aims at extracting relations between entities in text.
no code implementations • 24 Apr 2019 • Eloi Zablocki, Patrick Bordes, Benjamin Piwowarski, Laure Soulier, Patrick Gallinari
Zero-Shot Learning (ZSL) aims at classifying unlabeled objects by leveraging auxiliary knowledge, such as semantic representations.
no code implementations • 9 Nov 2017 • Éloi Zablocki, Benjamin Piwowarski, Laure Soulier, Patrick Gallinari
Representing the semantics of words is a long-standing problem for the natural language processing community.
no code implementations • 21 May 2016 • Gaurav Singh, Benjamin Piwowarski
We present a novel method for efficiently searching the top-k neighbors of documents represented in a high-dimensional space of terms, based on cosine similarity.
no code implementations • 6 Oct 2015 • Benjamin Piwowarski, Sylvain Lamprier, Nicolas Despres
Although they present good abilities to cope with both term dependencies and vocabulary mismatch problems, thanks to the distributed representation of words they are based upon, such models could not be used readily in IR, where the estimation of one language model per document (or query) is required.