no code implementations • 23 Sep 2024 • Benjamin Clavié, Antoine Chaffin, Griffin Adams
This method can reduce the space and memory footprint of ColBERT indexes by 50%, with virtually no degradation in retrieval performance.
1 code implementation • 30 Aug 2024 • Benjamin Clavié
This paper presents rerankers, a Python library which provides an easy-to-use interface to the most commonly used re-ranking approaches.
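The appeal of a single interface across re-ranking approaches can be illustrated with a minimal, stdlib-only sketch. Note this is an illustration of the concept only, not the actual rerankers API; the `ToyReranker` class and its overlap-based scoring are invented for the example:

```python
# Hypothetical sketch of a unified re-ranking interface (NOT the actual
# rerankers API): every backend would expose the same rank(query, docs) call,
# so swapping re-ranking approaches means swapping one object.
from dataclasses import dataclass


@dataclass
class RankedDoc:
    text: str
    score: float


class ToyReranker:
    """Stand-in backend: scores documents by query-term overlap."""

    def rank(self, query: str, docs: list[str]) -> list[RankedDoc]:
        q_terms = set(query.lower().split())
        scored = [
            RankedDoc(d, len(q_terms & set(d.lower().split())) / len(q_terms))
            for d in docs
        ]
        return sorted(scored, key=lambda r: r.score, reverse=True)


results = ToyReranker().rank(
    "neural information retrieval",
    ["a paper on neural information retrieval", "a cookbook of soup recipes"],
)
print(results[0].text)  # the on-topic document ranks first
```

A real cross-encoder or LLM-based backend would replace the overlap score with a model call, while callers keep using the same `rank` signature.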
no code implementations • 30 Jul 2024 • Benjamin Clavié
Neural Information Retrieval has advanced rapidly in high-resource languages, but progress in lower-resource ones such as Japanese has been hindered by data scarcity, among other challenges.
no code implementations • 26 Dec 2023 • Benjamin Clavié
As language-specific training data tends to be sparsely available compared to English, document retrieval in many languages has largely relied on multilingual models.
no code implementations • 7 Jul 2023 • Benjamin Clavié, Guillaume Soulié
We generate synthetic training data for the entirety of ESCO skills and train a classifier to extract skill mentions from job posts.
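The idea of generating synthetic training data for every skill label and then fitting an extractor on it can be sketched in a few lines. This is a heavily simplified, hypothetical illustration: the paper's actual pipeline uses generated data and a trained classifier, whereas the templates, skill names, and token-matching "extractor" below are invented for the example:

```python
# Hypothetical, toy version of the synthetic-data pipeline: template-generate
# labelled sentences per skill, then extract skill mentions from a job post.
# (Invented templates/skills; a real system would train a neural classifier.)
TEMPLATES = [
    "Experience with {skill} is required.",
    "The candidate should know {skill}.",
]
SKILLS = ["python programming", "project management"]

# 1. Generate synthetic (sentence, label) training pairs for every skill.
synthetic = [(t.format(skill=s), s) for s in SKILLS for t in TEMPLATES]

# 2. "Train" a toy extractor: remember which tokens signal each skill
#    (stand-in for fitting a classifier on the synthetic pairs above).
skill_tokens = {s: set(s.split()) for s in SKILLS}


def extract_skills(job_post: str) -> list[str]:
    """Return every skill whose signal tokens all appear in the post."""
    tokens = set(job_post.lower().split())
    return [s for s, toks in skill_tokens.items() if toks <= tokens]


print(extract_skills("We need strong python programming skills."))
```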
no code implementations • 13 Mar 2023 • Benjamin Clavié, Alexandru Ciceu, Frederick Naylor, Guillaume Soulié, Thomas Brightwell
Furthermore, we observe that the wording of the prompt is a critical factor in eliciting the appropriate "reasoning" in the model, and that seemingly minor aspects of the prompt significantly affect the model's performance.
no code implementations • 15 Sep 2021 • Benjamin Clavié, Marc Alphonsus
We highlight an interesting trend, aiming to contribute to the ongoing debate around advances within legal Natural Language Processing.
Ranked #8 on Natural Language Understanding on LexGLUE
no code implementations • 2 Sep 2021 • Benjamin Clavié, Akshita Gheewala, Paul Briton, Marc Alphonsus, Rym Laabiyad, Francesco Piccoli
Large Transformer-based language models such as BERT have led to broad performance improvements on many NLP tasks.
no code implementations • 13 Jul 2020 • Benjamin Clavié, Kobi Gal
We introduce DeepPerfEmb, or DPE, a new deep-learning model that learns dense representations of students' online behaviour, together with meta-data about students and educational content.
no code implementations • 2 Dec 2019 • Benjamin Clavié, Kobi Gal
The use of large pretrained neural networks to create contextualized word embeddings has drastically improved performance on several natural language processing (NLP) tasks.