2 code implementations • 20 Feb 2024 • Hippolyte Gisserot-Boukhlef, Manuel Faysse, Emmanuel Malherbe, Céline Hudelot, Pierre Colombo
Neural Information Retrieval (NIR) has significantly improved upon heuristic-based IR systems.
no code implementations • 14 Feb 2024 • Matthieu Meeus, Igor Shilov, Manuel Faysse, Yves-Alexandre de Montjoye
We propose copyright traps, fictitious entries inserted into original content, to detect the use of copyrighted material in LLMs, focusing on models in which memorization does not occur naturally.
1 code implementation • 1 Feb 2024 • Manuel Faysse, Patrick Fernandes, Nuno M. Guerreiro, António Loison, Duarte M. Alves, Caio Corro, Nicolas Boizard, João Alves, Ricardo Rei, Pedro H. Martins, Antoni Bigata Casademunt, François Yvon, André F. T. Martins, Gautier Viaud, Céline Hudelot, Pierre Colombo
We introduce CroissantLLM, a 1.3B language model pretrained on a set of 3T English and French tokens, to bring to the research and industrial community a high-performance, fully open-sourced bilingual model that runs swiftly on consumer-grade local hardware.
1 code implementation • 21 Oct 2023 • Manuel Faysse, Gautier Viaud, Céline Hudelot, Pierre Colombo
Instruction Fine-Tuning (IFT) is a powerful paradigm that strengthens the zero-shot capabilities of Large Language Models (LLMs), but in doing so induces new evaluation metric requirements.