1 code implementation • 9 Oct 2024 • Pravesh Agrawal, Szymon Antoniak, Emma Bou Hanna, Baptiste Bout, Devendra Chaplot, Jessica Chudnovsky, Diogo Costa, Baudouin De Monicault, Saurabh Garg, Theophile Gervet, Soham Ghosh, Amélie Héliou, Paul Jacob, Albert Q. Jiang, Kartik Khandelwal, Timothée Lacroix, Guillaume Lample, Diego Las Casas, Thibaut Lavril, Teven Le Scao, Andy Lo, William Marshall, Louis Martin, Arthur Mensch, Pavankumar Muddireddy, Valera Nemychnikova, Marie Pellat, Patrick von Platen, Nikhil Raghuraman, Baptiste Rozière, Alexandre Sablayrolles, Lucile Saulnier, Romain Sauvestre, Wendy Shang, Roman Soletskyi, Lawrence Stewart, Pierre Stock, Joachim Studnia, Sandeep Subramanian, Sagar Vaze, Thomas Wang, Sophia Yang
Unlike many open-source models, Pixtral is also a cutting-edge text model for its size, and does not compromise on natural language performance to excel in multimodal tasks.
5 code implementations • 8 Jan 2024 • Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed
In particular, Mixtral vastly outperforms Llama 2 70B on mathematics, code generation, and multilingual benchmarks.
Ranked #12 on Question Answering on PIQA
7 code implementations • 10 Oct 2023 • Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed
We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency (a minimal loading sketch follows this entry).
Ranked #5 on Zero-Shot Video Question Answer on NExT-GQA
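The released weights can be run with standard open-source tooling. Below is a minimal, hedged sketch of loading and sampling from the model with Hugging Face transformers, assuming the public checkpoint name mistralai/Mistral-7B-v0.1 and that accelerate is installed for device placement.

```python
# Minimal sketch: load Mistral 7B v0.1 with Hugging Face transformers and generate.
# Assumes the public checkpoint name "mistralai/Mistral-7B-v0.1"; adjust to your setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

inputs = tokenizer("Grouped-query attention lets Mistral 7B", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```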
1 code implementation • 7 Jun 2023 • Alexandre Sablayrolles, Yue Wang, Brian Karrer
Privately generating synthetic data from a table is an important building block of a privacy-first world.
no code implementations • 24 Oct 2022 • Chuan Guo, Alexandre Sablayrolles, Maziar Sanjabi
Differential privacy (DP) is by far the most widely accepted framework for mitigating privacy risks in machine learning.
1 code implementation • 7 Oct 2022 • Tom Sander, Pierre Stock, Alexandre Sablayrolles
Differentially Private methods for training Deep Neural Networks (DNNs) have progressed recently, in particular with the use of massive batches and aggregated data augmentations for a large number of training steps.
1 code implementation • 6 Oct 2022 • Samuel Maddock, Alexandre Sablayrolles, Pierre Stock
We propose a novel method, CANIFE, that uses canaries, samples carefully crafted by a strong adversary, to evaluate the empirical privacy of a training round.
no code implementations • 12 Apr 2022 • Saeed Mahloujifar, Alexandre Sablayrolles, Graham Cormode, Somesh Jha
A common countermeasure against MI attacks is to utilize differential privacy (DP) during model training to mask the presence of individual examples.
no code implementations • 15 Feb 2022 • Pierre Stock, Igor Shilov, Ilya Mironov, Alexandre Sablayrolles
Reconstruction attacks allow an adversary to regenerate data samples of the training set using access to only a trained model.
1 code implementation • 17 Dec 2021 • Pierre Fernandez, Alexandre Sablayrolles, Teddy Furon, Hervé Jégou, Matthijs Douze
We revisit watermarking techniques based on pre-trained deep networks, in the light of self-supervised approaches.
no code implementations • 17 Dec 2021 • Kenza Amara, Matthijs Douze, Alexandre Sablayrolles, Hervé Jégou
Modern approaches for fast retrieval of similar vectors on billion-scaled datasets rely on compressed-domain approaches such as binary sketches or product quantization.
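To make the setting concrete, here is a minimal sketch of compressed-domain search with product quantization using FAISS; the dimensions, code size, and data are illustrative placeholders rather than the paper's configuration.

```python
# Minimal sketch: compressed-domain similarity search with product quantization (FAISS).
# Parameters and data are illustrative, not the paper's configuration.
import numpy as np
import faiss

d = 128                                              # vector dimensionality
xb = np.random.rand(100_000, d).astype("float32")    # database vectors
xq = np.random.rand(5, d).astype("float32")          # query vectors

# IndexPQ splits each vector into 16 sub-vectors, each quantized to 8 bits,
# so every database vector is stored as a 16-byte code.
index = faiss.IndexPQ(d, 16, 8)
index.train(xb)
index.add(xb)

distances, ids = index.search(xq, 10)                # approximate 10 nearest neighbours per query
print(ids[0])
```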
3 code implementations • 25 Sep 2021 • Ashkan Yousefpour, Igor Shilov, Alexandre Sablayrolles, Davide Testuggine, Karthik Prasad, Mani Malek, John Nguyen, Sayan Ghosh, Akash Bharadwaj, Jessica Zhao, Graham Cormode, Ilya Mironov
We introduce Opacus, a free, open-source PyTorch library for training deep learning models with differential privacy (hosted at opacus.ai).
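A minimal sketch of what differentially private training with Opacus looks like; the model, data, and hyperparameters below are illustrative placeholders, not a recommended configuration.

```python
# Minimal sketch: differentially private training with Opacus.
# The model, data, and hyperparameters are illustrative placeholders.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

model = nn.Linear(20, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
criterion = nn.CrossEntropyLoss()
data = TensorDataset(torch.randn(512, 20), torch.randint(0, 2, (512,)))
loader = DataLoader(data, batch_size=64)

# PrivacyEngine wraps the model, optimizer, and loader so that training
# runs DP-SGD: per-sample gradient clipping plus calibrated Gaussian noise.
privacy_engine = PrivacyEngine()
model, optimizer, loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=loader,
    noise_multiplier=1.0,
    max_grad_norm=1.0,
)

for x, y in loader:
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

print("epsilon spent:", privacy_engine.get_epsilon(delta=1e-5))
```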
2 code implementations • EMNLP 2021 • Chuan Guo, Alexandre Sablayrolles, Hervé Jégou, Douwe Kiela
We propose the first general-purpose gradient-based attack against transformer models.
19 code implementations • ICCV 2021 • Hugo Touvron, Matthieu Cord, Alexandre Sablayrolles, Gabriel Synnaeve, Hervé Jégou
In particular, we investigate the interplay of architecture and optimization of such dedicated transformers.
Ranked #5 on Image Classification on Stanford Cars
34 code implementations • 23 Dec 2020 • Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou
In this work, we produce a competitive convolution-free transformer by training on ImageNet only (a minimal inference sketch follows this entry).
Ranked #4 on Efficient ViTs on ImageNet-1K (with DeiT-S)
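Pretrained DeiT checkpoints are commonly consumed through the timm library; a minimal inference sketch, assuming the deit_base_patch16_224 model name, could look like this.

```python
# Minimal sketch: run a pretrained DeiT image classifier via timm.
# Assumes the model name "deit_base_patch16_224"; weights are downloaded on first use.
import torch
import timm

model = timm.create_model("deit_base_patch16_224", pretrained=True)
model.eval()

image = torch.randn(1, 3, 224, 224)        # stand-in for a preprocessed ImageNet image
with torch.no_grad():
    logits = model(image)
print("predicted class id:", logits.argmax(dim=-1).item())
```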
no code implementations • ICCV 2021 • Hugo Touvron, Alexandre Sablayrolles, Matthijs Douze, Matthieu Cord, Hervé Jégou
By jointly leveraging the coarse labels and the underlying fine-grained latent space, it significantly improves the accuracy of category-level retrieval methods.
Ranked #2 on Learning with coarse labels on cifar100
2 code implementations • ICML 2020 • Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou
The mark is robust to strong variations such as different architectures or optimization methods.
no code implementations • 29 Aug 2019 • Alexandre Sablayrolles, Matthijs Douze, Yann Ollivier, Cordelia Schmid, Hervé Jégou
Membership inference determines, given a sample and trained parameters of a machine learning model, whether the sample was part of the training set.
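As a concrete and deliberately simple illustration of the setting, not the attack analyzed in the paper, the loss-thresholding baseline guesses that samples the model fits unusually well were training members.

```python
# Minimal sketch: loss-thresholding membership inference (illustrative baseline,
# not the attack studied in the paper). Model and threshold are placeholders.
import torch
from torch import nn

def membership_score(model, x, y, criterion=nn.CrossEntropyLoss(reduction="none")):
    """Lower per-sample loss => more likely the sample was in the training set."""
    model.eval()
    with torch.no_grad():
        losses = criterion(model(x), y)
    return -losses                         # higher score = more likely a member

def predict_membership(model, x, y, threshold=0.0):
    return membership_score(model, x, y) > threshold
```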
7 code implementations • NeurIPS 2019 • Guillaume Lample, Alexandre Sablayrolles, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou
In our experiments we consider a dataset with up to 30 billion words, and we plug our memory layer in a state-of-the-art transformer-based architecture.
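A heavily simplified sketch of the key-value memory idea: a query retrieves its top-k nearest learnable keys and returns a softmax-weighted sum of the associated values. The product-key factorization and multi-head queries that let the paper's layer scale to millions of slots are omitted here.

```python
# Minimal sketch of a key-value memory layer: a query selects its top-k nearest
# learnable keys and returns a softmax-weighted sum of the associated values.
# Omits the product-key factorization and multi-head queries of the paper.
import torch
from torch import nn

class SimpleMemoryLayer(nn.Module):
    def __init__(self, dim, num_slots=1024, k=32):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(num_slots, dim) / dim ** 0.5)
        self.values = nn.Embedding(num_slots, dim)
        self.k = k

    def forward(self, query):                         # query: (batch, dim)
        scores = query @ self.keys.t()                # (batch, num_slots)
        top_scores, top_ids = scores.topk(self.k, dim=-1)
        weights = top_scores.softmax(dim=-1)          # (batch, k)
        selected = self.values(top_ids)               # (batch, k, dim)
        return (weights.unsqueeze(-1) * selected).sum(dim=1)

out = SimpleMemoryLayer(dim=64)(torch.randn(8, 64))   # -> (8, 64)
```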
no code implementations • ICLR 2019 • Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou
Convolutional neural networks memorize part of their training data, which is why strategies such as data augmentation and dropout are employed to mitigate overfitting.
2 code implementations • ICLR 2019 • Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou
Discretizing multi-dimensional data distributions is a fundamental step of modern indexing methods.
7 code implementations • CVPR 2018 • Matthijs Douze, Alexandre Sablayrolles, Hervé Jégou
Similarity search approaches based on graph walks have recently attained outstanding speed-accuracy trade-offs, setting aside the memory requirements.
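For context, here is a minimal FAISS sketch of a graph-walk index (HNSW) with illustrative parameters; the paper's contribution is to combine such graphs with compact regression codes, which this sketch does not implement.

```python
# Minimal sketch: graph-based similarity search with HNSW in FAISS.
# Parameters are illustrative; the paper pairs such graphs with compact codes.
import numpy as np
import faiss

d = 64
xb = np.random.rand(50_000, d).astype("float32")
xq = np.random.rand(3, d).astype("float32")

index = faiss.IndexHNSWFlat(d, 32)         # 32 neighbours per node in the graph
index.hnsw.efSearch = 64                   # breadth of the graph walk at query time
index.add(xb)

distances, ids = index.search(xq, 10)
print(ids)
```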
1 code implementation • 21 Sep 2016 • Alexandre Sablayrolles, Matthijs Douze, Hervé Jégou, Nicolas Usunier
Hashing produces compact representations of documents, so that tasks like classification or retrieval can be performed on these short codes.
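As an illustration of the general idea, and not the method proposed in the paper, a classical random-hyperplane (SimHash-style) baseline maps each document vector to a short binary code whose Hamming distance approximates angular similarity.

```python
# Minimal sketch: random-hyperplane (SimHash-style) binary codes for retrieval.
# Illustrative baseline only, not the method proposed in the paper.
import numpy as np

rng = np.random.default_rng(0)
d, n_bits = 300, 64                          # document-vector dimension, code length
hyperplanes = rng.standard_normal((n_bits, d))

def to_code(x):
    """Map a document vector to an n_bits binary code."""
    return (hyperplanes @ x > 0).astype(np.uint8)

def hamming(a, b):
    return int(np.count_nonzero(a != b))

doc_a, doc_b = rng.standard_normal(d), rng.standard_normal(d)
print("Hamming distance:", hamming(to_code(doc_a), to_code(doc_b)))
```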