1 code implementation • 9 Oct 2024 • Pravesh Agrawal, Szymon Antoniak, Emma Bou Hanna, Baptiste Bout, Devendra Chaplot, Jessica Chudnovsky, Diogo Costa, Baudouin De Monicault, Saurabh Garg, Theophile Gervet, Soham Ghosh, Amélie Héliou, Paul Jacob, Albert Q. Jiang, Kartik Khandelwal, Timothée Lacroix, Guillaume Lample, Diego Las Casas, Thibaut Lavril, Teven Le Scao, Andy Lo, William Marshall, Louis Martin, Arthur Mensch, Pavankumar Muddireddy, Valera Nemychnikova, Marie Pellat, Patrick von Platen, Nikhil Raghuraman, Baptiste Rozière, Alexandre Sablayrolles, Lucile Saulnier, Romain Sauvestre, Wendy Shang, Roman Soletskyi, Lawrence Stewart, Pierre Stock, Joachim Studnia, Sandeep Subramanian, Sagar Vaze, Thomas Wang, Sophia Yang
Unlike many open-source models, Pixtral is also a cutting-edge text model for its size, and does not compromise on natural language performance to excel in multimodal tasks.
5 code implementations • 8 Jan 2024 • Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed
In particular, Mixtral vastly outperforms Llama 2 70B on mathematics, code generation, and multilingual benchmarks.
Ranked #12 on Common Sense Reasoning on ARC (Easy)
6 code implementations • 10 Oct 2023 • Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed
We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency.
Ranked #1 on answerability prediction on PeerQA
3 code implementations • 29 May 2023 • Zechun Liu, Barlas Oguz, Changsheng Zhao, Ernie Chang, Pierre Stock, Yashar Mehdad, Yangyang Shi, Raghuraman Krishnamoorthi, Vikas Chandra
Several post-training quantization methods have been applied to large language models (LLMs) and have been shown to perform well down to 8 bits.
no code implementations • 22 May 2023 • Xinchi Qiu, Ilias Leontiadis, Luca Melis, Alex Sablayrolles, Pierre Stock
In particular, on-device machine learning allows us to avoid sharing raw data with a third-party server during inference.
no code implementations • 26 Mar 2023 • Ashkan Yousefpour, Shen Guo, Ashish Shenoy, Sayan Ghosh, Pierre Stock, Kiwan Maeng, Schalk-Willem Krüger, Michael Rabbat, Carole-Jean Wu, Ilya Mironov
The rapid progress of AI is fueled by increasingly large and computationally intensive machine learning models and datasets.
1 code implementation • 8 Nov 2022 • Chuan Guo, Kamalika Chaudhuri, Pierre Stock, Mike Rabbat
In private federated learning (FL), a server aggregates differentially private updates from a large number of clients in order to train a machine learning model.
1 code implementation • 7 Oct 2022 • Tom Sander, Pierre Stock, Alexandre Sablayrolles
Differentially Private methods for training Deep Neural Networks (DNNs) have progressed recently, in particular with the use of massive batches and aggregated data augmentations for a large number of training steps.
1 code implementation • 6 Oct 2022 • Samuel Maddock, Alexandre Sablayrolles, Pierre Stock
We propose a novel method, CANIFE, that uses canaries (samples carefully crafted by a strong adversary) to evaluate the empirical privacy of a training round.
1 code implementation • 26 Jul 2022 • Karthik Prasad, Sayan Ghosh, Graham Cormode, Ilya Mironov, Ashkan Yousefpour, Pierre Stock
Cross-device Federated Learning is an increasingly popular machine learning setting to train a model by leveraging a large population of client devices with high privacy and security guarantees.
no code implementations • 15 Feb 2022 • Pierre Stock, Igor Shilov, Ilya Mironov, Alexandre Sablayrolles
Reconstruction attacks allow an adversary to regenerate data samples of the training set using access to only a trained model.
no code implementations • 20 Jul 2021 • Pierre Stock, Rémi Gribonval
The overall objective of this paper is to introduce an embedding for ReLU neural networks of any depth, $\Phi(\theta)$, that is invariant to scalings and that provides a locally linear parameterization of the realization of the network.
12 code implementations • ICCV 2021 • Ben Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Hervé Jégou, Matthijs Douze
We design a family of image classification architectures that optimize the trade-off between accuracy and efficiency in a high-speed regime.
Ranked #15 on Image Classification on Stanford Cars
no code implementations • 1 Dec 2020 • Maxime Oquab, Pierre Stock, Oran Gafni, Daniel Haziza, Tao Xu, Peizhao Zhang, Onur Celebi, Yana Hasson, Patrick Labatut, Bobo Bose-Kolanu, Thibault Peyronel, Camille Couprie
To unlock video chat for hundreds of millions of people hindered by poor connectivity or unaffordable data costs, we propose to authentically reconstruct faces on the receiver's device using facial landmarks extracted at the sender's side and transmitted over the network.
4 code implementations • ICLR 2021 • Angela Fan, Pierre Stock, Benjamin Graham, Edouard Grave, Remi Gribonval, Herve Jegou, Armand Joulin
A standard solution is to train networks with Quantization Aware Training, where the weights are quantized during training and the gradients approximated with the Straight-Through Estimator.
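As a rough illustration of the idea described above, the following minimal sketch fits a single quantized weight with the Straight-Through Estimator. It is not the paper's method or code: the uniform quantizer, the one-parameter model `y ~ w * x`, and the names `quantize` and `train_qat` are all assumptions made for this example.

```python
def quantize(w, step=0.25):
    """Round w to the nearest multiple of step (a simple uniform quantizer)."""
    return round(w / step) * step


def train_qat(xs, ys, step=0.25, lr=0.1, epochs=200):
    """Fit y ~ w * x with a quantized weight (hypothetical toy setup).

    Returns the latent full-precision weight and its quantized value.
    """
    w = 0.0  # latent full-precision weight that receives the updates
    for _ in range(epochs):
        # Forward pass: the model only ever sees the quantized weight.
        w_q = quantize(w, step)
        # Gradient of the mean squared error with respect to w_q.
        grad = sum(2 * (w_q * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        # Straight-Through Estimator: the rounding step has zero gradient
        # almost everywhere, so d(w_q)/d(w) is treated as 1 and the
        # gradient is applied directly to the latent weight.
        w -= lr * grad
    return w, quantize(w, step)


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0]
    ys = [1.5 * x for x in xs]  # data generated with a weight on the grid
    latent, quantized = train_qat(xs, ys)
    print(latent, quantized)
```

The latent weight is kept in full precision so that small updates can accumulate until they move the quantized value to a neighbouring grid point; only the quantized value would be deployed.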
3 code implementations • ICLR 2020 • Pierre Stock, Armand Joulin, Rémi Gribonval, Benjamin Graham, Hervé Jégou
In this paper, we address the problem of reducing the memory footprint of convolutional network architectures.
1 code implementation • ICLR 2019 • Pierre Stock, Benjamin Graham, Rémi Gribonval, Hervé Jégou
Modern neural networks are over-parametrized.
no code implementations • ECCV 2018 • Pierre Stock, Moustapha Cisse
ConvNets and Imagenet have driven the recent success of deep learning for image classification.