15 code implementations • arXiv 2023 • Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample
We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters.
Ranked #1 on Question Answering on SIQA
no code implementations • 22 Feb 2023 • Pierre-Alexandre Kamienny, Guillaume Lample, Sylvain Lamprier, Marco Virgolin
Symbolic regression (SR) is the problem of learning a symbolic expression from numerical data.
2 code implementations • 21 Oct 2022 • Albert Q. Jiang, Sean Welleck, Jin Peng Zhou, Wenda Li, Jiacheng Liu, Mateja Jamnik, Timothée Lacroix, Yuhuai Wu, Guillaume Lample
In this work, we introduce Draft, Sketch, and Prove (DSP), a method that maps informal proofs to formal proof sketches, and uses the sketches to guide an automated prover by directing its search to easier sub-problems.
Ranked #1 on Automated Theorem Proving on miniF2F-test (Pass@100 metric)
no code implementations • 23 May 2022 • Guillaume Lample, Marie-Anne Lachaux, Thibaut Lavril, Xavier Martinet, Amaury Hayat, Gabriel Ebner, Aurélien Rodriguez, Timothée Lacroix
With a similar computational budget, we improve the state of the art on the Lean-based miniF2F-curriculum dataset from 31% to 42% proving accuracy.
Ranked #1 on Automated Theorem Proving on Metamath set.mm (Pass@32 metric)
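The Pass@k figures quoted in these leaderboard entries are conventionally estimated with the standard unbiased estimator popularized by code-generation benchmarks (an assumption here; the entries themselves do not spell out the computation): from n sampled proof attempts of which c succeed, it gives the probability that a batch of k attempts contains at least one success.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator from n sampled attempts with c successes.

    Equals the probability that at least one of k attempts, drawn without
    replacement from the n samples, is a success.
    """
    if n - c < k:
        return 1.0  # too few failures: every size-k draw contains a success
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with 4 attempts of which 2 succeed, Pass@1 is 0.5, matching the intuition that a single random attempt succeeds half the time.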
1 code implementation • 22 Apr 2022 • Pierre-Alexandre Kamienny, Stéphane d'Ascoli, Guillaume Lample, François Charton
Symbolic regression, the task of predicting the mathematical expression of a function from observations of its values, is a difficult task that usually involves a two-step procedure: predicting the "skeleton" of the expression up to the choice of numerical constants, then fitting the constants by optimizing a non-convex loss function.
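As a rough illustration of the second step, here is a stdlib-only sketch that fits the constants of a fixed, hypothetical skeleton by random search; real systems use gradient-based or evolutionary optimizers, and the skeleton, data, and search range below are invented for the example.

```python
import math
import random

def fit_constants(skeleton, xs, ys, n_consts, iters=2000, seed=0):
    """Fit the numerical constants of a fixed skeleton by random search
    (a crude stand-in for the non-convex optimization used in practice)."""
    rng = random.Random(seed)
    best_c, best_loss = None, math.inf
    for _ in range(iters):
        c = [rng.uniform(-5.0, 5.0) for _ in range(n_consts)]
        loss = sum((skeleton(x, c) - y) ** 2 for x, y in zip(xs, ys))
        if loss < best_loss:
            best_c, best_loss = c, loss
    return best_c, best_loss

# Hypothetical skeleton from step 1: f(x) = c0 * sin(x) + c1
skeleton = lambda x, c: c[0] * math.sin(x) + c[1]
xs = [i / 10 for i in range(50)]
ys = [2.0 * math.sin(x) + 0.5 for x in xs]  # ground truth: c0 = 2, c1 = 0.5
consts, loss = fit_constants(skeleton, xs, ys, n_consts=2)
```

Because the skeleton fixes the functional form, the search only has to explore a low-dimensional constant space rather than the space of expressions.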
no code implementations • 12 Jan 2022 • Stéphane d'Ascoli, Pierre-Alexandre Kamienny, Guillaume Lample, François Charton
Symbolic regression, i.e., predicting a function from observations of its values, is well known to be a challenging task.
1 code implementation • ICLR 2022 • Baptiste Roziere, Jie M. Zhang, Francois Charton, Mark Harman, Gabriel Synnaeve, Guillaume Lample
With little to no parallel data available for programming languages, unsupervised methods are well-suited to source code translation.
2 code implementations • NeurIPS 2021 • Baptiste Roziere, Marie-Anne Lachaux, Marc Szafraniec, Guillaume Lample
Recent advances in self-supervised learning have dramatically improved the state of the art on a wide variety of tasks.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Marie-Anne Lachaux, Armand Joulin, Guillaume Lample
In this paper, we propose to explicitly model this one-to-many mapping by conditioning the decoder of an NMT model on a latent variable that represents the domain of target sentences.
1 code implementation • ICLR 2021 • François Charton, Amaury Hayat, Guillaume Lample
Using transformers over large generated datasets, we train models to learn mathematical properties of differential systems, such as local stability, behavior at infinity and controllability.
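For intuition about what "local stability" means here: a two-dimensional system x' = f(x) is locally asymptotically stable at an equilibrium when the Jacobian there has trace < 0 and determinant > 0 (both eigenvalues then have negative real part). The sketch below is a generic textbook check, not the paper's transformer-based method, and the `damped` system is a made-up example.

```python
def finite_diff_jacobian(f, x, eps=1e-6):
    """Numerical Jacobian of f: R^n -> R^n at point x, by central differences."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = f(xp), f(xm)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2 * eps)
    return J

def is_locally_stable_2d(f, eq):
    """Trace/determinant criterion for a 2-D system at equilibrium eq."""
    J = finite_diff_jacobian(f, eq)
    trace = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    return trace < 0 and det > 0

# Damped oscillator linearized about the origin: locally stable.
damped = lambda x: [x[1], -x[0] - 0.5 * x[1]]
```

Calling `is_locally_stable_2d(damped, [0.0, 0.0])` returns True; a transformer trained on generated systems learns to predict this label directly from the symbolic form.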
8 code implementations • NeurIPS 2020 • Marie-Anne Lachaux, Baptiste Roziere, Lowik Chanussot, Guillaume Lample
We train our model on source code from open source GitHub projects, and show that it can translate functions between C++, Java, and Python with high accuracy.
7 code implementations • ICLR 2020 • Guillaume Lample, François Charton
Neural networks have a reputation for being better at solving statistical or approximate problems than at performing calculations or working with symbolic data.
1 code implementation • IJCNLP 2019 • Francisco Guzmán, Peng-Jen Chen, Myle Ott, Juan Pino, Guillaume Lample, Philipp Koehn, Vishrav Chaudhary, Marc'Aurelio Ranzato
For machine translation, a vast majority of language pairs in the world are considered low-resource because they have little parallel data available.
7 code implementations • NeurIPS 2019 • Guillaume Lample, Alexandre Sablayrolles, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou
In our experiments we consider a dataset with up to 30 billion words, and we plug our memory layer in a state-of-the-art transformer-based architecture.
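The memory layer in this line of work is built around a product-key lookup. The toy, stdlib-only sketch below (all names, shapes, and data invented) shows the core trick: score the two halves of the query against two small sets of sub-keys, then combine the top candidates from each side, which addresses |A|·|B| memory slots while only scoring |A|+|B| sub-keys.

```python
import math

def product_key_lookup(query, subkeys_a, subkeys_b, values, k=2):
    """Illustrative product-key memory read.

    values[i][j] is the value vector of the slot addressed by
    sub-key i (first half) and sub-key j (second half).
    """
    h = len(query) // 2
    q1, q2 = query[:h], query[h:]

    def scores(q, subkeys):
        return [(sum(a * b for a, b in zip(q, sk)), i)
                for i, sk in enumerate(subkeys)]

    top_a = sorted(scores(q1, subkeys_a), reverse=True)[:k]
    top_b = sorted(scores(q2, subkeys_b), reverse=True)[:k]

    # Candidate slot (i, j) scores s_i + s_j; keep the overall top-k.
    cands = sorted(((sa + sb, ia, ib)
                    for sa, ia in top_a
                    for sb, ib in top_b), reverse=True)[:k]

    # Softmax over the selected slots, then a weighted sum of their values.
    m = max(s for s, _, _ in cands)
    ws = [math.exp(s - m) for s, _, _ in cands]
    tot = sum(ws)
    dim = len(values[0][0])
    out = [0.0] * dim
    for w, (_, ia, ib) in zip(ws, cands):
        for j in range(dim):
            out[j] += (w / tot) * values[ia][ib][j]
    return out
```

With two sub-key sets of size 512 each, this addresses 262,144 slots at the cost of 1,024 dot products, which is what makes billion-parameter memory layers tractable.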
6 code implementations • 2 Jul 2019 • Sainbayar Sukhbaatar, Edouard Grave, Guillaume Lample, Herve Jegou, Armand Joulin
More precisely, we augment the self-attention layers with persistent memory vectors that play a role similar to that of the feed-forward layer.
Ranked #5 on Language Modelling on Text8
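The persistent-vector idea can be sketched in plain Python: learned key/value vectors are simply concatenated to the input-derived keys and values before attention. The block below is a single head with no projections, purely illustrative of the mechanism rather than the paper's full architecture.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(v - m) for v in xs]
    s = sum(es)
    return [e / s for e in es]

def attention_with_persistent_memory(queries, keys, values, mem_k, mem_v):
    """Single-head attention where learned 'persistent' key/value vectors
    (mem_k, mem_v) are concatenated to the input-derived keys and values,
    playing the role of the feed-forward sublayer."""
    all_k = keys + mem_k
    all_v = values + mem_v
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in all_k]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, all_v))
                    for j in range(len(all_v[0]))])
    return out
```

Since the persistent vectors do not depend on the input, they act as a fixed, learned lookup table that every token can attend to, removing the need for a separate feed-forward sublayer.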
no code implementations • ICLR 2019 • Guillaume Lample, Sandeep Subramanian, Eric Smith, Ludovic Denoyer, Marc'Aurelio Ranzato, Y-Lan Boureau
The dominant approach to unsupervised "style transfer" in text is based on the idea of learning a latent representation, which is independent of the attributes specifying its "style".
16 code implementations • NeurIPS 2019 • Guillaume Lample, Alexis Conneau
On unsupervised machine translation, we obtain 34.3 BLEU on WMT'16 German-English, improving the previous state of the art by more than 9 BLEU.
10 code implementations • EMNLP 2018 • Alexis Conneau, Guillaume Lample, Ruty Rinott, Adina Williams, Samuel R. Bowman, Holger Schwenk, Veselin Stoyanov
State-of-the-art natural language processing systems rely on supervision in the form of annotated data to learn competent models.
Ranked #5 on Natural Language Inference on XNLI French
no code implementations • ACL 2018 • Alexis Conneau, German Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni
Although much effort has recently been devoted to training high-quality sentence embeddings, we still have a poor understanding of what they are capturing.
15 code implementations • EMNLP 2018 • Guillaume Lample, Myle Ott, Alexis Conneau, Ludovic Denoyer, Marc'Aurelio Ranzato
Machine translation systems achieve near human-level performance on some languages, yet their effectiveness strongly relies on the availability of large amounts of parallel sentences, which hinders their applicability to the majority of language pairs.
Ranked #2 on Machine Translation on WMT2016 English-Russian
no code implementations • NeurIPS 2017 • Guillaume Lample, Neil Zeghidour, Nicolas Usunier, Antoine Bordes, Ludovic Denoyer, Marc'Aurelio Ranzato
This paper introduces a new encoder-decoder architecture that is trained to reconstruct images by disentangling the salient information of the image and the values of attributes directly in the latent space.
15 code implementations • ICLR 2018 • Guillaume Lample, Alexis Conneau, Ludovic Denoyer, Marc'Aurelio Ranzato
By learning to reconstruct in both languages from this shared feature space, the model effectively learns to translate without using any labeled data.
Ranked #6 on Machine Translation on WMT2016 German-English
18 code implementations • ICLR 2018 • Alexis Conneau, Guillaume Lample, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou
We finally describe experiments on the English-Esperanto low-resource language pair, for which only a limited amount of parallel data exists, to show the potential impact of our method in fully unsupervised machine translation.
Ranked #2 on Word Alignment on en-es
8 code implementations • 18 Sep 2016 • Guillaume Lample, Devendra Singh Chaplot
Advances in deep reinforcement learning have allowed autonomous agents to perform well on Atari games, often outperforming humans, using only raw pixels to make their decisions.
no code implementations • NAACL 2016 • Yulia Tsvetkov, Sunayana Sitaram, Manaal Faruqui, Guillaume Lample, Patrick Littell, David Mortensen, Alan W. Black, Lori Levin, Chris Dyer
We introduce polyglot language models, recurrent neural network models trained to predict symbol sequences in many different languages using shared representations of symbols and conditioning on typological information about the language to be predicted.
43 code implementations • NAACL 2016 • Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, Chris Dyer
State-of-the-art named entity recognition systems rely heavily on hand-crafted features and domain-specific knowledge in order to learn effectively from the small, supervised training corpora that are available.
Ranked #8 on Named Entity Recognition (NER) on CoNLL++
1 code implementation • 5 Feb 2016 • Waleed Ammar, George Mulcaire, Yulia Tsvetkov, Guillaume Lample, Chris Dyer, Noah A. Smith
We introduce new methods for estimating and evaluating embeddings of words in more than fifty languages in a single shared embedding space.