1 code implementation • 28 Jun 2023 • Esin Durmus, Karina Nguyen, Thomas I. Liao, Nicholas Schiefer, Amanda Askell, Anton Bakhtin, Carol Chen, Zac Hatfield-Dodds, Danny Hernandez, Nicholas Joseph, Liane Lovitt, Sam McCandlish, Orowa Sikder, Alex Tamkin, Janel Thamkul, Jared Kaplan, Jack Clark, Deep Ganguli
We first build a dataset, GlobalOpinionQA, comprised of questions and answers from cross-national surveys designed to capture diverse opinions on global issues across different countries.
1 code implementation • Science 2022 • Anton Bakhtin, Noam Brown, Emily Dinan, Gabriele Farina, Colin Flaherty, Daniel Fried, Andrew Goff, Jonathan Gray, Hengyuan Hu, Athul Paul Jacob, Mojtaba Komeili, Karthik Konath, Minae Kwon, Adam Lerer, Mike Lewis, Alexander H. Miller, Sasha Mitts, Adithya Renduchintala, Stephen Roller, Dirk Rowe, Weiyan Shi, Joe Spisak, Alexander Wei, David Wu, Hugh Zhang, Markus Zijlstra
Despite much progress in training AI systems to imitate human language, building agents that use language to communicate intentionally with humans in interactive environments remains a major challenge.
1 code implementation • 11 Oct 2022 • Anton Bakhtin, David J Wu, Adam Lerer, Jonathan Gray, Athul Paul Jacob, Gabriele Farina, Alexander H Miller, Noam Brown
We then show that DiL-piKL can be extended into a self-play reinforcement learning algorithm we call RL-DiL-piKL that provides a model of human play while simultaneously training an agent that responds well to this human model.
no code implementations • 13 Jul 2022 • Hengyuan Hu, Samuel Sokota, David Wu, Anton Bakhtin, Andrei Lupu, Brandon Cui, Jakob N. Foerster
Fully cooperative, partially observable multi-agent problems are ubiquitous in the real world.
no code implementations • 14 Dec 2021 • Athul Paul Jacob, David J. Wu, Gabriele Farina, Adam Lerer, Hengyuan Hu, Anton Bakhtin, Jacob Andreas, Noam Brown
We consider the task of building strong but human-like policies in multi-agent decision-making problems, given examples of human behavior.
1 code implementation • NeurIPS 2021 • Anton Bakhtin, David Wu, Adam Lerer, Noam Brown
Additionally, we extend our methods to full-scale no-press Diplomacy and for the first time train an agent from scratch with no human data.
1 code implementation • 20 Feb 2021 • Eltayeb Ahmed, Anton Bakhtin, Laurens van der Maaten, Rohit Girdhar
A common approach to solving physical reasoning tasks is to train a value learner on example tasks.
Ranked #1 on Visual Reasoning on PHYRE-1B-Within
no code implementations • ICLR 2021 • Jonathan Gray, Adam Lerer, Anton Bakhtin, Noam Brown
Prior AI breakthroughs in complex games have focused on either the purely adversarial or purely cooperative settings.
1 code implementation • NeurIPS 2020 • Noam Brown, Anton Bakhtin, Adam Lerer, Qucheng Gong
This paper presents ReBeL, a general framework for self-play reinforcement learning and search that provably converges to a Nash equilibrium in any two-player zero-sum game.
1 code implementation • ICLR 2020 • Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'Aurelio Ranzato
In this work, we investigate un-normalized energy-based models (EBMs) which operate not at the token but at the sequence level.
no code implementations • 6 Apr 2020 • Anton Bakhtin, Yuntian Deng, Sam Gross, Myle Ott, Marc'Aurelio Ranzato, Arthur Szlam
Current large-scale auto-regressive language models display impressive fluency and can generate convincing text.
1 code implementation • IJCNLP 2019 • Fabio Petroni, Tim Rocktäschel, Patrick Lewis, Anton Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel
Recent progress in pretraining language models on large textual corpora led to a surge of improvements for downstream NLP tasks.
2 code implementations • NeurIPS 2019 • Anton Bakhtin, Laurens van der Maaten, Justin Johnson, Laura Gustafson, Ross Girshick
The benchmark is designed to encourage the development of learning algorithms that are sample-efficient and generalize well across puzzles.
Ranked #3 on Visual Reasoning on PHYRE-1B-Within
no code implementations • 7 Jun 2019 • Anton Bakhtin, Sam Gross, Myle Ott, Yuntian Deng, Marc'Aurelio Ranzato, Arthur Szlam
Energy-based models (EBMs), a.k.a.
no code implementations • 27 Sep 2018 • Anton Bakhtin, Arthur Szlam, Marc'Aurelio Ranzato
In this work, we aim at addressing this problem by introducing a new benchmark evaluation suite, dubbed GenEval.
no code implementations • 20 Apr 2018 • Anton Bakhtin, Arthur Szlam, Marc'Aurelio Ranzato, Edouard Grave
It is often the case that the best performing language model is an ensemble of a neural language model with n-grams.
no code implementations • 26 Oct 2017 • Yanzhang He, Rohit Prabhavalkar, Kanishka Rao, Wei Li, Anton Bakhtin, Ian McGraw
We develop streaming keyword spotting systems using a recurrent neural network transducer (RNN-T) model: an all-neural, end-to-end trained, sequence-to-sequence model which jointly learns acoustic and language model components.
no code implementations • 15 Jul 2016 • Raziel Alvarez, Rohit Prabhavalkar, Anton Bakhtin
In this paper we present a simple and computationally efficient quantization scheme that enables us to reduce the resolution of the parameters of a neural network from 32-bit floating point values to 8-bit integer values.