Search Results for author: Anton Bakhtin

Found 18 papers, 9 papers with code

Towards Measuring the Representation of Subjective Global Opinions in Language Models

1 code implementation • 28 Jun 2023 • Esin Durmus, Karina Nguyen, Thomas I. Liao, Nicholas Schiefer, Amanda Askell, Anton Bakhtin, Carol Chen, Zac Hatfield-Dodds, Danny Hernandez, Nicholas Joseph, Liane Lovitt, Sam McCandlish, Orowa Sikder, Alex Tamkin, Janel Thamkul, Jared Kaplan, Jack Clark, Deep Ganguli

We first build a dataset, GlobalOpinionQA, comprised of questions and answers from cross-national surveys designed to capture diverse opinions on global issues across different countries.


Human-level play in the game of Diplomacy by combining language models with strategic reasoning

1 code implementation • Science 2022 • Anton Bakhtin, Noam Brown, Emily Dinan, Gabriele Farina, Colin Flaherty, Daniel Fried, Andrew Goff, Jonathan Gray, Hengyuan Hu, Athul Paul Jacob, Mojtaba Komeili, Karthik Konath, Minae Kwon, Adam Lerer, Mike Lewis, Alexander H. Miller, Sasha Mitts, Adithya Renduchintala, Stephen Roller, Dirk Rowe, Weiyan Shi, Joe Spisak, Alexander Wei, David Wu, Hugh Zhang, Markus Zijlstra

Despite much progress in training AI systems to imitate human language, building agents that use language to communicate intentionally with humans in interactive environments remains a major challenge.

1,245


Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning

1 code implementation • 11 Oct 2022 • Anton Bakhtin, David J Wu, Adam Lerer, Jonathan Gray, Athul Paul Jacob, Gabriele Farina, Alexander H Miller, Noam Brown

We then show that DiL-piKL can be extended into a self-play reinforcement learning algorithm we call RL-DiL-piKL that provides a model of human play while simultaneously training an agent that responds well to this human model.

Reinforcement Learning (RL)

1,245


Self-Explaining Deviations for Coordination

no code implementations • 13 Jul 2022 • Hengyuan Hu, Samuel Sokota, David Wu, Anton Bakhtin, Andrei Lupu, Brandon Cui, Jakob N. Foerster

Fully cooperative, partially observable multi-agent problems are ubiquitous in the real world.


Modeling Strong and Human-Like Gameplay with KL-Regularized Search

no code implementations • 14 Dec 2021 • Athul Paul Jacob, David J. Wu, Gabriele Farina, Adam Lerer, Hengyuan Hu, Anton Bakhtin, Jacob Andreas, Noam Brown

We consider the task of building strong but human-like policies in multi-agent decision-making problems, given examples of human behavior.

Imitation Learning


No-Press Diplomacy from Scratch

1 code implementation • NeurIPS 2021 • Anton Bakhtin, David Wu, Adam Lerer, Noam Brown

Additionally, we extend our methods to full-scale no-press Diplomacy and for the first time train an agent from scratch with no human data.

Starcraft


Physical Reasoning Using Dynamics-Aware Models

1 code implementation • 20 Feb 2021 • Eltayeb Ahmed, Anton Bakhtin, Laurens van der Maaten, Rohit Girdhar

A common approach to solving physical reasoning tasks is to train a value learner on example tasks.

Ranked #1 on Visual Reasoning on PHYRE-1B-Within

Visual Reasoning


Human-Level Performance in No-Press Diplomacy via Equilibrium Search

no code implementations • ICLR 2021 • Jonathan Gray, Adam Lerer, Anton Bakhtin, Noam Brown

Prior AI breakthroughs in complex games have focused on either the purely adversarial or purely cooperative settings.


Combining Deep Reinforcement Learning and Search for Imperfect-Information Games

1 code implementation • NeurIPS 2020 • Noam Brown, Anton Bakhtin, Adam Lerer, Qucheng Gong

This paper presents ReBeL, a general framework for self-play reinforcement learning and search that provably converges to a Nash equilibrium in any two-player zero-sum game.

Reinforcement Learning (RL)

625


Residual Energy-Based Models for Text Generation

1 code implementation • ICLR 2020 • Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'Aurelio Ranzato

In this work, we investigate un-normalized energy-based models (EBMs) which operate not at the token but at the sequence level.

Language Modelling Machine Translation +2


Residual Energy-Based Models for Text

no code implementations • 6 Apr 2020 • Anton Bakhtin, Yuntian Deng, Sam Gross, Myle Ott, Marc'Aurelio Ranzato, Arthur Szlam

Current large-scale auto-regressive language models display impressive fluency and can generate convincing text.


Language Models as Knowledge Bases?

1 code implementation • IJCNLP 2019 • Fabio Petroni, Tim Rocktäschel, Patrick Lewis, Anton Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel

Recent progress in pretraining language models on large textual corpora led to a surge of improvements for downstream NLP tasks.

Language Modelling Open-Domain Question Answering

1,314


PHYRE: A New Benchmark for Physical Reasoning

2 code implementations • NeurIPS 2019 • Anton Bakhtin, Laurens van der Maaten, Justin Johnson, Laura Gustafson, Ross Girshick

The benchmark is designed to encourage the development of learning algorithms that are sample-efficient and generalize well across puzzles.

Ranked #3 on Visual Reasoning on PHYRE-1B-Within

Visual Reasoning

425


Real or Fake? Learning to Discriminate Machine from Human Generated Text

no code implementations • 7 Jun 2019 • Anton Bakhtin, Sam Gross, Myle Ott, Yuntian Deng, Marc'Aurelio Ranzato, Arthur Szlam

Energy-based models (EBMs), a.k.a. un-normalized models, have had recent successes in continuous spaces.

Language Modelling Text Generation


GenEval: A Benchmark Suite for Evaluating Generative Models

no code implementations • 27 Sep 2018 • Anton Bakhtin, Arthur Szlam, Marc'Aurelio Ranzato

In this work, we aim at addressing this problem by introducing a new benchmark evaluation suite, dubbed GenEval.


Lightweight Adaptive Mixture of Neural and N-gram Language Models

no code implementations • 20 Apr 2018 • Anton Bakhtin, Arthur Szlam, Marc'Aurelio Ranzato, Edouard Grave

It is often the case that the best performing language model is an ensemble of a neural language model with n-grams.

Language Modelling


Streaming Small-Footprint Keyword Spotting using Sequence-to-Sequence Models

no code implementations • 26 Oct 2017 • Yanzhang He, Rohit Prabhavalkar, Kanishka Rao, Wei Li, Anton Bakhtin, Ian McGraw

We develop streaming keyword spotting systems using a recurrent neural network transducer (RNN-T) model: an all-neural, end-to-end trained, sequence-to-sequence model which jointly learns acoustic and language model components.

General Classification Language Modelling +1


On the efficient representation and execution of deep acoustic models

no code implementations • 15 Jul 2016 • Raziel Alvarez, Rohit Prabhavalkar, Anton Bakhtin

In this paper we present a simple and computationally efficient quantization scheme that enables us to reduce the resolution of the parameters of a neural network from 32-bit floating point values to 8-bit integer values.

Quantization speech-recognition +1

