Search Results for author: Anton Bakhtin

Found 18 papers, 9 papers with code

Towards Measuring the Representation of Subjective Global Opinions in Language Models

1 code implementation 28 Jun 2023 Esin Durmus, Karina Nguyen, Thomas I. Liao, Nicholas Schiefer, Amanda Askell, Anton Bakhtin, Carol Chen, Zac Hatfield-Dodds, Danny Hernandez, Nicholas Joseph, Liane Lovitt, Sam McCandlish, Orowa Sikder, Alex Tamkin, Janel Thamkul, Jared Kaplan, Jack Clark, Deep Ganguli

We first build a dataset, GlobalOpinionQA, comprised of questions and answers from cross-national surveys designed to capture diverse opinions on global issues across different countries.

Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning

1 code implementation 11 Oct 2022 Anton Bakhtin, David J Wu, Adam Lerer, Jonathan Gray, Athul Paul Jacob, Gabriele Farina, Alexander H Miller, Noam Brown

We then show that DiL-piKL can be extended into a self-play reinforcement learning algorithm we call RL-DiL-piKL that provides a model of human play while simultaneously training an agent that responds well to this human model.

Reinforcement Learning (RL)

Self-Explaining Deviations for Coordination

no code implementations 13 Jul 2022 Hengyuan Hu, Samuel Sokota, David Wu, Anton Bakhtin, Andrei Lupu, Brandon Cui, Jakob N. Foerster

Fully cooperative, partially observable multi-agent problems are ubiquitous in the real world.

Modeling Strong and Human-Like Gameplay with KL-Regularized Search

no code implementations 14 Dec 2021 Athul Paul Jacob, David J. Wu, Gabriele Farina, Adam Lerer, Hengyuan Hu, Anton Bakhtin, Jacob Andreas, Noam Brown

We consider the task of building strong but human-like policies in multi-agent decision-making problems, given examples of human behavior.

Imitation Learning

No-Press Diplomacy from Scratch

1 code implementation NeurIPS 2021 Anton Bakhtin, David Wu, Adam Lerer, Noam Brown

Additionally, we extend our methods to full-scale no-press Diplomacy and for the first time train an agent from scratch with no human data.


Physical Reasoning Using Dynamics-Aware Models

1 code implementation 20 Feb 2021 Eltayeb Ahmed, Anton Bakhtin, Laurens van der Maaten, Rohit Girdhar

A common approach to solving physical reasoning tasks is to train a value learner on example tasks.

Visual Reasoning

Human-Level Performance in No-Press Diplomacy via Equilibrium Search

no code implementations ICLR 2021 Jonathan Gray, Adam Lerer, Anton Bakhtin, Noam Brown

Prior AI breakthroughs in complex games have focused on either the purely adversarial or purely cooperative settings.

Combining Deep Reinforcement Learning and Search for Imperfect-Information Games

1 code implementation NeurIPS 2020 Noam Brown, Anton Bakhtin, Adam Lerer, Qucheng Gong

This paper presents ReBeL, a general framework for self-play reinforcement learning and search that provably converges to a Nash equilibrium in any two-player zero-sum game.

Reinforcement Learning (RL)

Residual Energy-Based Models for Text Generation

1 code implementation ICLR 2020 Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'Aurelio Ranzato

In this work, we investigate un-normalized energy-based models (EBMs) which operate not at the token but at the sequence level.

Language Modelling, Machine Translation (+2 more)

Residual Energy-Based Models for Text

no code implementations 6 Apr 2020 Anton Bakhtin, Yuntian Deng, Sam Gross, Myle Ott, Marc'Aurelio Ranzato, Arthur Szlam

Current large-scale auto-regressive language models display impressive fluency and can generate convincing text.

PHYRE: A New Benchmark for Physical Reasoning

2 code implementations NeurIPS 2019 Anton Bakhtin, Laurens van der Maaten, Justin Johnson, Laura Gustafson, Ross Girshick

The benchmark is designed to encourage the development of learning algorithms that are sample-efficient and generalize well across puzzles.

Visual Reasoning

GenEval: A Benchmark Suite for Evaluating Generative Models

no code implementations 27 Sep 2018 Anton Bakhtin, Arthur Szlam, Marc'Aurelio Ranzato

In this work, we aim at addressing this problem by introducing a new benchmark evaluation suite, dubbed GenEval.

Lightweight Adaptive Mixture of Neural and N-gram Language Models

no code implementations 20 Apr 2018 Anton Bakhtin, Arthur Szlam, Marc'Aurelio Ranzato, Edouard Grave

It is often the case that the best performing language model is an ensemble of a neural language model with n-grams.

Language Modelling

Streaming Small-Footprint Keyword Spotting using Sequence-to-Sequence Models

no code implementations 26 Oct 2017 Yanzhang He, Rohit Prabhavalkar, Kanishka Rao, Wei Li, Anton Bakhtin, Ian McGraw

We develop streaming keyword spotting systems using a recurrent neural network transducer (RNN-T) model: an all-neural, end-to-end trained, sequence-to-sequence model which jointly learns acoustic and language model components.

General Classification, Language Modelling (+1 more)

On the efficient representation and execution of deep acoustic models

no code implementations 15 Jul 2016 Raziel Alvarez, Rohit Prabhavalkar, Anton Bakhtin

In this paper we present a simple and computationally efficient quantization scheme that enables us to reduce the resolution of the parameters of a neural network from 32-bit floating point values to 8-bit integer values.

Quantization, Speech Recognition (+1 more)
