Search Results for author: Sainbayar Sukhbaatar

Found 20 papers, 12 papers with code

Staircase Attention for Recurrent Processing of Sequences

no code implementations • 8 Jun 2021 Da Ju, Stephen Roller, Sainbayar Sukhbaatar, Jason Weston

Attention mechanisms have become a standard tool for sequence modeling tasks, in particular by stacking self-attention layers over the entire input sequence as in the Transformer architecture.

Language Modelling
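
For readers skimming this entry: the snippet above describes the standard Transformer baseline that Staircase Attention is contrasted against, where each layer self-attends over the whole input. A minimal PyTorch sketch of that stacked setup, with all sizes chosen purely for illustration:

```python
import torch
import torch.nn as nn

# Plain stacked self-attention over the entire input, as in the Transformer
# baseline the abstract refers to; every size here is an illustrative choice.
d_model, n_heads, n_layers, seq_len = 64, 4, 2, 16
layers = nn.ModuleList(
    [nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
     for _ in range(n_layers)]
)

x = torch.randn(1, seq_len, d_model)  # (batch, sequence, features)
for layer in layers:
    x = layer(x)  # each layer attends over the full sequence at once
print(x.shape)  # torch.Size([1, 16, 64])
```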

Hash Layers For Large Sparse Models

no code implementations • 8 Jun 2021 Stephen Roller, Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston

We investigate the training of sparse layers that use different parameters for different inputs based on hashing in large Transformer models.

Language Modelling
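
As a rough illustration of the snippet above, here is a minimal PyTorch sketch of hash-based sparse routing: each token is dispatched to one of several feed-forward experts selected by a fixed hash of its token id. The expert count, the sizes, and the simple modulus standing in for the hash function are all assumptions.

```python
import torch
import torch.nn as nn

vocab, d_model, n_experts = 1000, 64, 4
embed = nn.Embedding(vocab, d_model)
experts = nn.ModuleList(
    [nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                   nn.Linear(4 * d_model, d_model))
     for _ in range(n_experts)]
)

def hash_layer(token_ids, hidden):
    # A fixed, non-learned hash of the token id picks the expert;
    # a plain modulus stands in for the hash function here.
    out = torch.empty_like(hidden)
    expert_idx = token_ids % n_experts
    for k in range(n_experts):
        mask = expert_idx == k
        out[mask] = experts[k](hidden[mask])  # only this expert's parameters run
    return out

token_ids = torch.randint(0, vocab, (8,))
print(hash_layer(token_ids, embed(token_ids)).shape)  # torch.Size([8, 64])
```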

Not All Memories are Created Equal: Learning to Forget by Expiring

1 code implementation • 13 May 2021 Sainbayar Sukhbaatar, Da Ju, Spencer Poff, Stephen Roller, Arthur Szlam, Jason Weston, Angela Fan

We demonstrate that Expire-Span can help models identify and retain critical information and show it can achieve strong performance on reinforcement learning tasks specifically designed to challenge this functionality.

Language Modelling
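
A minimal sketch of the retain-or-forget mechanism the snippet alludes to: each memory slot predicts how long it should live, and slots older than their predicted span drop out of attention. The names and sizes are illustrative, and the hard boolean mask below stands in for the soft, differentiable mask the method trains with.

```python
import torch
import torch.nn as nn

d_model, max_span = 64, 32
span_predictor = nn.Linear(d_model, 1)  # one predicted lifetime per memory

def expire_mask(memory, current_t):
    # memory: (T, d_model) hidden states for timesteps 0..T-1
    T = memory.size(0)
    spans = torch.sigmoid(span_predictor(memory)).squeeze(-1) * max_span
    ages = current_t - torch.arange(T, dtype=torch.float)
    return ages <= spans  # True where the memory is still worth attending to

mem = torch.randn(10, d_model)
print(expire_mask(mem, current_t=12))  # per-slot keep/drop decisions
```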

Learning to Visually Navigate in Photorealistic Environments Without any Supervision

no code implementations • 10 Apr 2020 Lina Mezghani, Sainbayar Sukhbaatar, Arthur Szlam, Armand Joulin, Piotr Bojanowski

Learning to navigate in a realistic setting where an agent must rely solely on visual inputs is a challenging task, in part because the lack of position information makes it difficult to provide supervision during training.

Augmenting Self-attention with Persistent Memory

5 code implementations • 2 Jul 2019 Sainbayar Sukhbaatar, Edouard Grave, Guillaume Lample, Hervé Jégou, Armand Joulin

More precisely, we augment the self-attention layers with persistent memory vectors that play a similar role as the feed-forward layer.

Language Modelling
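
The snippet above suggests a short sketch: learned persistent key and value vectors are concatenated to the sequence's own keys and values, so attention can read them much like a feed-forward layer's weights. The single-head simplification and the sizes below are assumptions for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

d_model, n_persist, seq_len = 64, 8, 16
q_proj, k_proj, v_proj = (nn.Linear(d_model, d_model) for _ in range(3))
persist_k = nn.Parameter(torch.randn(n_persist, d_model))  # learned memory keys
persist_v = nn.Parameter(torch.randn(n_persist, d_model))  # learned memory values

x = torch.randn(seq_len, d_model)
q = q_proj(x)
k = torch.cat([persist_k, k_proj(x)], dim=0)  # (n_persist + seq_len, d_model)
v = torch.cat([persist_v, v_proj(x)], dim=0)

attn = F.softmax(q @ k.t() / d_model ** 0.5, dim=-1)
out = attn @ v  # each position mixes context tokens and persistent vectors
print(out.shape)  # torch.Size([16, 64])
```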

Training Hybrid Language Models by Marginalizing over Segmentations

no code implementations • ACL 2019 Edouard Grave, Sainbayar Sukhbaatar, Piotr Bojanowski, Armand Joulin

In this paper, we study the problem of hybrid language modeling, that is, using models which can predict both characters and larger units such as character n-grams or words.

Language Modelling
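
To make the marginalization concrete: the probability of a string is the sum, over every way of segmenting it into units (single characters up to length-`max_unit` n-grams), of the product of per-unit probabilities, and a forward dynamic program computes this sum exactly. The `unit_log_prob` callback below is a hypothetical stand-in for the model's conditional unit distribution.

```python
import math

def log_add(a, b):
    # Numerically stable log(exp(a) + exp(b)).
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def sequence_log_prob(string, unit_log_prob, max_unit=3):
    # alpha[i] = log-probability of string[:i], summed over all ways of
    # segmenting that prefix into units of length 1..max_unit.
    alpha = [0.0] + [-math.inf] * len(string)
    for i in range(1, len(string) + 1):
        for j in range(max(0, i - max_unit), i):
            unit = string[j:i]  # candidate last unit ending at position i
            alpha[i] = log_add(alpha[i], alpha[j] + unit_log_prob(string[:j], unit))
    return alpha[len(string)]

# Toy usage with a uniform stand-in distribution over units:
print(sequence_log_prob("hello", lambda context, unit: math.log(0.1)))
```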

Planning with Arithmetic and Geometric Attributes

no code implementations • 6 Sep 2018 David Folqué, Sainbayar Sukhbaatar, Arthur Szlam, Joan Bruna

A desirable property of an intelligent agent is its ability to understand its environment so that it can quickly generalize to novel tasks and compose simpler tasks into more complex ones.

Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play

4 code implementations • ICLR 2018 Sainbayar Sukhbaatar, Zeming Lin, Ilya Kostrikov, Gabriel Synnaeve, Arthur Szlam, Rob Fergus

When Bob is deployed on an RL task within the environment, this unsupervised training reduces the number of supervised episodes needed to learn, and in some cases converges to a higher reward.
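
For context on "Bob": in this self-play setup one agent (Alice) proposes tasks by acting in the environment, and the other (Bob) must repeat or undo them, which is the unsupervised training phase the snippet refers to. A hypothetical sketch of the reward bookkeeping; the linear-in-time reward form below is an assumption for illustration, not quoted from the paper.

```python
def self_play_rewards(t_alice: float, t_bob: float, scale: float = 0.01):
    # Bob is paid to finish the proposed task quickly; Alice is paid when
    # Bob needs longer than she did, steering her toward tasks that sit at
    # the frontier of what Bob can currently do (an automatic curriculum).
    # The linear-in-time form is an illustrative assumption.
    reward_bob = -scale * t_bob
    reward_alice = scale * max(0.0, t_bob - t_alice)
    return reward_alice, reward_bob

# Example: Alice sets up a task in 5 steps, Bob takes 12 steps to undo it.
alice_reward, bob_reward = self_play_rewards(t_alice=5, t_bob=12)
```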

MazeBase: A Sandbox for Learning from Games

2 code implementations • 23 Nov 2015 Sainbayar Sukhbaatar, Arthur Szlam, Gabriel Synnaeve, Soumith Chintala, Rob Fergus

This paper introduces MazeBase: an environment for simple 2D games, designed as a sandbox for machine learning approaches to reasoning and planning.

Starcraft

Training Convolutional Networks with Noisy Labels

no code implementations • 9 Jun 2014 Sainbayar Sukhbaatar, Joan Bruna, Manohar Paluri, Lubomir Bourdev, Rob Fergus

The availability of large labeled datasets has allowed Convolutional Network models to achieve impressive recognition results.

General Classification
