Search Results for author: Florian Strub

Found 35 papers, 21 papers with code

Learning Natural Language Generation with Truncated Reinforcement Learning

1 code implementation • NAACL 2022 • Alice Martin, Guillaume Quispe, Charles Ollion, Sylvain Le Corff, Florian Strub, Olivier Pietquin

To our knowledge, it is the first approach that successfully learns a language generation policy without pre-training, using only reinforcement learning.

Language Modelling • Question Generation • +4

Language Evolution with Deep Learning

no code implementations • 18 Mar 2024 • Mathieu Rita, Paul Michel, Rahma Chaabouni, Olivier Pietquin, Emmanuel Dupoux, Florian Strub

Computational modeling plays an essential role in the study of language emergence.

Language Model Alignment with Elastic Reset

1 code implementation • NeurIPS 2023 • Michael Noukhovitch, Samuel Lavoie, Florian Strub, Aaron Courville

We periodically reset the online model to an exponentially moving average (EMA) of itself, then reset the EMA model to the initial model.

Chatbot • Language Modelling • +1
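The reset schedule quoted above is simple enough to illustrate directly. Below is a minimal sketch, assuming plain NumPy arrays stand in for model parameters; the decay rate and reset period are illustrative choices, not the paper's settings.

```python
import copy
import numpy as np

def ema_update(ema_params, online_params, decay=0.99):
    """Move the EMA copy of each parameter toward the online parameters."""
    for k in ema_params:
        ema_params[k] = decay * ema_params[k] + (1.0 - decay) * online_params[k]

def elastic_reset(online_params, ema_params, init_params):
    """Periodic reset as described above: online model <- EMA model,
    then EMA model <- initial model."""
    for k in online_params:
        online_params[k] = ema_params[k].copy()
        ema_params[k] = init_params[k].copy()

# Toy usage with a single random "parameter" tensor (illustrative only).
init = {"w": np.random.randn(4, 4)}
online = copy.deepcopy(init)
ema = copy.deepcopy(init)

for step in range(1, 1001):
    online["w"] += 0.01 * np.random.randn(4, 4)  # stand-in for an RL fine-tuning update
    ema_update(ema, online)
    if step % 500 == 0:                          # reset period is an illustrative choice
        elastic_reset(online, ema, init)
```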

The Edge of Orthogonality: A Simple View of What Makes BYOL Tick

no code implementations • 9 Feb 2023 • Pierre H. Richemond, Allison Tam, Yunhao Tang, Florian Strub, Bilal Piot, Felix Hill

With simple linear algebra, we show that, when using a linear predictor, the optimal predictor is close to an orthogonal projection, and we propose a general framework based on orthonormalization that helps interpret and give intuition on why BYOL works.

SemPPL: Predicting pseudo-labels for better contrastive representations

2 code implementations • 12 Jan 2023 • Matko Bošnjak, Pierre H. Richemond, Nenad Tomasev, Florian Strub, Jacob C. Walker, Felix Hill, Lars Holger Buesing, Razvan Pascanu, Charles Blundell, Jovana Mitrovic

We propose a new semi-supervised learning method, Semantic Positives via Pseudo-Labels (SemPPL), that combines labelled and unlabelled data to learn informative representations.

Contrastive Learning • Pseudo Label
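As a rough illustration of the pseudo-labelling idea above (not the paper's pipeline), one can assign a pseudo-label to each unlabelled embedding by a k-nearest-neighbour vote over the labelled embeddings; examples sharing a (pseudo-)label can then serve as additional positives for contrastive learning. All names, shapes, and data below are made up for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical embeddings: 100 labelled points (10 classes) and 500 unlabelled points.
labelled = rng.normal(size=(100, 16))
labels = rng.integers(0, 10, size=100)
unlabelled = rng.normal(size=(500, 16))

def knn_pseudo_labels(queries, keys, key_labels, k=5):
    """Assign each query the majority label of its k nearest labelled neighbours."""
    dists = np.linalg.norm(queries[:, None, :] - keys[None, :, :], axis=-1)
    nearest = np.argsort(dists, axis=1)[:, :k]   # indices of the k closest labelled points
    votes = key_labels[nearest]                  # their labels, shape (n_queries, k)
    return np.array([np.bincount(v).argmax() for v in votes])

pseudo = knn_pseudo_labels(unlabelled, labelled, labels)
print(pseudo[:10])  # pseudo-labels that could define extra positives for contrastive learning
```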

Over-communicate no more: Situated RL agents learn concise communication protocols

no code implementations • 2 Nov 2022 • Aleksandra Kalinowska, Elnaz Davoodi, Florian Strub, Kory W Mathewson, Ivana Kajic, Michael Bowling, Todd D Murphey, Patrick M Pilarski

While it is known that communication facilitates cooperation in multi-agent settings, it is unclear how to design artificial agents that can learn to effectively and efficiently communicate with each other.

Reinforcement Learning (RL)

Emergent Communication: Generalization and Overfitting in Lewis Games

1 code implementation • 30 Sep 2022 • Mathieu Rita, Corentin Tallec, Paul Michel, Jean-Bastien Grill, Olivier Pietquin, Emmanuel Dupoux, Florian Strub

Lewis signaling games are a class of simple communication games for simulating the emergence of language.
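For readers unfamiliar with the setting, a Lewis signaling game can be simulated in a few lines: a speaker observes a target, emits a symbol, and a listener must recover the target from the symbol. The toy below uses fixed random tabular policies purely to illustrate the game's structure; it does not implement the paper's training procedure, and the object and vocabulary sizes are arbitrary.

```python
import random

OBJECTS = list(range(5))   # hypothetical object identities
VOCAB = list(range(5))     # hypothetical message symbols

# Fixed random tabular "policies": speaker maps object -> message, listener maps message -> guess.
speaker = {obj: random.choice(VOCAB) for obj in OBJECTS}
listener = {msg: random.choice(OBJECTS) for msg in VOCAB}

def play_round():
    target = random.choice(OBJECTS)      # the speaker observes a target object
    message = speaker[target]            # and sends a one-symbol message
    guess = listener[message]            # the listener guesses the object from the message
    return 1 if guess == target else 0   # reward 1 on successful communication

success_rate = sum(play_round() for _ in range(1000)) / 1000
print(f"communication success rate: {success_rate:.2f}")
```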

Developing, Evaluating and Scaling Learning Agents in Multi-Agent Environments

no code implementations • 22 Sep 2022 • Ian Gemp, Thomas Anthony, Yoram Bachrach, Avishkar Bhoopchand, Kalesha Bullard, Jerome Connor, Vibhavari Dasagi, Bart De Vylder, Edgar Duenez-Guzman, Romuald Elie, Richard Everett, Daniel Hennes, Edward Hughes, Mina Khan, Marc Lanctot, Kate Larson, Guy Lever, SiQi Liu, Luke Marris, Kevin R. McKee, Paul Muller, Julien Perolat, Florian Strub, Andrea Tacchetti, Eugene Tarassov, Zhe Wang, Karl Tuyls

The Game Theory & Multi-Agent team at DeepMind studies several aspects of multi-agent learning, ranging from computing approximations to fundamental concepts in game theory, to simulating social dilemmas in rich spatial environments, to training 3D humanoids in difficult team coordination tasks.

reinforcement-learning • Reinforcement Learning (RL)

Learning Natural Language Generation from Scratch

no code implementations • 20 Sep 2021 • Alice Martin Donati, Guillaume Quispe, Charles Ollion, Sylvain Le Corff, Florian Strub, Olivier Pietquin

This paper introduces TRUncated ReinForcement Learning for Language (TrufLL), an original approach to train conditional language models from scratch using only reinforcement learning (RL).

Language Modelling • reinforcement-learning • +2

Don't Do What Doesn't Matter: Intrinsic Motivation with Action Usefulness

1 code implementation • 20 May 2021 • Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin

Sparse rewards are double-edged training signals in reinforcement learning: easy to design but hard to optimize.

A Machine of Few Words -- Interactive Speaker Recognition with Reinforcement Learning

no code implementations • 7 Aug 2020 • Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin

To do so, we cast the speaker recognition task into a sequential decision-making problem that we solve with Reinforcement Learning.

Decision Making • reinforcement-learning • +3

The Monte Carlo Transformer: a stochastic self-attention model for sequence prediction

no code implementations • 15 Jul 2020 • Alice Martin, Charles Ollion, Florian Strub, Sylvain Le Corff, Olivier Pietquin

This paper introduces the Sequential Monte Carlo Transformer, an original approach that naturally captures the observations distribution in a transformer architecture.

Countering Language Drift with Seeded Iterated Learning

no code implementations • ICML 2020 • Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron Courville

At each time step, the teacher is created by copying the student agent, before being finetuned to maximize task completion.

Translation
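The copy-then-finetune step quoted above is the core of the iterated-learning loop. The toy skeleton below is a sketch under the assumption that each generation alternates an interactive finetuning phase (teacher) with an imitation phase (student); the tabular task and all function names are placeholders, not the paper's code.

```python
import copy
import random

# Toy "agents" are tables mapping inputs to outputs; "task completion" means matching TARGET.
TARGET = {i: i % 3 for i in range(10)}

def finetune(agent, steps):
    """Stand-in for interactive finetuning: nudge random entries toward the task objective."""
    for _ in range(steps):
        key = random.choice(list(agent))
        agent[key] = TARGET[key]

def imitate(teacher, steps):
    """Stand-in for the imitation phase: a freshly seeded student copies part of the teacher."""
    student = {key: random.randint(0, 2) for key in TARGET}
    for _ in range(steps):
        key = random.choice(list(teacher))
        student[key] = teacher[key]
    return student

student = {key: random.randint(0, 2) for key in TARGET}
for generation in range(5):
    teacher = copy.deepcopy(student)     # the teacher is created by copying the student
    finetune(teacher, steps=5)           # then finetuned to maximize task completion
    student = imitate(teacher, steps=8)  # the next student learns from the teacher
```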

HIGhER: Improving instruction following with Hindsight Generation for Experience Replay

no code implementations • 21 Oct 2019 • Geoffrey Cideron, Mathieu Seurin, Florian Strub, Olivier Pietquin

Language creates a compact representation of the world and allows the description of unlimited situations and objectives through compositionality.

Instruction Following • Language Acquisition

Self-Educated Language Agent with Hindsight Experience Replay for Instruction Following

no code implementations • 25 Sep 2019 • Geoffrey Cideron, Mathieu Seurin, Florian Strub, Olivier Pietquin

Language creates a compact representation of the world and allows the description of unlimited situations and objectives through compositionality.

Instruction Following • Language Acquisition

Correction of Electron Back-scattered Diffraction datasets using an evolutionary algorithm

1 code implementation • 7 Mar 2019 • Florian Strub, Marie-Agathe Charpagne, Tresa M. Pollock

The quality of the reconstruction of the maps is critical for studying the spatial distribution of phases and the crystallographic orientation relationships between phases, a key interest in materials science.

Accurate reconstruction of EBSD datasets by a multimodal data approach using an evolutionary algorithm

1 code implementation • 7 Mar 2019 • Marie-Agathe Charpagne, Florian Strub, Tresa M. Pollock

This function is then applied to undistort the EBSD data, and the phase information is inferred using the data of the segmented speckle.

Deep Reinforcement Learning and the Deadly Triad

no code implementations • 6 Dec 2018 • Hado van Hasselt, Yotam Doron, Florian Strub, Matteo Hessel, Nicolas Sonnerat, Joseph Modayil

In this work, we investigate the impact of the deadly triad in practice, in the context of a family of popular deep reinforcement learning models (deep Q-networks trained with experience replay), analysing how the components of this system play a role in the emergence of the deadly triad and in the agent's performance.

Learning Theory • reinforcement-learning • +1

Visual Reasoning with Multi-hop Feature Modulation

1 code implementation • ECCV 2018 • Florian Strub, Mathieu Seurin, Ethan Perez, Harm de Vries, Jérémie Mary, Philippe Preux, Aaron Courville, Olivier Pietquin

Recent breakthroughs in computer vision and natural language processing have spurred interest in challenging multi-modal tasks such as visual question-answering and visual dialogue.

Question Answering • Visual Dialog • +2

HoME: a Household Multimodal Environment

no code implementations • 29 Nov 2017 • Simon Brodeur, Ethan Perez, Ankesh Anand, Florian Golemo, Luca Celotti, Florian Strub, Jean Rouat, Hugo Larochelle, Aaron Courville

We introduce HoME: a Household Multimodal Environment for artificial agents to learn from vision, audio, semantics, physics, and interaction with objects and other agents, all within a realistic context.

OpenAI Gym • reinforcement-learning • +1

Learning Visual Reasoning Without Strong Priors

2 code implementations • 10 Jul 2017 • Ethan Perez, Harm de Vries, Florian Strub, Vincent Dumoulin, Aaron Courville

Previous work has operated under the assumption that visual reasoning calls for a specialized architecture, but we show that a general architecture with proper conditioning can learn to visually reason effectively.

Visual Reasoning
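The "proper conditioning" referred to here is feature-wise modulation of the visual pathway by the language input: per-channel scales and shifts predicted from, e.g., a question embedding. The minimal NumPy sketch below illustrates that mechanism; the shapes and the linear generator are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def film(features, gamma, beta):
    """Feature-wise modulation: scale and shift every channel of the visual features
    with coefficients predicted from the conditioning (e.g. language) input."""
    return gamma[:, :, None, None] * features + beta[:, :, None, None]

batch, channels, height, width, cond_dim = 2, 8, 4, 4, 16
features = rng.normal(size=(batch, channels, height, width))  # visual feature maps
condition = rng.normal(size=(batch, cond_dim))                # e.g. a question embedding

# A linear "modulation generator" mapping the conditioning vector to per-channel (gamma, beta).
W = rng.normal(scale=0.1, size=(cond_dim, 2 * channels))
gamma, beta = np.split(condition @ W, 2, axis=1)

modulated = film(features, 1.0 + gamma, beta)  # 1 + gamma keeps the modulation near identity
print(modulated.shape)                         # (2, 8, 4, 4)
```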

End-to-end optimization of goal-driven and visually grounded dialogue systems

2 code implementations • 15 Mar 2017 • Florian Strub, Harm de Vries, Jeremie Mary, Bilal Piot, Aaron Courville, Olivier Pietquin

End-to-end design of dialogue systems has recently become a popular research topic thanks to powerful tools such as encoder-decoder architectures for sequence-to-sequence learning.

Dialogue Management • Management • +1

GuessWhat?! Visual object discovery through multi-modal dialogue

4 code implementations • CVPR 2017 • Harm de Vries, Florian Strub, Sarath Chandar, Olivier Pietquin, Hugo Larochelle, Aaron Courville

Our key contribution is the collection of a large-scale dataset consisting of 150K human-played games with a total of 800K visual question-answer pairs on 66K images.

Object • Object Discovery

Hybrid Recommender System based on Autoencoders

4 code implementations • 24 Jun 2016 • Florian Strub, Romaric Gaudel, Jérémie Mary

A standard model for Recommender Systems is the Matrix Completion setting: given a partially known matrix of ratings given by users (rows) to items (columns), infer the unknown ratings.

Collaborative Filtering • Matrix Completion • +1
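The matrix-completion setting described above can be made concrete with a tiny example: train an autoencoder on user rating rows with a masked loss, so that only observed ratings contribute to the reconstruction error. The sketch below is a NumPy toy under that assumption, not the paper's model or hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ratings matrix: rows = users, columns = items; only half the entries are "observed".
R = rng.integers(1, 6, size=(20, 10)).astype(float)
mask = rng.random(R.shape) < 0.5
R_obs = R * mask

# One-hidden-layer autoencoder on user rows, trained by gradient descent on a masked MSE.
n_items, n_hidden, lr = R.shape[1], 5, 0.01
W1 = rng.normal(scale=0.1, size=(n_items, n_hidden))
W2 = rng.normal(scale=0.1, size=(n_hidden, n_items))

for _ in range(2000):
    H = np.tanh(R_obs @ W1)            # encode observed user rows
    R_hat = H @ W2                     # decode to predicted ratings for all items
    err = (R_hat - R) * mask           # masked error: unknown ratings are ignored
    gW2 = H.T @ err / len(R)
    gW1 = R_obs.T @ ((err @ W2.T) * (1 - H**2)) / len(R)
    W1 -= lr * gW1
    W2 -= lr * gW2

print("masked train RMSE:", np.sqrt((err**2).sum() / mask.sum()))
```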
