Search Results for author: Florian Strub

Found 25 papers, 15 papers with code

Learning Natural Language Generation from Scratch

no code implementations 20 Sep 2021 Alice Martin Donati, Guillaume Quispe, Charles Ollion, Sylvain Le Corff, Florian Strub, Olivier Pietquin

This paper introduces TRUncated ReinForcement Learning for Language (TrufLL), an original approach to train conditional language models from scratch by only using reinforcement learning (RL).

Language Modelling, Text Generation

Don't Do What Doesn't Matter: Intrinsic Motivation with Action Usefulness

1 code implementation 20 May 2021 Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin

Sparse rewards are double-edged training signals in reinforcement learning: easy to design but hard to optimize.

A Machine of Few Words -- Interactive Speaker Recognition with Reinforcement Learning

no code implementations 7 Aug 2020 Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin

To do so, we cast the speaker recognition task into a sequential decision-making problem that we solve with Reinforcement Learning.

Decision Making, Speaker Recognition +1

The Monte Carlo Transformer: a stochastic self-attention model for sequence prediction

no code implementations 15 Jul 2020 Alice Martin, Charles Ollion, Florian Strub, Sylvain Le Corff, Olivier Pietquin

This paper introduces the Sequential Monte Carlo Transformer, an original approach that naturally captures the observations distribution in a transformer architecture.

Countering Language Drift with Seeded Iterated Learning

no code implementations ICML 2020 Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron Courville

At each time step, the teacher is created by copying the student agent, before being finetuned to maximize task completion.

Translation

HIGhER : Improving instruction following with Hindsight Generation for Experience Replay

no code implementations 21 Oct 2019 Geoffrey Cideron, Mathieu Seurin, Florian Strub, Olivier Pietquin

Language creates a compact representation of the world and allows the description of unlimited situations and objectives through compositionality.

Language Acquisition

Self-Educated Language Agent with Hindsight Experience Replay for Instruction Following

no code implementations 25 Sep 2019 Geoffrey Cideron, Mathieu Seurin, Florian Strub, Olivier Pietquin

Language creates a compact representation of the world and allows the description of unlimited situations and objectives through compositionality.

Language Acquisition

Correction of Electron Back-scattered Diffraction datasets using an evolutionary algorithm

1 code implementation 7 Mar 2019 Florian Strub, Marie-Agathe Charpagne, Tresa M. Pollock

The quality of the reconstruction of the maps is critical to study the spatial distribution of phases and crystallographic orientation relationships between phases, a key interest in materials science.

Accurate reconstruction of EBSD datasets by a multimodal data approach using an evolutionary algorithm

1 code implementation 7 Mar 2019 Marie-Agathe Charpagne, Florian Strub, Tresa M. Pollock

This function is then applied to un-distort the EBSD data, and the phase information is inferred using the data of the segmented speckle.

Deep Reinforcement Learning and the Deadly Triad

no code implementations 6 Dec 2018 Hado van Hasselt, Yotam Doron, Florian Strub, Matteo Hessel, Nicolas Sonnerat, Joseph Modayil

In this work, we investigate the impact of the deadly triad in practice, in the context of a family of popular deep reinforcement learning models - deep Q-networks trained with experience replay - analysing how the components of this system play a role in the emergence of the deadly triad, and in the agent's performance.

Learning Theory

Visual Reasoning with Multi-hop Feature Modulation

1 code implementation ECCV 2018 Florian Strub, Mathieu Seurin, Ethan Perez, Harm de Vries, Jérémie Mary, Philippe Preux, Aaron Courville, Olivier Pietquin

Recent breakthroughs in computer vision and natural language processing have spurred interest in challenging multi-modal tasks such as visual question-answering and visual dialogue.

Question Answering, Visual Dialog +2

HoME: a Household Multimodal Environment

no code implementations 29 Nov 2017 Simon Brodeur, Ethan Perez, Ankesh Anand, Florian Golemo, Luca Celotti, Florian Strub, Jean Rouat, Hugo Larochelle, Aaron Courville

We introduce HoME: a Household Multimodal Environment for artificial agents to learn from vision, audio, semantics, physics, and interaction with objects and other agents, all within a realistic context.

OpenAI Gym

Learning Visual Reasoning Without Strong Priors

2 code implementations 10 Jul 2017 Ethan Perez, Harm de Vries, Florian Strub, Vincent Dumoulin, Aaron Courville

Previous work has operated under the assumption that visual reasoning calls for a specialized architecture, but we show that a general architecture with proper conditioning can learn to visually reason effectively.

Visual Reasoning

End-to-end optimization of goal-driven and visually grounded dialogue systems

2 code implementations 15 Mar 2017 Florian Strub, Harm de Vries, Jeremie Mary, Bilal Piot, Aaron Courville, Olivier Pietquin

End-to-end design of dialogue systems has recently become a popular research topic thanks to powerful tools such as encoder-decoder architectures for sequence-to-sequence learning.

Dialogue Management, Visual Question Answering

GuessWhat?! Visual object discovery through multi-modal dialogue

3 code implementations CVPR 2017 Harm de Vries, Florian Strub, Sarath Chandar, Olivier Pietquin, Hugo Larochelle, Aaron Courville

Our key contribution is the collection of a large-scale dataset consisting of 150K human-played games with a total of 800K visual question-answer pairs on 66K images.

Object Discovery

Hybrid Recommender System based on Autoencoders

4 code implementations 24 Jun 2016 Florian Strub, Romaric Gaudel, Jérémie Mary

A standard model for Recommender Systems is the Matrix Completion setting: given a partially known matrix of ratings given by users (rows) to items (columns), infer the unknown ratings.

Collaborative Filtering, Matrix Completion +1

Hybrid Collaborative Filtering with Autoencoders

1 code implementation 2 Mar 2016 Florian Strub, Jeremie Mary, Romaric Gaudel

Such algorithms look for latent variables in a large sparse matrix of ratings.

Collaborative Filtering
