Search Results for author: Michael Noukhovitch

Found 10 papers, 6 papers with code

The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization

1 code implementation • 24 Mar 2024 • Shengyi Huang, Michael Noukhovitch, Arian Hosseini, Kashif Rasul, Weixun Wang, Lewis Tunstall

This work is the first to openly reproduce the Reinforcement Learning from Human Feedback (RLHF) scaling behaviors reported in OpenAI's seminal TL;DR summarization work.

reinforcement-learning


Language Model Alignment with Elastic Reset

1 code implementation • NeurIPS 2023 • Michael Noukhovitch, Samuel Lavoie, Florian Strub, Aaron Courville

We periodically reset the online model to an exponential moving average (EMA) of itself, then reset the EMA model to the initial model.
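The two-step reset described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: parameters are plain lists of floats, and the names `ema_update`, `elastic_reset`, `train`, `grad_step`, `decay`, and `reset_every` are hypothetical, as are the default values.

```python
def ema_update(ema, online, decay):
    """Move each EMA parameter a fraction (1 - decay) toward the online parameter."""
    return [decay * e + (1.0 - decay) * o for e, o in zip(ema, online)]


def elastic_reset(ema, init):
    """Return (new_online, new_ema): online takes the EMA weights, EMA takes the initial weights."""
    return list(ema), list(init)


def train(init, grad_step, num_steps, reset_every, decay=0.9):
    """Toy training loop; grad_step is an assumed callable mapping parameters to updated parameters."""
    online = list(init)
    ema = list(init)
    for step in range(1, num_steps + 1):
        online = grad_step(online)              # ordinary online update
        ema = ema_update(ema, online, decay)    # EMA tracks the online model
        if step % reset_every == 0:             # periodic reset (interval is an assumption)
            online, ema = elastic_reset(ema, init)
    return online, ema
```

In this sketch the EMA is updated after every online step and the reset fires on a fixed step interval; the excerpt above does not specify either choice.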

Chatbot, Language Modelling, +1


Learning Multi-Agent Communication with Contrastive Learning

no code implementations • 3 Jul 2023 • Yat Long Lo, Biswa Sengupta, Jakob Foerster, Michael Noukhovitch

By examining the relationship between messages sent and received, we propose to learn to communicate using contrastive learning to maximize the mutual information between messages of a given trajectory.

Contrastive Learning


Simplicial Embeddings in Self-Supervised Learning and Downstream Classification

1 code implementation • 1 Apr 2022 • Samuel Lavoie, Christos Tsirigotis, Max Schwarzer, Ankit Vani, Michael Noukhovitch, Kenji Kawaguchi, Aaron Courville

Simplicial Embeddings (SEM) are representations learned through self-supervised learning (SSL), wherein a representation is projected into $L$ simplices of $V$ dimensions each using a softmax operation.
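As a rough illustration of the projection described above, the sketch below splits a vector of length $L \cdot V$ into $L$ groups of $V$ entries and applies a softmax to each group, so every group lies on a simplex. It assumes the input already has length $L \cdot V$ (any preceding learned projection is left out), and the function names are hypothetical rather than taken from the paper's code.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def simplicial_embedding(z, L, V):
    """Apply a softmax independently to each of the L consecutive groups of V entries in z."""
    if len(z) != L * V:
        raise ValueError("expected a vector of length L * V")
    out = []
    for i in range(L):
        group = z[i * V:(i + 1) * V]
        out.extend(softmax(group))  # each group is non-negative and sums to 1
    return out
```

For example, `simplicial_embedding([0.0, 0.0, 0.0, 1.0, 2.0, 3.0], L=2, V=3)` returns six values in which each consecutive triple sums to one.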

Classification, Inductive Bias, +1


Pretraining Representations for Data-Efficient Reinforcement Learning

1 code implementation • NeurIPS 2021 • Max Schwarzer, Nitarshan Rajkumar, Michael Noukhovitch, Ankesh Anand, Laurent Charlin, Devon Hjelm, Philip Bachman, Aaron Courville

Data efficiency is a key challenge for deep reinforcement learning.

Ranked #3 on Atari Games 100k on Atari 100k (using extra training data)

Atari Games, Atari Games 100k, +2


Pretraining Reward-Free Representations for Data-Efficient Reinforcement Learning

no code implementations • ICLR Workshop SSL-RL 2021 • Max Schwarzer, Nitarshan Rajkumar, Michael Noukhovitch, Ankesh Anand, Laurent Charlin, R Devon Hjelm, Philip Bachman, Aaron Courville

Data efficiency poses a major challenge for deep reinforcement learning.

reinforcement-learning, Reinforcement Learning (RL), +1


Emergent Communication under Competition

1 code implementation • 25 Jan 2021 • Michael Noukhovitch, Travis LaCroix, Angeliki Lazaridou, Aaron Courville

First, we show that communication is proportional to cooperation, and it can occur for partially competitive scenarios using standard learning algorithms.

Misconceptions


Selfish Emergent Communication

no code implementations • 25 Sep 2019 • Michael Noukhovitch, Travis LaCroix, Aaron Courville

Current literature in machine learning holds that unaligned, self-interested agents do not learn to use an emergent communication channel.


Systematic Generalization: What Is Required and Can It Be Learned?

2 code implementations • ICLR 2019 • Dzmitry Bahdanau, Shikhar Murty, Michael Noukhovitch, Thien Huu Nguyen, Harm de Vries, Aaron Courville

Numerous models for grounded language understanding have been recently proposed, including (i) generic models that can be easily adapted to any given task and (ii) intuitively appealing modular models that require background knowledge to be instantiated.

Systematic Generalization, Visual Question Answering (VQA)


Commonsense mining as knowledge base completion? A study on the impact of novelty

no code implementations • WS 2018 • Stanisław Jastrzębski, Dzmitry Bahdanau, Seyedarian Hosseini, Michael Noukhovitch, Yoshua Bengio, Jackie Chi Kit Cheung

Commonsense knowledge bases such as ConceptNet represent knowledge in the form of relational triples.

Knowledge Base Completion

