Unsupervised Reinforcement Learning

22 papers with code • 8 benchmarks • 2 datasets

Most implemented papers

Exploration by Random Network Distillation

openai/random-network-distillation ICLR 2019

In particular, we establish state-of-the-art performance on Montezuma's Revenge, a game famously difficult for deep reinforcement learning methods.

Curiosity-driven Exploration by Self-supervised Prediction

pathak22/noreward-rl ICML 2017

In many real-world scenarios, rewards extrinsic to the agent are extremely sparse, or absent altogether.

Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning

facebookresearch/drqv2 ICLR 2022

We present DrQ-v2, a model-free reinforcement learning (RL) algorithm for visual continuous control.

Diversity is All You Need: Learning Skills without a Reward Function

navneet-nmk/Hierarchical-Meta-Reinforcement-Learning ICLR 2019

On a variety of simulated robotic tasks, we show that this simple objective results in the unsupervised emergence of diverse skills, such as walking and jumping.

Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning

google-research/dads 27 Apr 2020

Can we instead develop efficient reinforcement learning methods that acquire diverse skills without any reward function, and then repurpose these skills for downstream tasks?

Unsupervised Reinforcement Learning in Multiple Environments

muttimirco/alphamepol 16 Dec 2021

Along this line, we address the problem of unsupervised reinforcement learning in a class of multiple environments, in which the policy is pre-trained with interactions from the whole class, and then fine-tuned for several tasks in any environment of the class.

Variational Intrinsic Control

jbinas/gym-mnist 22 Nov 2016

In this paper we introduce a new unsupervised reinforcement learning method for discovering the set of intrinsic options available to an agent.

Self-Supervised Exploration via Disagreement

pathak22/exploration-by-disagreement 10 Jun 2019

In this paper, we propose a formulation for exploration inspired by the work in active learning literature.

Efficient Exploration via State Marginal Matching

RLAgent/state-marginal-matching 12 Jun 2019

The SMM objective can be viewed as a two-player, zero-sum game between a state density model and a parametric policy, an idea that we use to build an algorithm for optimizing the SMM objective.
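As a rough illustration of the game described in this excerpt, the sketch below computes a pseudo-reward of the form log p*(s) - log q(s) on a small discrete state space: the density model q tries to fit the states the policy visits, and the policy is then rewarded for visiting states that the target p* favors but q considers rare. This is not the authors' implementation. The function name `smm_pseudo_rewards`, the count-based density model, and the `smoothing` parameter are assumptions made for illustration; a learned density model would take the place of the counts in practice.

```python
import math
from collections import Counter


def smm_pseudo_rewards(visited_states, target_probs, smoothing=1.0):
    """Return r(s) = log p*(s) - log q(s) for every state in target_probs.

    Hypothetical helper: q is a smoothed count-based estimate of the
    policy's state marginal, standing in for a learned density model.
    States in visited_states that are absent from target_probs are ignored.
    """
    counts = Counter(visited_states)
    # Normalizer over the states of the target distribution, with additive
    # smoothing so that unvisited states keep a nonzero probability under q.
    total = sum(counts[s] for s in target_probs) + smoothing * len(target_probs)
    rewards = {}
    for state, p_star in target_probs.items():
        q = (counts[state] + smoothing) / total
        rewards[state] = math.log(p_star) - math.log(q)
    return rewards


# Usage: a uniform target over three states, with visits concentrated on state 0.
uniform_target = {0: 1 / 3, 1: 1 / 3, 2: 1 / 3}
rewards = smm_pseudo_rewards([0, 0, 0, 1], uniform_target)
for state, reward in sorted(rewards.items()):
    print(state, round(reward, 3))
```

With a uniform target, the over-visited state receives a negative pseudo-reward and the unvisited state the largest one, which pushes the policy's state marginal toward the target.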

SMiRL: Surprise Minimizing Reinforcement Learning in Unstable Environments

chenziku/train-procgen ICLR 2021

Every living organism struggles against disruptive environmental forces to carve out and maintain an orderly niche.