URLB (Unsupervised Reinforcement Learning Benchmark)

Introduced by Laskin et al. in URLB: Unsupervised Reinforcement Learning Benchmark

URLB consists of two phases: reward-free pre-training and downstream task adaptation with extrinsic rewards. Building on the DeepMind Control Suite, it provides twelve continuous control tasks from three domains for evaluation.
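The two-phase protocol described above can be sketched roughly as follows. This is a toy illustration only, not the URLB code: `ChainEnv`, `TabularAgent`, `run_phase`, and the frame budgets are hypothetical names and values chosen for this example, a small discrete chain stands in for a continuous control task, and the count-based novelty bonus is an assumed stand-in for whatever intrinsic objective a given method uses.

```python
import random
from collections import defaultdict


class ChainEnv:
    """Hypothetical toy task: walk along a line of n cells, reward at the far end."""

    def __init__(self, n=10):
        self.n = n
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # Action 1 moves right, any other action moves left; walls clamp the position.
        delta = 1 if action == 1 else -1
        self.pos = max(0, min(self.n - 1, self.pos + delta))
        extrinsic = 1.0 if self.pos == self.n - 1 else 0.0
        return self.pos, extrinsic


class TabularAgent:
    """Hypothetical epsilon-greedy Q-learner with a count-based novelty bonus."""

    def __init__(self, seed=0, lr=0.5, gamma=0.9, eps=0.2):
        self.q = defaultdict(lambda: [0.0, 0.0])
        self.visits = defaultdict(int)
        self.rng = random.Random(seed)
        self.lr, self.gamma, self.eps = lr, gamma, eps

    def act(self, state):
        if self.rng.random() < self.eps:
            return self.rng.randrange(2)
        values = self.q[state]
        return 1 if values[1] > values[0] else 0

    def intrinsic_reward(self, state):
        # Assumed stand-in objective: rarely visited states earn a larger bonus.
        self.visits[state] += 1
        return self.visits[state] ** -0.5

    def update(self, state, action, reward, next_state):
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.lr * (target - self.q[state][action])


def run_phase(env, agent, frames, use_extrinsic, episode_len=20):
    """Run one phase for a fixed frame budget; return the extrinsic reward collected."""
    extrinsic_total = 0.0
    state = env.reset()
    for t in range(frames):
        action = agent.act(state)
        next_state, r_ext = env.step(action)
        # Pre-training never shows the task reward to the agent.
        reward = r_ext if use_extrinsic else agent.intrinsic_reward(next_state)
        agent.update(state, action, reward, next_state)
        extrinsic_total += r_ext
        state = next_state
        if (t + 1) % episode_len == 0:
            state = env.reset()
    return extrinsic_total


if __name__ == "__main__":
    agent = TabularAgent(seed=0)
    # Phase 1: reward-free pre-training driven only by the intrinsic bonus.
    run_phase(ChainEnv(), agent, frames=2000, use_extrinsic=False)
    # Phase 2: downstream adaptation of the same agent with extrinsic reward.
    score = run_phase(ChainEnv(), agent, frames=500, use_extrinsic=True)
    print("extrinsic reward collected during adaptation:", score)
```

The same agent object is carried from the first phase into the second, so whatever it learned without rewards is the starting point for adaptation. The frame counts passed to `run_phase` play the role of the budgets that distinguish the dataset variants in the leaderboard table.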

Benchmarks

Task	Dataset Variant	Best Model
Unsupervised Reinforcement Learning	URLB (states, 10^5 frames)	Disagreement
Unsupervised Reinforcement Learning	URLB (states, 5*10^5 frames)	RND
Unsupervised Reinforcement Learning	URLB (states, 10^6 frames)	RND
Unsupervised Reinforcement Learning	URLB (states, 2*10^6 frames)	RND
Unsupervised Reinforcement Learning	URLB (pixels, 10^5 frames)	ProtoRL
Unsupervised Reinforcement Learning	URLB (pixels, 5*10^5 frames)	Disagreement
Unsupervised Reinforcement Learning	URLB (pixels, 10^6 frames)	Disagreement
Unsupervised Reinforcement Learning	URLB (pixels, 2*10^6 frames)	Mastering URLB