| TASK | DATASET | MODEL | METRIC NAME | METRIC VALUE | GLOBAL RANK |
|---|---|---|---|---|---|
| Unsupervised Reinforcement Learning | URLB (pixels, 10^5 frames) | ICM | Walker (mean normalized return) | 28.58±11.32 | #2 |
| Unsupervised Reinforcement Learning | URLB (pixels, 10^5 frames) | ICM | Quadruped (mean normalized return) | 24.36±7.41 | #5 |
| Unsupervised Reinforcement Learning | URLB (pixels, 10^5 frames) | ICM | Jaco (mean normalized return) | 14.29±3.28 | #4 |
| Unsupervised Reinforcement Learning | URLB (pixels, 10^6 frames) | ICM | Walker (mean normalized return) | 35.18±20.26 | #2 |
| Unsupervised Reinforcement Learning | URLB (pixels, 10^6 frames) | ICM | Quadruped (mean normalized return) | 33.75±10.25 | #2 |
| Unsupervised Reinforcement Learning | URLB (pixels, 10^6 frames) | ICM | Jaco (mean normalized return) | 38.93±3.16 | #1 |
| Unsupervised Reinforcement Learning | URLB (pixels, 2*10^6 frames) | ICM | Walker (mean normalized return) | 29.56±14.76 | #5 |
| Unsupervised Reinforcement Learning | URLB (pixels, 2*10^6 frames) | ICM | Quadruped (mean normalized return) | 36.27±11.44 | #3 |
| Unsupervised Reinforcement Learning | URLB (pixels, 2*10^6 frames) | ICM | Jaco (mean normalized return) | 35.95±7.23 | #3 |
| Unsupervised Reinforcement Learning | URLB (pixels, 5*10^5 frames) | ICM | Walker (mean normalized return) | 34.65±18.78 | #2 |
| Unsupervised Reinforcement Learning | URLB (pixels, 5*10^5 frames) | ICM | Quadruped (mean normalized return) | 30.08±8.84 | #2 |
| Unsupervised Reinforcement Learning | URLB (pixels, 5*10^5 frames) | ICM | Jaco (mean normalized return) | 34.43±7.09 | #1 |
| Unsupervised Reinforcement Learning | URLB (states, 10^5 frames) | ICM | Walker (mean normalized return) | 78.32±32.41 | #4 |
| Unsupervised Reinforcement Learning | URLB (states, 10^5 frames) | ICM | Quadruped (mean normalized return) | 29.70±8.87 | #6 |
| Unsupervised Reinforcement Learning | URLB (states, 10^5 frames) | ICM | Jaco (mean normalized return) | 71.96±7.20 | #3 |
| Unsupervised Reinforcement Learning | URLB (states, 10^6 frames) | ICM | Walker (mean normalized return) | 77.57±34.01 | #3 |
| Unsupervised Reinforcement Learning | URLB (states, 10^6 frames) | ICM | Quadruped (mean normalized return) | 28.83±12.75 | #8 |
| Unsupervised Reinforcement Learning | URLB (states, 10^6 frames) | ICM | Jaco (mean normalized return) | 65.57±7.78 | #2 |
| Unsupervised Reinforcement Learning | URLB (states, 2*10^6 frames) | ICM | Walker (mean normalized return) | 74.03±29.67 | #4 |
| Unsupervised Reinforcement Learning | URLB (states, 2*10^6 frames) | ICM | Quadruped (mean normalized return) | 23.44±10.64 | #8 |
| Unsupervised Reinforcement Learning | URLB (states, 2*10^6 frames) | ICM | Jaco (mean normalized return) | 59.50±5.53 | #3 |
| Unsupervised Reinforcement Learning | URLB (states, 5*10^5 frames) | ICM | Walker (mean normalized return) | 80.46±31.50 | #4 |
| Unsupervised Reinforcement Learning | URLB (states, 5*10^5 frames) | ICM | Quadruped (mean normalized return) | 30.59±9.33 | #7 |
| Unsupervised Reinforcement Learning | URLB (states, 5*10^5 frames) | ICM | Jaco (mean normalized return) | 60.43±9.86 | #5 |