Generalized Data Distribution Iteration

7 Jun 2022 · Jiajun Fan, Changnan Xiao

Achieving higher sample efficiency and superior final performance simultaneously has been one of the major challenges for deep reinforcement learning (DRL). Previous work could handle one of these challenges but typically failed to address both at once. In this paper, we try to tackle the two challenges together. To this end, we first decouple them into two classic RL problems: data richness and the exploration-exploitation trade-off. We then cast these two problems as a training data distribution optimization problem, namely obtaining the desired training data within limited interactions, and address them concurrently via i) explicit modeling and control of the capacity and diversity of the behavior policy and ii) more fine-grained and adaptive control of the selective/sampling distribution of the behavior policy using a monotonic data distribution optimization. Finally, we integrate this process into Generalized Policy Iteration (GPI) and obtain a more general framework called Generalized Data Distribution Iteration (GDI). We use the GDI framework to introduce operator-based versions of well-known RL methods from DQN to Agent57. We also provide a theoretical guarantee of the superiority of GDI over GPI. We demonstrate state-of-the-art (SOTA) performance on the Arcade Learning Environment (ALE), where our algorithm achieves a 9620.33% mean human normalized score (HNS) and a 1146.39% median HNS and surpasses 22 human world records using only 200M training frames. Our performance is comparable to Agent57's while consuming 500 times less data. We argue that there is still a long way to go before obtaining real superhuman agents in ALE.
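
The idea of adaptively controlling the sampling distribution over behavior policies can be illustrated with a toy loop. The sketch below is not the paper's operator or algorithm: it assumes a small finite set of behavior-policy settings, a softmax over running mean returns as the selection distribution, and a hypothetical `episode_return` callable standing in for actual environment interaction. All names here (`toy_selection_loop`, `fake_return`, the candidate values) are illustrative assumptions.

```python
import math
import random


def softmax(values, temperature):
    """Turn a list of scores into a probability distribution."""
    peak = max(values)  # subtract the max for numerical stability
    weights = [math.exp((v - peak) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def toy_selection_loop(candidates, episode_return, rounds=100,
                       temperature=0.1, seed=0):
    """Adapt a sampling distribution over behavior-policy settings.

    candidates: hypothetical behavior-policy settings (e.g. exploration rates).
    episode_return: hypothetical callable(setting, rng) -> float that stands in
        for "run this behavior policy and measure its return".
    """
    rng = random.Random(seed)
    # Try every setting once so each running mean starts from real data.
    counts = [1] * len(candidates)
    means = [episode_return(c, rng) for c in candidates]
    probs = softmax(means, temperature)
    for _ in range(rounds):
        # Data collection: choose which setting generates the next data.
        i = rng.choices(range(len(candidates)), weights=probs)[0]
        g = episode_return(candidates[i], rng)
        # Update the running mean return of the chosen setting.
        counts[i] += 1
        means[i] += (g - means[i]) / counts[i]
        # Distribution update: move sampling mass toward higher returns.
        probs = softmax(means, temperature)
    return probs


def fake_return(eps, rng):
    # Made-up return function that peaks at an exploration rate of 0.1.
    return 1.0 - abs(eps - 0.1)


probs = toy_selection_loop([0.01, 0.1, 0.5], fake_return)
```

In this toy setup the selection distribution concentrates on the setting whose data yields the highest return, which is the kind of feedback loop between data collection and distribution update that the abstract describes at a high level.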


Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
| --- | --- | --- | --- | --- | --- |
| Atari Games | Atari 2600 Alien | GDI-H3 | Score | 48735 | # 6 |
| Atari Games | Atari 2600 Alien | GDI-I3 | Score | 43384 | # 7 |
| Atari Games | Atari 2600 Amidar | GDI-I3 | Score | 1442 | # 19 |
| Atari Games | Atari 2600 Amidar | GDI-H3 | Score | 1065 | # 23 |
| Atari Games | Atari 2600 Assault | GDI-H3 | Score | 97155 | # 3 |
| Atari Games | Atari 2600 Assault | GDI-I3 | Score | 63876 | # 5 |
| Atari Games | Atari 2600 Asterix | GDI-I3 | Score | 759910 | # 6 |
| Atari Games | Atari 2600 Asterix | GDI-H3 | Score | 999999 | # 1 |
| Atari Games | Atari 2600 Asteroids | GDI-I3 | Score | 751970 | # 2 |
| Atari Games | Atari 2600 Asteroids | GDI-H3 | Score | 760005 | # 1 |
| Atari Games | Atari 2600 Atlantis | GDI-H3 | Score | 3837300 | # 1 |
| Atari Games | Atari 2600 Atlantis | GDI-I3 | Score | 3803000 | # 2 |
| Atari Games | Atari 2600 Bank Heist | GDI-I3 | Score | 1401 | # 8 |
| Atari Games | Atari 2600 Bank Heist | GDI-H3 | Score | 1380 | # 9 |
| Atari Games | Atari 2600 Battle Zone | GDI-H3 | Score | 824360 | # 3 |
| Atari Games | Atari 2600 Battle Zone | GDI-I3 | Score | 478830 | # 5 |
| Atari Games | Atari 2600 Beam Rider | GDI-H3 | Score | 422890 | # 2 |
| Atari Games | Atari 2600 Beam Rider | GDI-I3 | Score | 162100 | # 6 |
| Atari Games | Atari 2600 Berzerk | GDI-H3 | Score | 14649 | # 7 |
| Atari Games | Atari 2600 Berzerk | GDI-I3 | Score | 7607 | # 9 |
| Atari Games | Atari 2600 Bowling | GDI-H3 | Score | 205.2 | # 5 |
| Atari Games | Atari 2600 Bowling | GDI-I3 | Score | 201.9 | # 6 |
| Atari Games | Atari 2600 Boxing | GDI-I3 | Score | 100 | # 1 |
| Atari Games | Atari 2600 Boxing | GDI-H3 | Score | 100 | # 1 |
| Atari Games | Atari 2600 Breakout | GDI-I3 | Score | 864.00 | # 1 |
| Atari Games | Atari 2600 Breakout | GDI-H3 | Score | 864 | # 1 |
| Atari Games | Atari 2600 Breakout | GDI-I3(200M frames) | Score | 864.00 | # 1 |
| Atari Games | Atari 2600 Breakout | GDI-H3(200M frames) | Score | 864.00 | # 1 |
| Atari Games | Atari 2600 Centipede | GDI-H3 | Score | 195630 | # 7 |
| Atari Games | Atari 2600 Centipede | GDI-I3 | Score | 155830 | # 8 |
| Atari Games | Atari 2600 Chopper Command | GDI-I3 | Score | 999999 | # 1 |
| Atari Games | Atari 2600 Chopper Command | GDI-H3 | Score | 999999 | # 1 |
| Atari Games | Atari 2600 Crazy Climber | GDI-I3 | Score | 201000 | # 8 |
| Atari Games | Atari 2600 Crazy Climber | GDI-H3 | Score | 241170 | # 5 |
| Atari Games | Atari 2600 Defender | GDI-H3 | Score | 970540 | # 2 |
| Atari Games | Atari 2600 Defender | GDI-I3 | Score | 893110 | # 3 |
| Atari Games | Atari 2600 Demon Attack | GDI-I3 | Score | 675530 | # 2 |
| Atari Games | Atari 2600 Demon Attack | GDI-H3 | Score | 787985 | # 1 |
| Atari Games | Atari 2600 Double Dunk | GDI-H3 | Score | 24 | # 1 |
| Atari Games | Atari 2600 Double Dunk | GDI-I3 | Score | 24 | # 1 |
| Atari Games | Atari 2600 Enduro | GDI-H3 | Score | 14300 | # 3 |
| Atari Games | Atari 2600 Enduro | GDI-I3 | Score | 14330 | # 1 |
| Atari Games | Atari 2600 Fishing Derby | GDI-I3 | Score | 59 | # 7 |
| Atari Games | Atari 2600 Fishing Derby | GDI-H3 | Score | 65 | # 5 |
| Atari Games | Atari 2600 Freeway | GDI-H3(200M frames) | Score | 34 | # 1 |
| Atari Games | Atari 2600 Freeway | GDI-H3 | Score | 34 | # 1 |
| Atari Games | Atari 2600 Freeway | GDI-I3 | Score | 34 | # 1 |
| Atari Games | Atari 2600 Frostbite | GDI-H3(200M frames) | Score | 11330 | # 7 |
| Atari Games | Atari 2600 Frostbite | GDI-I3 | Score | 10485 | # 9 |
| Atari Games | Atari 2600 Frostbite | GDI-H3 | Score | 11330 | # 7 |
| Atari Games | Atari 2600 Gopher | GDI-H3 | Score | 473560 | # 2 |
| Atari Games | Atari 2600 Gopher | GDI-I3 | Score | 488830 | # 1 |
| Atari Games | Atari 2600 Gravitar | GDI-I3 | Score | 5905 | # 8 |
| Atari Games | Atari 2600 Gravitar | GDI-H3 | Score | 5915 | # 7 |
| Atari Games | Atari 2600 HERO | GDI-H3 | Score | 38225 | # 7 |
| Atari Games | Atari 2600 HERO | GDI-I3 | Score | 38330 | # 5 |
| Atari Games | Atari 2600 Ice Hockey | GDI-I3 | Score | 44.94 | # 5 |
| Atari Games | Atari 2600 Ice Hockey | GDI-H3 | Score | 481.9 | # 1 |
| Atari Games | Atari 2600 James Bond | GDI-I3 | Score | 594500 | # 2 |
| Atari Games | Atari 2600 James Bond | GDI-H3 | Score | 620780 | # 1 |
| Atari Games | Atari 2600 Kangaroo | GDI-H3 | Score | 14636 | # 9 |
| Atari Games | Atari 2600 Kangaroo | GDI-I3 | Score | 14500 | # 10 |
| Atari Games | Atari 2600 Krull | GDI-H3 | Score | 594540 | # 1 |
| Atari Games | Atari 2600 Krull | GDI-I3 | Score | 97575 | # 5 |
| Atari Games | Atari 2600 Kung-Fu Master | GDI-H3 | Score | 1666665 | # 1 |
| Atari Games | Atari 2600 Kung-Fu Master | GDI-I3 | Score | 140440 | # 6 |
| Atari Games | Atari 2600 Montezuma's Revenge | GDI-I3 | Score | 3000 | # 11 |
| Atari Games | Atari 2600 Montezuma's Revenge | GDI-H3 | Score | 2500 | # 15 |
| Atari Games | Atari 2600 Ms. Pacman | GDI-I3 | Score | 11536 | # 7 |
| Atari Games | Atari 2600 Ms. Pacman | GDI-H3 | Score | 11573 | # 6 |
| Atari Games | Atari 2600 Name This Game | GDI-I3 | Score | 34434 | # 6 |
| Atari Games | Atari 2600 Name This Game | GDI-H3 | Score | 36296 | # 5 |
| Atari Games | Atari 2600 Phoenix | GDI-I3 | Score | 894460 | # 4 |
| Atari Games | Atari 2600 Phoenix | GDI-H3 | Score | 959580 | # 1 |
| Atari Games | Atari 2600 Pitfall! | GDI-H3 | Score | -4.345 | # 20 |
| Atari Games | Atari 2600 Pitfall! | GDI-I3 | Score | 0 | # 4 |
| Atari Games | Atari 2600 Pong | GDI-H3(200M frames) | Score | 21.0 | # 1 |
| Atari Games | Atari 2600 Pong | GDI-H3 | Score | 21 | # 1 |
| Atari Games | Atari 2600 Pong | GDI-I3 | Score | 21 | # 1 |
| Atari Games | Atari 2600 Pong | GDI-I3(200M frames) | Score | 21.0 | # 1 |
| Atari Games | Atari 2600 Private Eye | GDI-H3 | Score | 15100 | # 5 |
| Atari Games | Atari 2600 Private Eye | GDI-I3 | Score | 15100 | # 5 |
| Atari Games | Atari 2600 Q*Bert | GDI-H3 | Score | 28657 | # 11 |
| Atari Games | Atari 2600 Q*Bert | GDI-I3 | Score | 27800 | # 13 |
| Atari Games | Atari 2600 Q*Bert | GDI-H3(200M frames) | Score | 28657 | # 11 |
| Atari Games | Atari 2600 River Raid | GDI-H3 | Score | 28349 | # 7 |
| Atari Games | Atari 2600 River Raid | GDI-I3 | Score | 28075 | # 8 |
| Atari Games | Atari 2600 Road Runner | GDI-I3 | Score | 878600 | # 2 |
| Atari Games | Atari 2600 Road Runner | GDI-H3 | Score | 999999 | # 1 |
| Atari Games | Atari 2600 Robotank | GDI-I3 | Score | 108.2 | # 4 |
| Atari Games | Atari 2600 Robotank | GDI-H3 | Score | 113.4 | # 3 |
| Atari Games | Atari 2600 Seaquest | GDI-H3(200M frames) | Score | 1000000 | # 1 |
| Atari Games | Atari 2600 Seaquest | GDI-H3 | Score | 1000000 | # 1 |
| Atari Games | Atari 2600 Seaquest | GDI-I3 | Score | 943910 | # 7 |
| Atari Games | Atari 2600 Skiing | GDI-H3 | Score | -6025 | # 3 |
| Atari Games | Atari 2600 Skiing | GDI-I3 | Score | -6774 | # 3 |
| Atari Games | Atari 2600 Solaris | GDI-I3 | Score | 11074 | # 6 |
| Atari Games | Atari 2600 Solaris | GDI-H3 | Score | 9105 | # 8 |
| Atari Games | Atari 2600 Space Invaders | GDI-I3 | Score | 140460 | # 3 |
| Atari Games | Atari 2600 Space Invaders | GDI-H3(200M frames) | Score | 154380 | # 1 |
| Atari Games | Atari 2600 Space Invaders | GDI-H3 | Score | 154380 | # 1 |
| Atari Games | Atari 2600 Star Gunner | GDI-H3 | Score | 677590 | # 3 |
| Atari Games | Atari 2600 Star Gunner | GDI-I3 | Score | 465750 | # 5 |
| Atari Games | Atari 2600 Surround | GDI-H3 | Score | 2.606 | # 11 |
| Atari Games | Atari 2600 Surround | GDI-I3 | Score | -7.8 | # 14 |
| Atari Games | Atari 2600 Tennis | GDI-I3 | Score | 24 | # 1 |
| Atari Games | Atari 2600 Tennis | GDI-H3 | Score | 24 | # 1 |
| Atari Games | Atari 2600 Time Pilot | GDI-I3 | Score | 216770 | # 6 |
| Atari Games | Atari 2600 Time Pilot | GDI-H3 | Score | 450810 | # 2 |
| Atari Games | Atari 2600 Tutankham | GDI-H3 | Score | 418.2 | # 5 |
| Atari Games | Atari 2600 Tutankham | GDI-I3 | Score | 423.9 | # 3 |
| Atari Games | Atari 2600 Up and Down | GDI-I3 | Score | 986440 | # 1 |
| Atari Games | Atari 2600 Up and Down | GDI-H3 | Score | 966590 | # 3 |
| Atari Games | Atari 2600 Venture | GDI-H3(200M frames) | Score | 2000 | # 6 |
| Atari Games | Atari 2600 Venture | GDI-H3 | Score | 2000 | # 6 |
| Atari Games | Atari 2600 Venture | GDI-I3 | Score | 2035 | # 5 |
| Atari Games | Atari 2600 Video Pinball | GDI-I3 | Score | 925830 | # 6 |
| Atari Games | Atari 2600 Video Pinball | GDI-H3 | Score | 978190 | # 4 |
| Atari Games | Atari 2600 Wizard of Wor | GDI-I3 | Score | 64239 | # 6 |
| Atari Games | Atari 2600 Wizard of Wor | GDI-H3 | Score | 63735 | # 7 |
| Atari Games | Atari 2600 Yars Revenge | GDI-I3 | Score | 972000 | # 3 |
| Atari Games | Atari 2600 Yars Revenge | GDI-H3 | Score | 968090 | # 4 |
| Atari Games | Atari 2600 Zaxxon | GDI-H3 | Score | 216020 | # 4 |
| Atari Games | Atari 2600 Zaxxon | GDI-I3 | Score | 109140 | # 6 |
| Atari Games | Atari-57 | GDI-H3 | Human World Record Breakthrough | 22 | # 2 |
| Atari Games | Atari-57 | GDI-H3 | Mean Human Normalized Score | 9620.33% | # 3 |
| Atari Games | Atari-57 | GDI-I3 | Human World Record Breakthrough | 17 | # 5 |
| Atari Games | Atari-57 | GDI-I3 | Mean Human Normalized Score | 7810.1% | # 4 |
| Atari Games | atari game | GDI-H3 | Human World Record Breakthrough | 22 | # 1 |
| Atari Games | atari game | GDI-I3 | Human World Record Breakthrough | 17 | # 4 |
| Atari Games | Atari games | GDI-I3 | Mean Human Normalized Score | 7810.1% | # 2 |
| Atari Games | Atari games | GDI-I3 | Median Human-Normalized Score | 832.5% | # 2 |
| Atari Games | Atari games | GDI-H3 | Mean Human Normalized Score | 9620.33% | # 1 |
| Atari Games | Atari games | GDI-H3 | Median Human-Normalized Score | 1146.39% | # 1 |
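
The mean and median human normalized scores reported above aggregate per-game results. As a minimal sketch, assuming the commonly used definition HNS = 100 × (agent − random) / (human − random), the snippet below shows how the two aggregates are computed and why the mean can sit far above the median when a few games have very large normalized scores. The game names and all numbers are made-up placeholders, not values from the paper or from the Atari baselines.

```python
from statistics import mean, median


def human_normalized_score(agent, random_score, human):
    """HNS in percent: 0% matches random play, 100% matches the human baseline."""
    return 100.0 * (agent - random_score) / (human - random_score)


# Placeholder data: game -> (agent score, random score, human score).
# These values are invented purely for illustration.
placeholder_results = {
    "game_a": (150.0, 0.0, 100.0),
    "game_b": (20.0, 10.0, 30.0),
    "game_c": (5000.0, 0.0, 100.0),
}

hns_per_game = {
    game: human_normalized_score(agent, rand, human)
    for game, (agent, rand, human) in placeholder_results.items()
}

mean_hns = mean(hns_per_game.values())
median_hns = median(hns_per_game.values())

print(f"mean HNS:   {mean_hns:.2f}%")
print(f"median HNS: {median_hns:.2f}%")
```

A single outlier game dominates the mean while leaving the median untouched, which is one reason both aggregates are usually reported side by side.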

Methods