no code implementations • 30 Jan 2024 • Ryoma Furuyama, Daiki Kuyoshi, Satoshi Yamane
To make this algorithm more robust to distribution shift, we propose a more efficient and robust algorithm that adds to this method a reward function based on adversarial inverse reinforcement learning, which rewards the agent for performing actions in states similar to those in the demonstration.
1 code implementation • 19 Jan 2020 • Daichi Nishio, Daiki Kuyoshi, Toi Tsuneda, Satoshi Yamane
Methods based on reinforcement learning, such as inverse reinforcement learning and generative adversarial imitation learning (GAIL), can learn from only a small amount of expert data.