no code implementations • 6 Feb 2024 • Tsunehiko Tanaka, Kenshi Abe, Kaito Ariu, Tetsuro Morimura, Edgar Simo-Serra
Traditional approaches in offline reinforcement learning aim to learn the optimal policy that maximizes the cumulative reward, also known as return.
1 code implementation • 29 Nov 2022 • Tsunehiko Tanaka, Daiki Kimura, Michiaki Tatsubori
We propose a novel agent, DiffG-RL, which constructs a Difference Graph that organizes the environment states and common sense by means of interactive objects with a dedicated graph encoder.
no code implementations • 19 Oct 2022 • Tsunehiko Tanaka, Daiki Kimura, Michiaki Tatsubori
Text-based games are usually imperfect-information games, and their interactions are only in the textual modality.