no code implementations • 11 Apr 2024 • Soichiro Nishimori, Xin-Qiang Cai, Johannes Ackermann, Masashi Sugiyama
In this paper, we investigate an offline reinforcement learning (RL) problem where datasets are collected from two domains.
1 code implementation • 31 Jan 2024 • Toshinori Kitamura, Tadashi Kozuno, Masahiro Kato, Yuki Ichihara, Soichiro Nishimori, Akiyoshi Sannai, Sho Sonoda, Wataru Kumagai, Yutaka Matsuo
We study a primal-dual reinforcement learning (RL) algorithm for the online constrained Markov decision process (CMDP) problem, wherein the agent seeks an optimal policy that maximizes return while satisfying constraints.
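The generic primal-dual idea behind such methods can be sketched on a toy problem (this is not the paper's algorithm): alternate a primal ascent step on the policy with a dual ascent step on a Lagrange multiplier that prices constraint violation. The two-action "bandit CMDP", its numbers, and the step sizes below are all illustrative assumptions.

```python
import numpy as np

# Toy constrained problem: maximize expected reward subject to
# expected cost <= budget, via Lagrangian primal-dual updates.
rewards = np.array([1.0, 0.6])   # action 0 pays more reward...
costs   = np.array([0.9, 0.1])   # ...but also incurs more cost
budget  = 0.3

theta = np.zeros(2)              # softmax policy logits (primal variable)
lam = 0.0                        # Lagrange multiplier (dual variable)
eta = 0.2                        # step size for both updates
avg_pi = np.zeros(2)
T = 10000

for _ in range(T):
    pi = np.exp(theta) / np.exp(theta).sum()
    adv = rewards - lam * costs              # per-action Lagrangian payoff
    # Primal ascent on E_pi[reward - lam * cost] (softmax policy gradient).
    theta += eta * pi * (adv - pi @ adv)
    # Dual ascent on the constraint violation, projected onto lam >= 0.
    lam = max(0.0, lam + eta * (pi @ costs - budget))
    avg_pi += pi

avg_pi /= T
print(avg_pi, avg_pi @ costs)  # averaged iterates: expected cost near the budget
```

Constant-step primal-dual iterates oscillate around the saddle point, which is why the evaluation uses the time-averaged policy rather than the last iterate.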
no code implementations • 19 Apr 2023 • Soichiro Nishimori, Sotetsu Koyamada, Shin Ishii
We propose an RL algorithm that estimates the hidden states via end-to-end training and visualizes the estimates as a state-transition graph.
1 code implementation • NeurIPS 2023 • Sotetsu Koyamada, Shinri Okano, Soichiro Nishimori, Yu Murata, Keigo Habara, Haruka Kita, Shin Ishii
We propose Pgx, a suite of board game reinforcement learning (RL) environments written in JAX and optimized for GPU/TPU accelerators.
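The batched-environment design that makes Pgx fast can be sketched framework-agnostically: every game in a batch occupies one array slot, and all games advance with a single vectorized operation. Pgx implements this with JAX (`jit`/`vmap`) on GPU/TPU for real board games; the toy "race to 10" rules and function names below are illustrative assumptions, not Pgx's API.

```python
import numpy as np

B = 1024                          # number of games stepped in parallel
rng = np.random.default_rng(0)

def init(batch_size):
    """All games start with a counter at 0; reaching 10 ends a game."""
    return np.zeros(batch_size, dtype=np.int32)

def step(state, action):
    """Batched step: action in {0, 1} advances each game's counter."""
    new = np.minimum(state + action, 10)
    reward = ((new == 10) & (state < 10)).astype(np.float32)  # paid once, on finishing
    return new, reward, new == 10

state = init(B)
total = np.zeros(B, dtype=np.float32)
for _ in range(30):                           # 30 steps of a random policy
    action = rng.integers(0, 2, size=B)       # one action per game, batched
    state, reward, done = step(state, action)
    total += reward
```

Because state, action, and reward are plain arrays and `step` contains no per-game Python loop, the same pattern compiles to one fused kernel per step under JAX on an accelerator.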