1 code implementation • 10 Dec 2023 • Takuya Hiraoka
With our modifications, the simplified REDQ achieves $\sim 8 \times$ better sample efficiency than the SoTA methods on four Fetch tasks from the Robotics benchmark.
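For context on the mechanism behind REDQ-style sample efficiency: the method trains with a high update-to-data ratio and controls overestimation bias by minimizing over a random subset of an ensemble of target Q-functions. Below is a minimal sketch of that target computation, assuming `q_targets` is a list of target Q-networks callable as `q(obs, act)`; the SAC entropy term is omitted for brevity, and all names are illustrative rather than the paper's code.

```python
import random
import torch

def redq_target(q_targets, next_obs, next_act, reward, done,
                gamma=0.99, subset_size=2):
    """REDQ-style TD target: minimize over a random subset of the
    target Q-function ensemble to control overestimation bias."""
    subset = random.sample(q_targets, subset_size)
    with torch.no_grad():
        qs = torch.stack([q(next_obs, next_act) for q in subset])
        min_q = qs.min(dim=0).values
        return reward + gamma * (1.0 - done) * min_q
```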
no code implementations • 21 May 2023 • Takahisa Imagawa, Takuya Hiraoka, Yoshimasa Tsuruoka
However, most existing methods learn only a finite number of discrete skills, which limits the variety of behaviors the learned skills can exhibit.
1 code implementation • 26 Jan 2023 • Takuya Hiraoka, Takashi Onishi, Yoshimasa Tsuruoka
In reinforcement learning (RL) with experience replay, experiences stored in a replay buffer influence the RL agent's performance.
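As a concrete point of reference, a minimal uniform replay buffer looks like the sketch below (names and structure are illustrative, not the paper's code); how experiences enter, are evicted from, and are sampled from such a buffer is precisely what shapes the agent's updates.

```python
import random
from collections import deque

class ReplayBuffer:
    """Minimal uniform-sampling replay buffer."""
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)  # oldest experiences are evicted

    def add(self, obs, action, reward, next_obs, done):
        self.buffer.append((obs, action, reward, next_obs, done))

    def sample(self, batch_size):
        # Uniform sampling: every stored experience influences
        # updates with equal probability.
        return random.sample(self.buffer, batch_size)
```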
2 code implementations • ICLR 2022 • Takuya Hiraoka, Takahisa Imagawa, Taisei Hashimoto, Takashi Onishi, Yoshimasa Tsuruoka
To make REDQ more computationally efficient, we propose DroQ, a variant of REDQ that uses a small ensemble of dropout Q-functions.
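A minimal PyTorch sketch of a dropout Q-function of the kind DroQ describes, with dropout and layer normalization inside the Q-network; the hidden sizes and dropout rate here are illustrative choices, not necessarily the paper's hyperparameters.

```python
import torch
import torch.nn as nn

class DropoutQFunction(nn.Module):
    """Q-network with dropout and layer normalization, DroQ-style.
    Hidden width and dropout rate are illustrative assumptions."""
    def __init__(self, obs_dim, act_dim, hidden=256, p_drop=0.01):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden),
            nn.Dropout(p_drop), nn.LayerNorm(hidden), nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.Dropout(p_drop), nn.LayerNorm(hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)
```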
no code implementations • ICML Workshop LifelongML 2020 • Takahisa Imagawa, Takuya Hiraoka, Yoshimasa Tsuruoka
It learns a belief model over the embedding space, together with a belief-conditional policy and Q-function.
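To make "belief-conditional" concrete: the belief over the task embedding is fed to the policy alongside the raw observation. A minimal sketch under that assumption (all names and sizes are illustrative):

```python
import torch
import torch.nn as nn

class BeliefConditionalPolicy(nn.Module):
    """Policy conditioned on a belief vector over the task
    embedding space in addition to the observation."""
    def __init__(self, obs_dim, belief_dim, act_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + belief_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),
        )

    def forward(self, obs, belief):
        # The belief summarizes the agent's uncertainty about the task.
        return self.net(torch.cat([obs, belief], dim=-1))
```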
no code implementations • 4 Jun 2020 • Takuya Hiraoka, Takahisa Imagawa, Voot Tangkaratt, Takayuki Osa, Takashi Onishi, Yoshimasa Tsuruoka
Model-based meta-reinforcement learning (RL) methods have recently been shown to be a promising approach to improving the sample efficiency of RL in multi-task settings.
no code implementations • IJCNLP 2019 • Kosuke Akimoto, Takuya Hiraoka, Kunihiko Sadamasa, Mathias Niepert
Most existing relation extraction approaches exclusively target binary relations, while n-ary relation extraction remains relatively unexplored.
no code implementations • 25 Jun 2019 • Takahisa Imagawa, Takuya Hiraoka, Yoshimasa Tsuruoka
Reinforcement learning, a machine learning framework for training an autonomous agent from reward signals, has shown outstanding results in various domains.
1 code implementation • NeurIPS 2019 • Takuya Hiraoka, Takahisa Imagawa, Tatsuya Mori, Takashi Onishi, Yoshimasa Tsuruoka
While several methods exist for learning options that are robust to uncertainty in model parameters, these methods consider only either the worst case or the average (ordinary) case when learning options.
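To make the contrast concrete, here is a sketch of the two objectives the abstract mentions, evaluated over returns sampled under different model parameters; this is a hypothetical setup for illustration, not the paper's own objective.

```python
import torch

def average_case_objective(returns):
    """Ordinary objective: expected return over sampled model parameters."""
    return returns.mean()

def worst_case_objective(returns):
    """Robust objective: return under the worst sampled model parameters."""
    return returns.min()

# One estimated return per sampled model parameter (illustrative values).
returns = torch.tensor([3.0, 5.5, 1.2, 4.8])
print(average_case_objective(returns), worst_case_objective(returns))
```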
no code implementations • 26 Nov 2018 • Hisao Katsumi, Takuya Hiraoka, Koichiro Yoshino, Kazeto Yamamoto, Shota Motoura, Kunihiko Sadamasa, Satoshi Nakamura
These systems are required to have sufficient supporting information to argue their claims rationally; however, in realistic situations they often lack such information.
no code implementations • 29 Sep 2018 • Takuya Hiraoka, Takashi Onishi, Takahisa Imagawa, Yoshimasa Tsuruoka
In this paper, we propose a framework that automatically refines symbol grounding functions and a high-level planner, reducing the human effort needed to design these modules.
no code implementations • 7 Sep 2018 • Seydou Ba, Takuya Hiraoka, Takashi Onishi, Toru Nakata, Yoshimasa Tsuruoka
The evaluation results show that, with variable simulation times, the proposed approach outperforms conventional MCTS on the evaluated continuous decision-space tasks and improves the performance of MCTS on most of the ALE tasks.
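For reference, the selection step of conventional MCTS scores children with a UCB-style rule; a minimal sketch follows, where the node structure and exploration constant are illustrative assumptions.

```python
import math

def ucb_score(parent_visits, child_visits, child_value, c=1.4):
    """UCB1 score used in the MCTS selection step."""
    if child_visits == 0:
        return float("inf")  # always try unvisited children first
    exploit = child_value / child_visits
    explore = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploit + explore

def select_child(node):
    # `node.children` maps actions to child nodes carrying
    # `visits` and `total_value` attributes (illustrative structure).
    return max(node.children.values(),
               key=lambda ch: ucb_score(node.visits, ch.visits, ch.total_value))
```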
no code implementations • 2 Aug 2017 • Takuya Hiraoka, Masaaki Tsuchida, Yotaro Watanabe
This paper presents the first attempt to learn the policy of an inquiry dialog system (IDS) using deep reinforcement learning (DRL).