Search Results for author: Takamitsu Matsubara

Found 14 papers, 2 papers with code

Ensuring Monotonic Policy Improvement in Entropy-regularized Value-based Reinforcement Learning

no code implementations · 25 Aug 2020 · Lingwei Zhu, Takamitsu Matsubara

We propose a novel reinforcement learning algorithm that exploits this lower bound as a criterion for adjusting the degree of each policy update, thereby alleviating policy oscillation.

reinforcement-learning · Reinforcement Learning (RL)
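The abstract names a concrete mechanism: a lower bound on policy improvement used to decide how far each update may move. Below is a minimal single-state sketch of that idea in an entropy-regularized tabular setting; the surrogate bound (expected advantage minus a KL penalty) and all names are illustrative stand-ins for the paper's actual bound.

```python
import numpy as np

def softmax_policy(q, tau):
    z = q / tau
    z = z - z.max()
    p = np.exp(z)
    return p / p.sum()

def cautious_update(pi_old, q, tau, zetas=np.linspace(0.0, 1.0, 11)):
    # Interpolate between the old policy and the Boltzmann policy,
    # keeping the largest step whose surrogate improvement lower bound
    # (expected advantage minus a KL penalty) stays non-negative.
    pi_boltz = softmax_policy(q, tau)
    best = pi_old
    for zeta in zetas:
        pi_new = (1.0 - zeta) * pi_old + zeta * pi_boltz
        adv = np.dot(pi_new - pi_old, q)                    # expected advantage
        kl = np.sum(pi_new * np.log((pi_new + 1e-8) / (pi_old + 1e-8)))
        if adv - tau * kl >= 0.0:                           # surrogate bound
            best = pi_new
    return best
```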

Uncertainty-aware Contact-safe Model-based Reinforcement Learning

no code implementations · 16 Oct 2020 · Cheng-Yu Kuo, Andreas Schaarschmidt, Yunduan Cui, Tamim Asfour, Takamitsu Matsubara

In typical MBRL, the data-driven model cannot be expected to yield accurate and reliable policies for the intended robotic tasks during the learning process, owing to sample scarcity.

Model-based Reinforcement Learning · reinforcement-learning · +1
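A minimal sketch of how model uncertainty can gate planning in MBRL, assuming an ensemble of learned dynamics models whose disagreement serves as the uncertainty signal; `task_cost`, `risk_weight`, and the scoring rule are hypothetical, not the paper's exact formulation.

```python
import numpy as np

def plan_with_uncertainty(models, state, candidate_actions,
                          task_cost, risk_weight=10.0):
    # Score candidate actions under an ensemble of learned dynamics
    # models; ensemble disagreement (predictive variance) acts as an
    # uncertainty penalty, so the planner avoids regions where the
    # model cannot be trusted (e.g., unexpected contact).
    scores = []
    for a in candidate_actions:
        preds = np.stack([m(state, a) for m in models])  # (E, state_dim)
        mean_next = preds.mean(axis=0)
        disagreement = preds.var(axis=0).sum()           # epistemic proxy
        scores.append(task_cost(mean_next) + risk_weight * disagreement)
    return candidate_actions[int(np.argmin(scores))]
```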

Deep reinforcement learning of event-triggered communication and control for multi-agent cooperative transport

no code implementations · 29 Mar 2021 · Kazuki Shibata, Tomohiko Jimbo, Takamitsu Matsubara

In this paper, we explore a multi-agent reinforcement learning approach to address the design problem of communication and control strategies for multi-agent cooperative transport.

Multi-agent Reinforcement Learning · reinforcement-learning · +1
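A minimal sketch of event-triggered communication with a fixed, hand-set threshold; in the paper the trigger and control policies are learned jointly with deep RL, which this sketch does not attempt.

```python
import numpy as np

class EventTriggeredComm:
    # Each agent broadcasts its observation only when it has drifted
    # beyond a threshold from the last transmitted value; otherwise
    # teammates reuse the stale message, saving communication.
    def __init__(self, obs_dim, threshold=0.1):
        self.last_sent = np.zeros(obs_dim)
        self.threshold = threshold

    def step(self, obs):
        if np.linalg.norm(obs - self.last_sent) > self.threshold:
            self.last_sent = obs.copy()
            return obs, True          # event: transmit fresh observation
        return self.last_sent, False  # hold: teammates use stale message
```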

Cautious Actor-Critic

no code implementations · 12 Jul 2021 · Lingwei Zhu, Toshinori Kitamura, Takamitsu Matsubara

The oscillating performance of off-policy learning and persistent errors in the actor-critic (AC) setting call for algorithms that learn conservatively and thus better suit stability-critical applications.

Continuous Control
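A minimal sketch of one way to "learn conservatively" in an actor-critic loop: blend the freshly optimized actor parameters with the old ones, shrinking the step when the critic looks unreliable. The mixing rule and all names are illustrative, not the CAC algorithm itself.

```python
import numpy as np

def cautious_actor_update(old_params, new_params, td_errors, beta=1.0):
    # The blend weight shrinks when recent TD errors are large, so the
    # policy moves conservatively while the critic is unreliable.
    # (Illustrative mixing rule, not the paper's exact schedule.)
    reliability = 1.0 / (1.0 + beta * float(np.mean(np.abs(td_errors))))
    return {k: (1.0 - reliability) * old_params[k] + reliability * new_params[k]
            for k in old_params}
```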

Cautious Policy Programming: Exploiting KL Regularization in Monotonic Policy Improvement for Reinforcement Learning

no code implementations · 13 Jul 2021 · Lingwei Zhu, Toshinori Kitamura, Takamitsu Matsubara

In this paper, we propose cautious policy programming (CPP), a novel value-based reinforcement learning (RL) algorithm that can ensure monotonic policy improvement during learning.

Atari Games · reinforcement-learning · +1
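A minimal tabular sketch of the ingredient CPP builds on: a KL-regularized value sweep followed by an interpolated policy update. Here `zeta` is left as a free parameter, whereas CPP derives it from a monotonic-improvement bound; the shapes and names are illustrative.

```python
import numpy as np

def cpp_sweep(q, pi_old, r, P, gamma, tau, zeta):
    # One KL-regularized Bellman sweep, then an interpolated policy
    # update pi_new proportional to pi_old^(1-zeta) * exp(zeta * Q/tau).
    # Shapes: q, r (S, A); P (S, A, S); pi_old (S, A).
    v = tau * np.log((pi_old * np.exp(q / tau)).sum(axis=1))  # soft state value
    q_new = r + gamma * (P @ v)
    logits = (1.0 - zeta) * np.log(pi_old + 1e-8) + zeta * q_new / tau
    logits -= logits.max(axis=1, keepdims=True)
    pi_new = np.exp(logits)
    pi_new /= pi_new.sum(axis=1, keepdims=True)
    return q_new, pi_new
```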

Geometric Value Iteration: Dynamic Error-Aware KL Regularization for Reinforcement Learning

no code implementations · 16 Jul 2021 · Toshinori Kitamura, Lingwei Zhu, Takamitsu Matsubara

The recent boom in the literature on entropy-regularized reinforcement learning (RL) reveals that Kullback-Leibler (KL) regularization benefits RL algorithms by canceling out errors under mild assumptions.

reinforcement-learning · Reinforcement Learning (RL)
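A minimal sketch of the error-canceling effect the abstract refers to: under KL regularization toward the previous policy, the greedy policy ends up Boltzmann in an average of past action values, so per-iteration errors tend to cancel. The uniform average below is the static case; GVI's contribution is making the weighting dynamic and error-aware, which this sketch omits.

```python
import numpy as np

def kl_averaged_policy(q_history, tau):
    # With KL regularization toward the previous policy, the iterate
    # at step k is Boltzmann in the average of all past Q estimates.
    q_bar = np.mean(np.stack(q_history), axis=0)   # (S, A)
    z = q_bar / tau
    z -= z.max(axis=-1, keepdims=True)
    pi = np.exp(z)
    return pi / pi.sum(axis=-1, keepdims=True)
```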

AdaTerm: Adaptive T-Distribution Estimated Robust Moments for Noise-Robust Stochastic Gradient Optimization

1 code implementation · 18 Jan 2022 · Wendyam Eric Lionel Ilboudo, Taisuke Kobayashi, Takamitsu Matsubara

In this paper, we propose AdaTerm, a novel approach that incorporates the Student's t-distribution to derive not only the first-order moment but also all the associated statistics.
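A simplified, Adam-style sketch of the robust-moment idea, assuming a Student's-t responsibility weight that down-weights outlier gradients; the constants and the exact update form here are illustrative, not AdaTerm's derivation.

```python
import numpy as np

class RobustMomentOptimizer:
    # Each gradient sample is down-weighted by a Student's-t factor
    # when it lies far from the running mean, so noisy/outlier
    # gradients move the first and second moments less.
    def __init__(self, lr=1e-3, beta=0.9, nu=5.0, eps=1e-8):
        self.lr, self.beta, self.nu, self.eps = lr, beta, nu, eps
        self.m = self.v = None

    def step(self, params, grad):
        if self.m is None:
            self.m, self.v = np.zeros_like(grad), np.ones_like(grad)
        # t-distribution responsibility: near 1 for inliers, small for outliers
        w = (self.nu + 1.0) / (self.nu + (grad - self.m) ** 2 / (self.v + self.eps))
        eff = 1.0 - (1.0 - self.beta) * w          # per-coordinate decay
        self.m = eff * self.m + (1.0 - eff) * grad
        self.v = eff * self.v + (1.0 - eff) * (grad - self.m) ** 2
        return params - self.lr * self.m / (np.sqrt(self.v) + self.eps)
```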

$q$-Munchausen Reinforcement Learning

no code implementations · 16 May 2022 · Lingwei Zhu, Zheng Chen, Eiji Uchibe, Takamitsu Matsubara

The recently successful Munchausen Reinforcement Learning (M-RL) features implicit Kullback-Leibler (KL) regularization by augmenting the reward function with the logarithm of the current stochastic policy.

reinforcement-learning · Reinforcement Learning (RL)
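A minimal sketch of a Munchausen-style bootstrapped target in which the log-policy bonus is replaced by the Tsallis q-logarithm, the substitution the title suggests; the target shape follows standard Munchausen RL, and the function names are illustrative.

```python
import numpy as np

def log_q(x, q):
    # Tsallis q-logarithm; recovers np.log(x) as q -> 1.
    return np.log(x) if q == 1.0 else (x ** (1.0 - q) - 1.0) / (1.0 - q)

def q_munchausen_target(r, pi_s, a, pi_next, q_next, alpha, tau, gamma, q=2.0):
    # Munchausen bonus on the taken action plus a soft bootstrapped
    # value, both using the q-logarithm instead of the natural log.
    # pi_s: policy at s, shape (A,); pi_next, q_next: at s', shape (A,).
    bonus = alpha * tau * log_q(pi_s[a] + 1e-8, q)            # augmented reward
    soft_next = (pi_next * (q_next - tau * log_q(pi_next + 1e-8, q))).sum()
    return r + bonus + gamma * soft_next
```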

Enforcing KL Regularization in General Tsallis Entropy Reinforcement Learning via Advantage Learning

no code implementations · 16 May 2022 · Lingwei Zhu, Zheng Chen, Eiji Uchibe, Takamitsu Matsubara

The maximum Tsallis entropy (MTE) framework in reinforcement learning has recently gained popularity by virtue of its flexible modeling choices, including the widely used Shannon entropy and sparse entropy.

reinforcement-learning · Reinforcement Learning (RL)
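A minimal sketch of an advantage-learning operator on top of a regularized max, assuming a pluggable `reg_max` (e.g., the sparsemax value for q = 2 Tsallis entropy); the operator form is standard advantage learning, while the claim that the added gap term enforces KL regularization is the paper's result, not demonstrated here.

```python
import numpy as np

def tsallis_advantage_learning(q, r, P, gamma, beta, reg_max):
    # Advantage-learning update over a Tsallis-regularized value:
    #   T Q(s, a) = r + gamma * W(s') + beta * (Q(s, a) - W(s)),
    # where W = reg_max(Q) is the regularized state value. The beta
    # term widens the action gap. Shapes: q, r (S, A); P (S, A, S).
    w = reg_max(q)                                # (S,)
    return r + gamma * (P @ w) + beta * (q - w[:, None])
```

With `reg_max=lambda q: q.max(axis=1)` this reduces to plain advantage learning.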

Randomized-to-Canonical Model Predictive Control for Real-world Visual Robotic Manipulation

no code implementations · 5 Jul 2022 · Tomoya Yamanokuchi, Yuhwan Kwon, Yoshihisa Tsurumine, Eiji Uchibe, Jun Morimoto, Takamitsu Matsubara

However, such works are limited to one-shot transfer: real-world data must be collected once to perform the sim-to-real transfer, which still demands significant human effort whenever models learned in simulation are transferred to new real-world domains.

Model Predictive Control

Physically Consistent Preferential Bayesian Optimization for Food Arrangement

no code implementations · 21 Sep 2022 · Yuhwan Kwon, Yoshihisa Tsurumine, Takeshi Shimmura, Sadao Kawamura, Takamitsu Matsubara

To cope with this problem, we propose Physically Consistent Preferential Bayesian Optimization (PCPBO), a method that obtains physically feasible and preferred arrangements satisfying domain rules.

Bayesian Optimization
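A heavily simplified sketch of the preferential-BO ingredient, assuming a linear Bradley-Terry utility fitted from pairwise comparisons and a hypothetical `feasible` predicate standing in for the physics/domain-rule check; PCPBO itself builds on Gaussian-process preference models, which a linear model only approximates.

```python
import numpy as np

def fit_bradley_terry(X_wins, X_losses, dim, lr=0.1, iters=500):
    # Fit a linear utility u(x) = w @ x from pairwise preferences
    # (winner, loser) under the logistic Bradley-Terry likelihood.
    w = np.zeros(dim)
    for _ in range(iters):
        d = X_wins - X_losses                    # (N, dim)
        p = 1.0 / (1.0 + np.exp(-(d @ w)))       # P(winner preferred)
        w += lr * d.T @ (1.0 - p) / len(d)       # gradient ascent step
    return w

def next_query(candidates, feasible, w):
    # Propose the highest-utility arrangement among those passing the
    # physical-feasibility filter, mirroring PCPBO's domain-rule check.
    ok = candidates[feasible(candidates)]
    return ok[int(np.argmax(ok @ w))]
```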

Disturbance Injection under Partial Automation: Robust Imitation Learning for Long-horizon Tasks

no code implementations · 22 Mar 2023 · Hirotaka Tahara, Hikaru Sasaki, Hanbit Oh, Edgar Anarossi, Takamitsu Matsubara

Under PA, operators perform both manual operations (providing actions) and operations that switch between automatic and manual modes (mode-switching).

Imitation Learning
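A minimal sketch of disturbance injection during demonstration collection, in the spirit of methods like DART, assuming a gym-style `env` and an `expert` callable; the dataset stores the expert's intended action while the disturbed action is executed, so the learner sees recovery behavior. The partial-automation mode-switching that is the paper's focus is omitted.

```python
import numpy as np

def collect_with_disturbance(env, expert, sigma, horizon):
    # Record (state, intended action) pairs while executing a
    # noise-perturbed action, exposing the learner to off-nominal
    # states it must recover from in long-horizon tasks.
    data, obs = [], env.reset()
    for _ in range(horizon):
        a = expert(obs)                              # intended action (label)
        a_exec = a + np.random.normal(0.0, sigma, size=np.shape(a))
        data.append((obs, a))
        obs, _, done, _ = env.step(a_exec)           # disturbed action executed
        if done:
            break
    return data
```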
