Search Results for author: Kenji Doya

Found 19 papers, 5 papers with code

A Generalized Natural Actor-Critic Algorithm

no code implementations NeurIPS 2009 Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto, Kenji Doya

In this paper, we describe a generalized Natural Gradient (gNG) by linearly interpolating the two FIMs and propose an efficient implementation for the gNG learning based on a theory of the estimating function, generalized Natural Actor-Critic (gNAC).

Reinforcement Learning (RL)
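
The interpolation mentioned in the abstract above can be written schematically as follows. This is only a sketch: the excerpt does not say which two Fisher information matrices are combined or how the interpolation is parameterized, so the symbols $F_1$, $F_2$, $\iota$, $G$, and $J$ are assumptions introduced here for illustration.

```latex
% Assumed notation: F_1, F_2 are the two FIMs, \iota is an interpolation weight,
% J(\theta) is the objective, and \tilde{\nabla} denotes the resulting natural gradient.
G(\iota) = (1 - \iota)\, F_1 + \iota\, F_2, \qquad \iota \in [0, 1],
\qquad
\tilde{\nabla}_{\theta} J(\theta) = G(\iota)^{-1}\, \nabla_{\theta} J(\theta)
```

Under this assumed form, $\iota = 0$ and $\iota = 1$ recover the natural gradients defined by $F_1$ and $F_2$ alone.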

Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning

no code implementations 10 Feb 2017 Stefan Elfwing, Eiji Uchibe, Kenji Doya

First, we propose two activation functions for neural network function approximation in reinforcement learning: the sigmoid-weighted linear unit (SiLU) and its derivative function (dSiLU).

Atari Games reinforcement-learning +1
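
The two activation functions named in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the usual definition of SiLU as the input multiplied by its sigmoid, with dSiLU as the derivative of that expression, and the function names are chosen here.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def silu(x: float) -> float:
    """Sigmoid-weighted linear unit: x * sigmoid(x) (assumed definition)."""
    return x * sigmoid(x)


def dsilu(x: float) -> float:
    """Derivative of silu: sigmoid(x) * (1 + x * (1 - sigmoid(x)))."""
    s = sigmoid(x)
    return s * (1.0 + x * (1.0 - s))


if __name__ == "__main__":
    for value in (-2.0, 0.0, 2.0):
        print(value, silu(value), dsilu(value))
```

A central finite difference of `silu` is a quick way to confirm that `dsilu` matches its derivative.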

Online Meta-learning by Parallel Algorithm Competition

no code implementations 24 Feb 2017 Stefan Elfwing, Eiji Uchibe, Kenji Doya

In the OMPAC method, several instances of a reinforcement learning algorithm are run in parallel with small differences in the initial values of the meta-parameters.

Atari Games Meta-Learning +3

Unifying Value Iteration, Advantage Learning, and Dynamic Policy Programming

no code implementations 30 Oct 2017 Tadashi Kozuno, Eiji Uchibe, Kenji Doya

Approximate dynamic programming algorithms, such as approximate value iteration, have been successfully applied to many complex reinforcement learning tasks, and a better approximate dynamic programming algorithm is expected to further extend the applicability of reinforcement learning to various tasks.

reinforcement-learning Reinforcement Learning (RL)

Unbounded Output Networks for Classification

no code implementations 25 Jul 2018 Stefan Elfwing, Eiji Uchibe, Kenji Doya

In this study, by adopting features of the EE-RBM approach to feed-forward neural networks, we propose the UnBounded output network (UBnet) which is characterized by three features: (1) unbounded output units; (2) the target value of correct classification is set to a value much greater than one; and (3) the models are trained by a modified mean-squared error objective.

Classification General Classification

Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent neural networks

1 code implementation 29 Jan 2019 Dongqi Han, Kenji Doya, Jun Tani

Furthermore, we show that the self-developed compositionality of the network enhances faster re-learning when adapting to a new task that is a re-composition of previously learned sub-goals, than when starting from scratch.

Continuous Control Meta-Learning +1

PIPPS: Flexible Model-Based Policy Search Robust to the Curse of Chaos

3 code implementations ICML 2018 Paavo Parmas, Carl Edward Rasmussen, Jan Peters, Kenji Doya

Previously, the exploding gradient problem has been explained to be central in deep learning and model-based reinforcement learning, because it causes numerical issues and instability in optimization.

Model-based Reinforcement Learning reinforcement-learning +1

Gap-Increasing Policy Evaluation for Efficient and Noise-Tolerant Reinforcement Learning

no code implementations 18 Jun 2019 Tadashi Kozuno, Dongqi Han, Kenji Doya

We provide detailed theoretical analysis of the new algorithm that shows its efficiency and noise-tolerance inherited from Retrace and advantage learning.

reinforcement-learning Reinforcement Learning (RL)

Variational Recurrent Models for Solving Partially Observable Control Tasks

1 code implementation ICLR 2020 Dongqi Han, Kenji Doya, Jun Tani

In partially observable (PO) environments, deep reinforcement learning (RL) agents often suffer from unsatisfactory performance, since two problems need to be tackled together: how to extract information from the raw observations to solve the task, and how to improve the policy.

Memorization Reinforcement Learning (RL)

A Whole Brain Probabilistic Generative Model: Toward Realizing Cognitive Architectures for Developmental Robots

no code implementations 15 Mar 2021 Tadahiro Taniguchi, Hiroshi Yamakawa, Takayuki Nagai, Kenji Doya, Masamichi Sakagami, Masahiro Suzuki, Tomoaki Nakamura, Akira Taniguchi

This approach is based on two ideas: (1) brain-inspired AI, learning human brain architecture to build human-level intelligence, and (2) a probabilistic generative model (PGM)-based cognitive system to develop a cognitive system for developmental robots by integrating PGMs.

Canonical Cortical Circuits and the Duality of Bayesian Inference and Optimal Control

no code implementations 5 Jun 2021 Kenji Doya

Here we consider the hypothesis that the sensory and motor cortical circuits implement the dual computations for Bayesian inference and optimal control, or perceptual and value-based decision making, respectively.

Bayesian Inference Decision Making

Goal-Directed Planning by Reinforcement Learning and Active Inference

no code implementations 18 Jun 2021 Dongqi Han, Kenji Doya, Jun Tani

Habitual behavior, which is obtained from the prior distribution of ${z}$, is acquired by reinforcement learning.

Bayesian Inference Decision Making +2

Tabular Data Imputation: Choose KNN over Deep Learning

no code implementations 29 Sep 2021 Florian Lalande, Kenji Doya

As databases are ubiquitous nowadays, missing values constitute a pervasive problem for data analysis.

Common Sense Reasoning Imputation

Habits and goals in synergy: a variational Bayesian framework for behavior

1 code implementation 11 Apr 2023 Dongqi Han, Kenji Doya, Dongsheng Li, Jun Tani

The habitual behavior is generated by using prior distribution of intention, which is goal-less; and the goal-directed behavior is generated by the posterior distribution of intention, which is conditioned on the goal.

Numerical Data Imputation for Multimodal Data Sets: A Probabilistic Nearest-Neighbor Kernel Density Approach

1 code implementation 29 Jun 2023 Florian Lalande, Kenji Doya

We compare our method with previous data imputation methods using artificial and real-world data with different data missing scenarios and various data missing rates, and show that our method can cope with complex original data structure, yields lower data imputation errors, and provides probabilistic estimates with higher likelihood than current methods.

Density Estimation Imputation
