Search Results for author: Kenji Doya

Found 19 papers, 5 papers with code

A Generalized Natural Actor-Critic Algorithm

no code implementations • NeurIPS 2009 • Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto, Kenji Doya

In this paper, we describe a generalized Natural Gradient (gNG) by linearly interpolating the two FIMs and propose an efficient implementation for the gNG learning based on a theory of the estimating function, generalized Natural Actor-Critic (gNAC).

Reinforcement Learning (RL)

Multiple co-clustering based on nonparametric mixture models with heterogeneous marginal distributions

no code implementations • 21 Oct 2015 • Tomoki Tokuda, Junichiro Yoshimoto, Yu Shimizu, Shigeru Toki, Go Okada, Masahiro Takamura, Tetsuya Yamamoto, Shinpei Yoshimura, Yasumasa Okamoto, Shigeto Yamawaki, Kenji Doya

We propose a novel method for multiple clustering that assumes a co-clustering structure (partitions in both rows and columns of the data matrix) in each view.

Clustering Variational Inference

Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning

no code implementations • 10 Feb 2017 • Stefan Elfwing, Eiji Uchibe, Kenji Doya

First, we propose two activation functions for neural network function approximation in reinforcement learning: the sigmoid-weighted linear unit (SiLU) and its derivative function (dSiLU).
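As a minimal sketch of the two activation functions named above, assuming the standard definitions SiLU(x) = x·σ(x) and dSiLU(x) = σ(x)(1 + x(1 − σ(x))), where σ is the logistic sigmoid: the function names `sigmoid`, `silu`, and `dsilu` are illustrative choices, not taken from any released code.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def silu(x: float) -> float:
    """Sigmoid-weighted linear unit: the input scaled by its own sigmoid."""
    return x * sigmoid(x)


def dsilu(x: float) -> float:
    """Derivative of silu with respect to x, usable as an activation itself."""
    s = sigmoid(x)
    return s * (1.0 + x * (1.0 - s))


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(x, silu(x), dsilu(x))
```

For large positive inputs `silu` approaches the identity and for large negative inputs it approaches zero, while `dsilu` behaves like a sigmoid-shaped curve that slightly overshoots 1 and undershoots 0.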

Atari Games reinforcement-learning +1

Online Meta-learning by Parallel Algorithm Competition

no code implementations • 24 Feb 2017 • Stefan Elfwing, Eiji Uchibe, Kenji Doya

In the OMPAC method, several instances of a reinforcement learning algorithm are run in parallel with small differences in the initial values of the meta-parameters.

Atari Games Meta-Learning +3

Unifying Value Iteration, Advantage Learning, and Dynamic Policy Programming

no code implementations • 30 Oct 2017 • Tadashi Kozuno, Eiji Uchibe, Kenji Doya

Approximate dynamic programming algorithms, such as approximate value iteration, have been successfully applied to many complex reinforcement learning tasks, and a better approximate dynamic programming algorithm is expected to further extend the applicability of reinforcement learning to various tasks.

reinforcement-learning Reinforcement Learning (RL)

Unbounded Output Networks for Classification

no code implementations • 25 Jul 2018 • Stefan Elfwing, Eiji Uchibe, Kenji Doya

In this study, by adopting features of the EE-RBM approach to feed-forward neural networks, we propose the UnBounded output network (UBnet) which is characterized by three features: (1) unbounded output units; (2) the target value of correct classification is set to a value much greater than one; and (3) the models are trained by a modified mean-squared error objective.

Classification General Classification

Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent neural networks

1 code implementation • 29 Jan 2019 • Dongqi Han, Kenji Doya, Jun Tani

Furthermore, we show that the self-developed compositionality of the network enables faster re-learning when adapting to a new task that is a re-composition of previously learned sub-goals than when starting from scratch.

Continuous Control Meta-Learning +1

PIPPS: Flexible Model-Based Policy Search Robust to the Curse of Chaos

3 code implementations • ICML 2018 • Paavo Parmas, Carl Edward Rasmussen, Jan Peters, Kenji Doya

Previously, the exploding gradient problem has been explained to be central in deep learning and model-based reinforcement learning, because it causes numerical issues and instability in optimization.

Model-based Reinforcement Learning reinforcement-learning +1

Gap-Increasing Policy Evaluation for Efficient and Noise-Tolerant Reinforcement Learning

no code implementations • 18 Jun 2019 • Tadashi Kozuno, Dongqi Han, Kenji Doya

We provide detailed theoretical analysis of the new algorithm that shows its efficiency and noise-tolerance inherited from Retrace and advantage learning.

reinforcement-learning Reinforcement Learning (RL)

MarmoNet: a pipeline for automated projection mapping of the common marmoset brain from whole-brain serial two-photon tomography

no code implementations • 2 Aug 2019 • Henrik Skibbe, Akiya Watakabe, Ken Nakae, Carlos Enrique Gutierrez, Hiromichi Tsukada, Junichi Hata, Takashi Kawase, Rui Gong, Alexander Woodward, Kenji Doya, Hideyuki Okano, Tetsuo Yamamori, Shin Ishii

Understanding the connectivity in the brain is an important prerequisite for understanding how the brain processes information.

Image Registration

Variational Recurrent Models for Solving Partially Observable Control Tasks

1 code implementation • ICLR 2020 • Dongqi Han, Kenji Doya, Jun Tani

In partially observable (PO) environments, deep reinforcement learning (RL) agents often suffer from unsatisfactory performance, since two problems need to be tackled together: how to extract information from the raw observations to solve the task, and how to improve the policy.

Memorization Reinforcement Learning (RL)

Forward and inverse reinforcement learning sharing network weights and hyperparameters

no code implementations • 17 Aug 2020 • Eiji Uchibe, Kenji Doya

A forward RL step minimizes the reverse KL estimated by the inverse RL step.
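For reference, a reverse KL divergence in this setting is conventionally written with the learner's distribution in the first argument. The following is a sketch under assumed notation, with $\pi$ for the learner's policy and $\pi^{E}$ for the expert's policy; the symbols, and the choice of policies rather than some other pair of distributions, are assumptions and may differ from the paper's formulation.

```latex
D_{\mathrm{KL}}\!\left(\pi \,\|\, \pi^{E}\right)
  = \mathbb{E}_{(s,a)\sim\pi}\!\left[\log \frac{\pi(a \mid s)}{\pi^{E}(a \mid s)}\right]
```

Because the expectation is taken under $\pi$, the learner is penalized for placing probability on actions that the expert rarely takes.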

Imitation Learning reinforcement-learning +1

A Whole Brain Probabilistic Generative Model: Toward Realizing Cognitive Architectures for Developmental Robots

no code implementations • 15 Mar 2021 • Tadahiro Taniguchi, Hiroshi Yamakawa, Takayuki Nagai, Kenji Doya, Masamichi Sakagami, Masahiro Suzuki, Tomoaki Nakamura, Akira Taniguchi

This approach is based on two ideas: (1) brain-inspired AI, learning human brain architecture to build human-level intelligence, and (2) a probabilistic generative model (PGM)-based cognitive system, built by integrating PGMs, for developmental robots.

Canonical Cortical Circuits and the Duality of Bayesian Inference and Optimal Control

no code implementations • 5 Jun 2021 • Kenji Doya

Here we consider the hypothesis that the sensory and motor cortical circuits implement the dual computations for Bayesian inference and optimal control, or perceptual and value-based decision making, respectively.

Bayesian Inference Decision Making

Goal-Directed Planning by Reinforcement Learning and Active Inference

no code implementations • 18 Jun 2021 • Dongqi Han, Kenji Doya, Jun Tani

Habitual behavior, which is obtained from the prior distribution of $z$, is acquired by reinforcement learning.

Bayesian Inference Decision Making +2

Variational oracle guiding for reinforcement learning

no code implementations • ICLR 2022 • Dongqi Han, Tadashi Kozuno, Xufang Luo, Zhao-Yun Chen, Kenji Doya, Yuqing Yang, Dongsheng Li

How to make intelligent decisions is a central problem in machine learning and cognitive science.

Decision Making Offline RL +2

Tabular Data Imputation: Choose KNN over Deep Learning

no code implementations • 29 Sep 2021 • Florian Lalande, Kenji Doya

As databases are ubiquitous nowadays, missing values constitute a pervasive problem for data analysis.

Common Sense Reasoning Imputation

Habits and goals in synergy: a variational Bayesian framework for behavior

1 code implementation • 11 Apr 2023 • Dongqi Han, Kenji Doya, Dongsheng Li, Jun Tani

The habitual behavior is generated using the prior distribution of intention, which is goal-less, and the goal-directed behavior is generated using the posterior distribution of intention, which is conditioned on the goal.

Numerical Data Imputation for Multimodal Data Sets: A Probabilistic Nearest-Neighbor Kernel Density Approach

1 code implementation • 29 Jun 2023 • Florian Lalande, Kenji Doya

We compare our method with previous data imputation methods using artificial and real-world data with different data missing scenarios and various data missing rates, and show that our method can cope with complex original data structure, yields lower data imputation errors, and provides probabilistic estimates with higher likelihood than current methods.

Density Estimation Imputation
