Search Results for author: Voot Tangkaratt

Found 18 papers, 7 papers with code

Variational Imitation Learning with Diverse-quality Demonstrations

1 code implementation • ICML 2020 • Voot Tangkaratt, Bo Han, Mohammad Emtiyaz Khan, Masashi Sugiyama

Learning from demonstrations can be challenging when the quality of demonstrations is diverse, and even more so when the quality is unknown and there is no additional information to estimate the quality.

Continuous Control · Imitation Learning +2

Discovering Diverse Solutions in Deep Reinforcement Learning by Maximizing State-Action-Based Mutual Information

1 code implementation • 12 Mar 2021 • Takayuki Osa, Voot Tangkaratt, Masashi Sugiyama

In our method, a policy conditioned on a continuous or discrete latent variable is trained by directly maximizing the variational lower bound of the mutual information, instead of using the mutual information as unsupervised rewards as in previous studies.

Continuous Control
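The snippet above describes training a latent-conditioned policy by directly maximizing a variational lower bound on the mutual information between the latent variable and state-action pairs. A minimal sketch of that bound, I(Z; S,A) ≥ E[log q(z | s,a)] + H(Z), follows; the linear "discriminator" and the toy data are illustrative stand-ins for learned networks, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: discrete latent z in {0, 1}; "state-action" features x whose
# distribution depends on z. The variational lower bound is
#   I(Z; S,A) >= E_{z,x}[log q(z | x)] + H(Z),
# where q is a learned discriminator; here q is a fixed toy classifier.

n_latents = 2
prior = np.full(n_latents, 1.0 / n_latents)      # uniform p(z)
entropy_z = -np.sum(prior * np.log(prior))       # H(Z) = log 2

def discriminator(x):
    """Toy q(z|x): softmax over a linear score (stand-in for a network)."""
    logits = np.stack([x.sum(axis=1), -x.sum(axis=1)], axis=1)
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Sample latents, then features whose sign depends on the latent.
z = rng.integers(0, n_latents, size=1000)
x = rng.normal(loc=np.where(z == 0, 1.0, -1.0)[:, None], scale=0.5,
               size=(1000, 3))

q = discriminator(x)
lower_bound = np.mean(np.log(q[np.arange(len(z)), z] + 1e-12)) + entropy_z
print(round(lower_bound, 3))  # a lower bound on I(Z; S,A), at most H(Z)
```

Because the discriminator here separates the two latent modes well, the bound comes out close to (but never above) H(Z) = log 2.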

Robust Imitation Learning from Noisy Demonstrations

1 code implementation • 20 Oct 2020 • Voot Tangkaratt, Nontawat Charoenphakdee, Masashi Sugiyama

Robust learning from noisy demonstrations is a practical but highly challenging problem in imitation learning.

Classification · Continuous Control +2

Meta-Model-Based Meta-Policy Optimization

no code implementations • 4 Jun 2020 • Takuya Hiraoka, Takahisa Imagawa, Voot Tangkaratt, Takayuki Osa, Takashi Onishi, Yoshimasa Tsuruoka

Model-based meta-reinforcement learning (RL) methods have recently been shown to be a promising approach to improving the sample efficiency of RL in multi-task settings.

Continuous Control · Meta-Learning +3

VILD: Variational Imitation Learning with Diverse-quality Demonstrations

no code implementations • 15 Sep 2019 • Voot Tangkaratt, Bo Han, Mohammad Emtiyaz Khan, Masashi Sugiyama

However, the quality of demonstrations in reality can be diverse, since it is easier and cheaper to collect demonstrations from a mix of experts and amateurs.

Continuous Control · Imitation Learning

TD-Regularized Actor-Critic Methods

1 code implementation • 19 Dec 2018 • Simone Parisi, Voot Tangkaratt, Jan Peters, Mohammad Emtiyaz Khan

Actor-critic methods can achieve incredible performance on difficult reinforcement learning problems, but they are also prone to instability.

reinforcement-learning · Reinforcement Learning (RL)
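The TD-regularization idea addresses the instability mentioned above by penalizing the actor's objective with the critic's squared TD error, discouraging policy updates in regions where the critic is inaccurate. Below is a minimal sketch of that penalty on a toy linear critic; the features, numbers, and the penalty weight `eta` are illustrative, not the paper's full algorithm.

```python
import numpy as np

# Sketch: the actor's loss is penalized by the squared TD error,
#   L(theta) = -Q(s, pi_theta(s)) + eta * delta^2,
# so large critic errors (large |delta|) inflate the loss and damp the update.

def q_value(s, a, w):
    return w[0] * s * a + w[1] * a ** 2        # toy linear-in-features critic

def td_error(s, a, r, s_next, a_next, w, gamma=0.99):
    return r + gamma * q_value(s_next, a_next, w) - q_value(s, a, w)

def td_regularized_actor_loss(s, a, r, s_next, a_next, w, eta=0.1):
    return -q_value(s, a, w) + eta * td_error(s, a, r, s_next, a_next, w) ** 2

w = np.array([1.0, -0.5])                      # toy critic weights
loss_plain = -q_value(1.0, 0.8, w)             # un-regularized actor loss
loss_reg = td_regularized_actor_loss(1.0, 0.8, 0.5, 1.1, 0.7, w)
print(loss_plain, loss_reg)                    # the penalty adds a non-negative term
```

Since the penalty term is a square, the regularized loss can never be smaller than the plain one; it only bites when the critic's TD error is large.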

Active Deep Q-learning with Demonstration

no code implementations • 6 Dec 2018 • Si-An Chen, Voot Tangkaratt, Hsuan-Tien Lin, Masashi Sugiyama

In this work, we propose Active Reinforcement Learning with Demonstration (ARLD), a new framework to streamline RL in terms of demonstration efforts by allowing the RL agent to query for demonstration actively during training.

Q-Learning · reinforcement-learning +1
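The "query for demonstration actively" idea above can be sketched as an uncertainty test: with an ensemble of Q-heads, the agent asks the demonstrator only when the heads disagree about the greedy action. The function name, the disagreement measure, and the threshold below are hypothetical illustrations, not the paper's specific criterion.

```python
import numpy as np

def should_query(q_heads, threshold=0.5):
    """Decide whether to query the demonstrator for the current state.

    q_heads: (n_heads, n_actions) array of Q estimates from an ensemble.
    Queries when the fraction of heads disagreeing with the majority
    greedy action exceeds the threshold.
    """
    greedy = q_heads.argmax(axis=1)
    majority = np.bincount(greedy, minlength=q_heads.shape[1]).max()
    disagreement = 1.0 - majority / len(greedy)
    return disagreement > threshold

certain = np.tile([[1.0, 0.0, 0.0]], (5, 1))   # all heads agree: no query
uncertain = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
                      [1, 0, 0], [0, 1, 0]], dtype=float)  # heads split
print(should_query(certain), should_query(uncertain))  # False True
```

In a training loop, the agent would act on its own policy when `should_query` is false and fall back to the demonstrator's action when it is true, concentrating demonstration effort on uncertain states.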

Improving Generative Adversarial Imitation Learning with Non-expert Demonstrations

no code implementations • 27 Sep 2018 • Voot Tangkaratt, Masashi Sugiyama

Imitation learning aims to learn an optimal policy from expert demonstrations and its recent combination with deep learning has shown impressive performance.

Continuous Control · Imitation Learning

Vprop: Variational Inference using RMSprop

no code implementations • 4 Dec 2017 • Mohammad Emtiyaz Khan, Zuozhu Liu, Voot Tangkaratt, Yarin Gal

Overall, this paper presents Vprop as a principled, computationally efficient, and easy-to-implement method for Bayesian deep learning.

Variational Inference
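A minimal sketch of the Vprop idea on a toy quadratic loss: an RMSprop-style update where the gradient is evaluated at weights perturbed by Gaussian noise tied to the same second-moment accumulator, so the iterates track a Gaussian posterior approximation. The constants, the toy loss, and the exact scaling below are illustrative assumptions, not the paper's derivation.

```python
import numpy as np

rng = np.random.default_rng(2)

def grad(w):
    return 2.0 * (w - 3.0)        # gradient of the toy loss (w - 3)^2

mu, s = 0.0, 1.0                  # posterior mean and second-moment accumulator
alpha, beta, lam, n = 0.1, 0.9, 1.0, 100.0

for _ in range(300):
    # Sample a perturbed weight; the noise scale shrinks as s grows.
    w = mu + rng.normal() / np.sqrt(n * (s + lam))
    g = grad(w)
    s = beta * s + (1.0 - beta) * g ** 2               # RMSprop-style moment
    mu = mu - alpha * (g + lam * mu / n) / (s + lam)   # note: no square root
print(round(mu, 2))               # mean settles near the toy optimum at 3
```

Two differences from plain RMSprop stand out in this sketch: the gradient is taken at a noisy weight sample rather than at the mean, and the denominator uses the raw accumulator rather than its square root.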

Variational Adaptive-Newton Method for Explorative Learning

no code implementations • 15 Nov 2017 • Mohammad Emtiyaz Khan, Wu Lin, Voot Tangkaratt, Zuozhu Liu, Didrik Nielsen

We present the Variational Adaptive Newton (VAN) method which is a black-box optimization method especially suitable for explorative-learning tasks such as active learning and reinforcement learning.

Active Learning · reinforcement-learning +2

Guide Actor-Critic for Continuous Control

1 code implementation • ICLR 2018 • Voot Tangkaratt, Abbas Abdolmaleki, Masashi Sugiyama

First, we show that GAC updates the guide actor by performing second-order optimization in the action space where the curvature matrix is based on the Hessians of the critic.

Continuous Control · reinforcement-learning +1
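The second-order update described above can be sketched as a Newton step in action space, using the gradient and Hessian of the critic Q(s, a) with respect to the action. The quadratic critic below is a toy stand-in for a learned network, chosen so the Hessian is constant and the step lands exactly on the maximizer.

```python
import numpy as np

# Toy concave critic: Q(a) = 0.5 a^T A a + b^T a, with A negative definite.
A = np.array([[-2.0, 0.5],
              [0.5, -1.0]])
b = np.array([1.0, -0.5])

def q_grad(a):
    return A @ a + b              # dQ/da

def q_hess(a):
    return A                      # d^2Q/da^2 (constant for a quadratic)

a = np.zeros(2)                   # current action
a_new = a - np.linalg.solve(q_hess(a), q_grad(a))   # Newton step on Q
print(np.allclose(q_grad(a_new), 0.0))  # True: one step hits the maximizer
```

For a genuinely quadratic Q a single Newton step solves the action optimization; with a neural critic the same step would only be a local second-order improvement, which is the regime the guide actor operates in.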

Policy Search with High-Dimensional Context Variables

no code implementations • 10 Nov 2016 • Voot Tangkaratt, Herke van Hoof, Simone Parisi, Gerhard Neumann, Jan Peters, Masashi Sugiyama

A naive application of unsupervised dimensionality reduction methods to the context variables, such as principal component analysis, is insufficient as task-relevant input may be ignored.

Dimensionality Reduction

Direct Estimation of the Derivative of Quadratic Mutual Information with Application in Supervised Dimension Reduction

no code implementations • 5 Aug 2015 • Voot Tangkaratt, Hiroaki Sasaki, Masashi Sugiyama

On the other hand, quadratic MI (QMI) is a variant of MI based on the $L_2$ distance which is more robust against outliers than the KL divergence, and a computationally efficient method to estimate QMI from data, called least-squares QMI (LSQMI), has been proposed recently.

Dimensionality Reduction
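For the discrete case, the L2-based quadratic mutual information mentioned above has a direct closed form, QMI(X, Y) = Σ_{x,y} (p(x,y) − p(x)p(y))², which is zero exactly when X and Y are independent. The sketch below computes it from given joint probabilities; LSQMI in the paper instead estimates it from samples, which this toy does not attempt.

```python
import numpy as np

def qmi(p_xy):
    """Quadratic mutual information for a discrete joint distribution.

    p_xy: 2-D array of joint probabilities p(x, y), summing to 1.
    Returns sum over (x, y) of (p(x,y) - p(x) p(y))^2.
    """
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal p(x)
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal p(y)
    return float(np.sum((p_xy - p_x * p_y) ** 2))

independent = np.outer([0.5, 0.5], [0.5, 0.5])   # X and Y independent
dependent = np.array([[0.5, 0.0],
                      [0.0, 0.5]])               # X = Y deterministically
print(qmi(independent), qmi(dependent))          # 0.0 for independence
```

Unlike KL-based mutual information, each term here is a bounded squared difference, which is the source of the robustness to outliers noted in the abstract.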

Conditional Density Estimation with Dimensionality Reduction via Squared-Loss Conditional Entropy Minimization

no code implementations • 28 Apr 2014 • Voot Tangkaratt, Ning Xie, Masashi Sugiyama

In such a case, estimating the conditional density itself is preferable, but conditional density estimation (CDE) is challenging in high-dimensional space.

Density Estimation · Dimensionality Reduction +1
