Search Results for author: Dongho Kim

Found 9 papers, 2 papers with code

Batch Reinforcement Learning with Hyperparameter Gradients

no code implementations • ICML 2020 • Byung-Jun Lee, Jongmin Lee, Peter Vrancx, Dongho Kim, Kee-Eung Kim

We consider the batch reinforcement learning problem where the agent needs to learn only from a fixed batch of data, without further interaction with the environment.

Continuous Control +3
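
Since the abstract only names the setting, here is a minimal illustration of batch (offline) RL: the learner sees a fixed set of transitions and never queries the environment. This is plain fitted Q-iteration on a toy tabular MDP with made-up data, not the paper's hyperparameter-gradient method.

```python
# Batch RL sketch: learn only from a fixed dataset of (s, a, r, s')
# transitions, with no further environment interaction.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, gamma = 5, 2, 0.9

# Hypothetical fixed batch, as if collected by some behaviour policy.
batch = [(rng.integers(n_states), rng.integers(n_actions),
          rng.random(), rng.integers(n_states)) for _ in range(500)]

Q = np.zeros((n_states, n_actions))
for _ in range(100):                      # fitted Q-iteration sweeps
    targets = np.zeros_like(Q)
    counts = np.zeros_like(Q)
    for s, a, r, s2 in batch:             # regress on Bellman targets
        counts[s, a] += 1
        targets[s, a] += r + gamma * Q[s2].max()
    Q = np.where(counts > 0, targets / np.maximum(counts, 1), Q)

print("greedy policy per state:", Q.argmax(axis=1))
```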

Bellman: A Toolbox for Model-Based Reinforcement Learning in TensorFlow

2 code implementations • 26 Mar 2021 • John McLeod, Hrvoje Stojic, Vincent Adam, Dongho Kim, Jordi Grau-Moya, Peter Vrancx, Felix Leibfried

This paves the way for new research directions, e.g. investigating uncertainty-aware environment models that are not necessarily neural-network-based, or developing algorithms to solve industrially-motivated benchmarks that share characteristics with real-world problems.

Model-based Reinforcement Learning • Reinforcement Learning +2
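
As a rough illustration of the model-based RL workflow such a toolbox supports (and of the abstract's point that environment models need not be neural networks), the sketch below fits a linear dynamics model from experience and plans against it. The environment and all names are hypothetical, not Bellman's actual API.

```python
# Model-based RL loop sketch: collect experience, fit a dynamics model,
# then choose actions by planning against the learned model.
import numpy as np

rng = np.random.default_rng(1)

def true_env_step(s, a):
    """Hypothetical 1-D environment, unknown to the agent."""
    s2 = 0.8 * s + a + 0.1 * rng.standard_normal()
    return s2, -s2 ** 2                      # reward: stay near 0

# 1) Collect experience with random actions.
data, s = [], 1.0
for _ in range(200):
    a = rng.uniform(-1, 1)
    s2, _ = true_env_step(s, a)
    data.append((s, a, s2))
    s = s2

# 2) Fit a simple (non-neural) linear model s' ≈ w0*s + w1*a + w2.
X = np.array([[s, a, 1.0] for s, a, _ in data])
y = np.array([s2 for *_, s2 in data])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

def model_step(s, a):
    return coef[0] * s + coef[1] * a + coef[2]

# 3) Plan against the model: one-step lookahead toward s' = 0.
def plan(s, candidates=np.linspace(-1, 1, 21)):
    return candidates[np.argmin([model_step(s, a) ** 2 for a in candidates])]

print("action chosen at s=1.0:", plan(1.0))
```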

Policy Optimization Through Approximate Importance Sampling

1 code implementation • 9 Oct 2019 • Marcin B. Tomczak, Dongho Kim, Peter Vrancx, Kee-Eung Kim

These proxy objectives allow stable and low variance policy learning, but require small policy updates to ensure that the proxy objective remains an accurate approximation of the target policy value.

Continuous Control
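
To make the "proxy objective" idea concrete, here is a generic importance-sampled surrogate of the kind the abstract alludes to, with a clipped variant showing why updates must stay small. All quantities are simulated; this is not the paper's specific approximation scheme.

```python
# Importance-sampled proxy objective: estimate the new policy's value
# from data gathered under the old policy via likelihood ratios.
import numpy as np

rng = np.random.default_rng(2)
n = 1000

# Hypothetical log-probs of sampled actions under old and new policies,
# plus advantage estimates from the old policy's rollouts.
logp_old = rng.normal(-1.0, 0.3, n)
logp_new = logp_old + rng.normal(0.0, 0.05, n)   # a small policy update
adv = rng.normal(0.0, 1.0, n)

ratio = np.exp(logp_new - logp_old)              # importance weights
surrogate = np.mean(ratio * adv)                 # proxy for value change

# The proxy is only accurate while ratios stay near 1, which is why such
# methods restrict update size (e.g. PPO-style clipping):
clipped = np.mean(np.minimum(ratio * adv,
                             np.clip(ratio, 0.8, 1.2) * adv))
print(f"surrogate={surrogate:.4f}  clipped={clipped:.4f}")
```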

Learning from Real Users: Rating Dialogue Success with Neural Networks for Reinforcement Learning in Spoken Dialogue Systems

no code implementations • 13 Aug 2015 • Pei-Hao Su, David Vandyke, Milica Gasic, Dongho Kim, Nikola Mrksic, Tsung-Hsien Wen, Steve Young

The models are trained on dialogues generated by a simulated user, and the best model is then used to train a policy on-line, which is shown to perform at least as well as a baseline system using prior knowledge of the user's task.

Reinforcement Learning • Spoken Dialogue Systems
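
A minimal sketch of the overall idea, using logistic regression in place of the paper's neural rating models: learn to predict dialogue success from features, then use the predicted success probability as the reinforcement signal for on-line policy training. Features, labels, and dimensions below are invented for illustration.

```python
# Rate dialogue success with a learned model, then use its prediction
# as the reward signal instead of explicit user feedback.
import numpy as np

rng = np.random.default_rng(3)
n, d = 400, 8                                   # dialogues, feature dim

X = rng.normal(size=(n, d))                     # e.g. turn counts, ASR confidences
y = (X @ rng.normal(size=d) > 0).astype(float)  # simulated success labels

w = np.zeros(d)
for _ in range(500):                            # logistic regression via GD
    p = 1 / (1 + np.exp(-X @ w))
    w -= 0.1 * X.T @ (p - y) / n

def reward(features):
    """Predicted success probability, used as the RL reward."""
    return 1 / (1 + np.exp(-features @ w))

print("reward for a new dialogue:", reward(rng.normal(size=d)))
```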

Stochastic Language Generation in Dialogue using Recurrent Neural Networks with Convolutional Sentence Reranking

no code implementations • WS 2015 • Tsung-Hsien Wen, Milica Gasic, Dongho Kim, Nikola Mrksic, Pei-Hao Su, David Vandyke, Steve Young

The natural language generation (NLG) component of a spoken dialogue system (SDS) usually needs a substantial amount of handcrafting or a well-labeled dataset to be trained on.

Sentence • Text Generation

Cost-Sensitive Exploration in Bayesian Reinforcement Learning

no code implementations • NeurIPS 2012 • Dongho Kim, Kee-Eung Kim, Pascal Poupart

In this paper, we consider Bayesian reinforcement learning (BRL) where actions incur costs in addition to rewards, and thus exploration has to be constrained in terms of the expected total cost while learning to maximize the expected long-term total reward.

Reinforcement Learning +1
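
The setting the abstract describes is a constrained MDP problem. Assuming the usual discounted formulation with per-step reward r_t, per-step cost c_t, and a cost budget (notation mine, not the paper's), it can be written as:

```latex
\max_{\pi}\; \mathbb{E}_{\pi}\!\Big[\sum_{t=0}^{\infty} \gamma^{t} r_t\Big]
\quad \text{subject to} \quad
\mathbb{E}_{\pi}\!\Big[\sum_{t=0}^{\infty} \gamma^{t} c_t\Big] \le \bar{c}
```

In the Bayesian RL setting, the expectations additionally average over the posterior uncertainty about the unknown model, so exploratory actions count against the same cost budget.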
