Search Results for author: Kee-Eung Kim

Found 36 papers, 10 papers with code

Batch Reinforcement Learning with Hyperparameter Gradients

no code implementations • ICML 2020 • Byung-Jun Lee, Jongmin Lee, Peter Vrancx, Dongho Kim, Kee-Eung Kim

We consider the batch reinforcement learning problem where the agent needs to learn only from a fixed batch of data, without further interaction with the environment.

Continuous Control, reinforcement-learning, +1
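The setting above is the standard offline one: improve a policy from a fixed dataset with no further rollouts. As a point of reference, here is a minimal sketch of generic batch learning via fitted Q-iteration; the data shapes and network are illustrative assumptions, and this is a baseline for the setting, not the paper's hyperparameter-gradient method.

    import torch
    import torch.nn as nn

    # A fixed batch of transitions (s, a, r, s'); no environment interaction.
    # Shapes are illustrative: 4-dim states, 2 discrete actions, 1000 samples.
    states = torch.randn(1000, 4)
    actions = torch.randint(0, 2, (1000,))
    rewards = torch.randn(1000)
    next_states = torch.randn(1000, 4)

    q = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
    opt = torch.optim.Adam(q.parameters(), lr=1e-3)
    gamma = 0.99

    for _ in range(200):                  # fitted Q-iteration on the batch
        with torch.no_grad():             # bootstrapped targets from current Q
            targets = rewards + gamma * q(next_states).max(dim=1).values
        pred = q(states).gather(1, actions.unsqueeze(1)).squeeze(1)
        loss = ((pred - targets) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()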

Learning to Embed Multi-Modal Contexts for Situated Conversational Agents

no code implementations • Findings (NAACL) 2022 • Haeju Lee, Oh Joon Kwon, Yunseon Choi, Minho Park, Ran Han, Yoonhyung Kim, Jinhyeon Kim, Youngjune Lee, Haebin Shin, Kangwook Lee, Kee-Eung Kim

The Situated Interactive Multi-Modal Conversations (SIMMC) 2.0 aims to create virtual shopping assistants that can accept complex multi-modal inputs, i.e. visual appearances of objects and user utterances.

coreference-resolution, dialog state tracking, +3

Bayesian Multi-Task Transfer Learning for Soft Prompt Tuning

1 code implementation • 13 Feb 2024 • Haeju Lee, Minchan Jeong, Se-Young Yun, Kee-Eung Kim

We argue that when we extract knowledge from source tasks by training source prompts, we need to account for the correlation among source tasks for better transfer to target tasks.

Transfer Learning
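For readers unfamiliar with the underlying mechanism: soft prompt tuning keeps the language model frozen and trains only a small matrix of prompt embeddings prepended to the input. A minimal sketch of that mechanism (class name and sizes are our own illustrative choices, not the paper's Bayesian transfer procedure):

    import torch
    import torch.nn as nn

    class SoftPrompt(nn.Module):
        """Trainable prompt vectors prepended to a frozen LM's token embeddings."""
        def __init__(self, prompt_len=20, embed_dim=768):
            super().__init__()
            self.prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)

        def forward(self, token_embeds):          # (batch, seq, dim)
            p = self.prompt.unsqueeze(0).expand(token_embeds.size(0), -1, -1)
            return torch.cat([p, token_embeds], dim=1)

    # Only self.prompt receives gradients; the LM stays frozen. A Bayesian
    # multi-task variant would treat prompts trained on several source tasks
    # as evidence for a prior over the target-task prompt.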

Stitching Sub-Trajectories with Conditional Diffusion Model for Goal-Conditioned Offline RL

1 code implementation • 11 Feb 2024 • Sungyoon Kim, Yunseon Choi, Daiki E. Matsunaga, Kee-Eung Kim

In this paper, we propose SSD (Sub-trajectory Stitching with Diffusion), a model-based offline GCRL method that leverages the conditional diffusion model to address these limitations.

Offline RL

Adapting Text-based Dialogue State Tracker for Spoken Dialogues

no code implementations • 29 Aug 2023 • Jaeseok Yoon, Seunghyun Hwang, Ran Han, Jeonguk Bang, Kee-Eung Kim

Although there have been remarkable advances in dialogue systems through the Dialog System Technology Challenge (DSTC), building a robust task-oriented dialogue system with a speech interface remains one of the key challenges.

Automatic Speech Recognition, Data Augmentation, +2

Local Metric Learning for Off-Policy Evaluation in Contextual Bandits with Continuous Actions

1 code implementation • 24 Oct 2022 • Haanvid Lee, Jongmin Lee, Yunseon Choi, Wonseok Jeon, Byung-Jun Lee, Yung-Kyun Noh, Kee-Eung Kim

We consider local kernel metric learning for off-policy evaluation (OPE) of deterministic policies in contextual bandits with continuous action spaces.

Metric Learning, Multi-Armed Bandits, +1
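The generic estimator this line of work builds on replaces the impossible exact match a = pi(x) of a deterministic policy with a kernel around pi(x). A one-dimensional sketch under those standard assumptions (function and argument names are ours; the paper's contribution is learning the local metric/bandwidth rather than fixing it):

    import numpy as np

    def kernel_ope(contexts, actions, rewards, behavior_density, policy, h=0.1):
        """Kernelized importance-sampling value estimate of a deterministic policy.

        behavior_density[i] = mu(actions[i] | contexts[i]) under the logging
        policy; a Gaussian kernel with bandwidth h relaxes exact action matching.
        """
        diff = actions - np.array([policy(x) for x in contexts])
        k = np.exp(-0.5 * (diff / h) ** 2) / (h * np.sqrt(2 * np.pi))
        return np.mean(k * rewards / behavior_density)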

PAC-Net: A Model Pruning Approach to Inductive Transfer Learning

no code implementations • 12 Jun 2022 • Sanghoon Myung, In Huh, Wonik Jang, Jae Myung Choe, Jisu Ryu, Dae Sin Kim, Kee-Eung Kim, Changwook Jeong

Inductive transfer learning aims to learn from a small amount of training data for the target task by utilizing a pre-trained model from the source task.

Transfer Learning
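A common concrete recipe in this space is to prune a pretrained network by weight magnitude, freeze the surviving source-critical weights, and fine-tune only the pruned slots on the target task. A minimal sketch of that general recipe (not the exact PAC-Net procedure):

    import torch
    import torch.nn as nn

    def magnitude_mask(layer, keep=0.5):
        """1 for the largest-magnitude (source-critical) weights, 0 elsewhere."""
        w = layer.weight.detach().abs()
        thresh = w.flatten().kthvalue(int(w.numel() * (1 - keep))).values
        return (w > thresh).float()

    layer = nn.Linear(16, 16)              # stand-in for a pretrained layer
    mask = magnitude_mask(layer)

    # Fine-tune on the target task, but zero gradients on the frozen weights
    # so only the pruned (free) slots adapt; source knowledge is preserved.
    layer.weight.register_hook(lambda g: g * (1 - mask))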

COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation

1 code implementation • ICLR 2022 • Jongmin Lee, Cosmin Paduraru, Daniel J. Mankowitz, Nicolas Heess, Doina Precup, Kee-Eung Kim, Arthur Guez

We consider the offline constrained reinforcement learning (RL) problem, in which the agent aims to compute a policy that maximizes expected return while satisfying given cost constraints, learning only from a pre-collected dataset.

Offline RL, Off-policy evaluation, +1
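In the tabular case, the stationary-distribution view reduces to a linear program over the occupancy measure d(s, a): maximize reward subject to Bellman flow constraints and a cost budget. A toy illustration of that LP (random MDP and names are ours; the paper's estimator works from samples rather than a known model):

    import numpy as np
    from scipy.optimize import linprog

    nS, nA, gamma = 4, 2, 0.95
    rng = np.random.default_rng(0)
    P = rng.dirichlet(np.ones(nS), size=(nS, nA))   # P[s, a, s']
    r = rng.random((nS, nA))                        # rewards
    cost = rng.random((nS, nA))                     # constraint costs
    p0 = np.ones(nS) / nS                           # initial state distribution

    # Flow: sum_a d(s',a) = (1-gamma) p0(s') + gamma sum_{s,a} P[s,a,s'] d(s,a)
    A_eq = np.zeros((nS, nS * nA))
    for s2 in range(nS):
        for s in range(nS):
            for a in range(nA):
                A_eq[s2, s * nA + a] += gamma * P[s, a, s2]
        for a in range(nA):
            A_eq[s2, s2 * nA + a] -= 1.0
    res = linprog(-r.flatten(),                     # maximize expected reward
                  A_ub=cost.flatten()[None, :], b_ub=[0.7],  # cost budget
                  A_eq=A_eq, b_eq=-(1 - gamma) * p0)
    d = np.clip(res.x, 0.0, None).reshape(nS, nA)
    policy = d / d.sum(axis=1, keepdims=True)       # pi(a|s) from occupancy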

LobsDICE: Offline Learning from Observation via Stationary Distribution Correction Estimation

2 code implementations • 28 Feb 2022 • Geon-Hyeong Kim, Jongmin Lee, Youngsoo Jang, Hongseok Yang, Kee-Eung Kim

We consider the problem of learning from observation (LfO), in which the agent aims to mimic the expert's behavior from state-only expert demonstrations.

Imitation Learning

Multi-View Representation Learning via Total Correlation Objective

no code implementations • NeurIPS 2021 • HyeongJoo Hwang, Geon-Hyeong Kim, Seunghoon Hong, Kee-Eung Kim

Multi-View Representation Learning (MVRL) aims to discover a shared representation of observations from different views that have complex underlying correlations.

Representation Learning, Translation

Offline Reinforcement Learning for Large Scale Language Action Spaces

no code implementations • ICLR 2022 • Youngsoo Jang, Jongmin Lee, Kee-Eung Kim

GPT-Critic is essentially free from the issue of diverging from human language since it learns from the sentences sampled from the pre-trained language model.

Language Modelling, Offline RL, +2

Structure-Aware Transformer Policy for Inhomogeneous Multi-Task Reinforcement Learning

no code implementations • ICLR 2022 • Sunghoon Hong, Deunsol Yoon, Kee-Eung Kim

We empirically show that the morphological information is crucial for modular reinforcement learning, substantially outperforming prior state-of-the-art methods on multi-task learning as well as transfer learning settings with different state and action space dimensions.

Multi-Task Learning, reinforcement-learning, +1

DemoDICE: Offline Imitation Learning with Supplementary Imperfect Demonstrations

no code implementations • ICLR 2022 • Geon-Hyeong Kim, Seokin Seo, Jongmin Lee, Wonseok Jeon, HyeongJoo Hwang, Hongseok Yang, Kee-Eung Kim

We consider offline imitation learning (IL), which aims to mimic the expert's behavior from its demonstration without further interaction with the environment.

Imitation Learning

Dual Correction Strategy for Ranking Distillation in Top-N Recommender System

1 code implementation • 8 Sep 2021 • Youngjune Lee, Kee-Eung Kim

Knowledge Distillation (KD), which transfers the knowledge of a well-trained large model (teacher) to a small model (student), has become an important area of research for practical deployment of recommender systems.

Knowledge Distillation, Recommendation Systems
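The baseline being corrected here is plain ranking distillation: make the student score the teacher's top-ranked items highly. A minimal sketch of that baseline loss (names ours; the paper's dual correction strategy refines it rather than using it as-is):

    import torch
    import torch.nn.functional as F

    def ranking_distill_loss(student_scores, teacher_scores, k=10):
        """Push the student to rank the teacher's top-k items highly.

        student_scores, teacher_scores: (batch, num_items) relevance scores.
        """
        topk = teacher_scores.topk(k, dim=1).indices      # teacher's top-k items
        logits = student_scores.gather(1, topk)           # student scores on them
        # Treat the teacher's top-k items as soft positive labels.
        return F.binary_cross_entropy_with_logits(logits,
                                                  torch.ones_like(logits))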

OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation

1 code implementation • 21 Jun 2021 • Jongmin Lee, Wonseok Jeon, Byung-Jun Lee, Joelle Pineau, Kee-Eung Kim

We consider the offline reinforcement learning (RL) setting where the agent aims to optimize the policy solely from the data without further environment interactions.

Offline RL, Reinforcement Learning (RL)

Monte-Carlo Planning and Learning with Language Action Value Estimates

no code implementations • ICLR 2021 • Youngsoo Jang, Seokin Seo, Jongmin Lee, Kee-Eung Kim

Interactive Fiction (IF) games provide a useful testbed for language-based reinforcement learning agents, posing significant challenges of natural language understanding, commonsense reasoning, and non-myopic planning in the combinatorial search space.

Natural Language Understanding, reinforcement-learning, +1

Representation Balancing Offline Model-based Reinforcement Learning

no code implementations • ICLR 2021 • Byung-Jun Lee, Jongmin Lee, Kee-Eung Kim

We present a new objective for model learning motivated by recent advances in the estimation of stationary distribution corrections.

Model-based Reinforcement Learning, Offline RL, +2

Variational Interaction Information Maximization for Cross-domain Disentanglement

2 code implementations • NeurIPS 2020 • HyeongJoo Hwang, Geon-Hyeong Kim, Seunghoon Hong, Kee-Eung Kim

Grounded in information theory, we cast the simultaneous learning of domain-invariant and domain-specific representations as a joint objective of multiple information constraints, which does not require adversarial training or gradient reversal layers.

Disentanglement, Image-to-Image Translation, +3

Reinforcement Learning for Control with Multiple Frequencies

no code implementations • NeurIPS 2020 • Jongmin Lee, Byung-Jun Lee, Kee-Eung Kim

Many real-world sequential decision problems involve multiple action variables with different control frequencies, such that actions take effect at different periods.

Continuous Control, reinforcement-learning, +1

End-to-End Neural Pipeline for Goal-Oriented Dialogue Systems using GPT-2

no code implementations • ACL 2020 • Donghoon Ham, Jeong-Gwan Lee, Youngsoo Jang, Kee-Eung Kim

A goal-oriented dialogue system needs to be optimized to track the dialogue flow and carry out an effective conversation in various situations to meet the user's goal.

Goal-Oriented Dialogue Systems

Policy Optimization Through Approximate Importance Sampling

1 code implementation • 9 Oct 2019 • Marcin B. Tomczak, Dongho Kim, Peter Vrancx, Kee-Eung Kim

These proxy objectives allow stable and low variance policy learning, but require small policy updates to ensure that the proxy objective remains an accurate approximation of the target policy value.

Continuous Control
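The best-known proxy objective of this kind is the clipped importance-weighted surrogate used by PPO, which enforces exactly the "small policy updates" caveat above. A sketch of that standard surrogate (not the authors' specific approximation):

    import torch

    def clipped_surrogate(logp_new, logp_old, advantages, eps=0.2):
        """Clipped importance-weighted proxy objective (PPO-style)."""
        ratio = torch.exp(logp_new - logp_old)    # pi_new / pi_old per sample
        unclipped = ratio * advantages
        clipped = torch.clamp(ratio, 1 - eps, 1 + eps) * advantages
        # Clipping keeps the ratio near 1, where the proxy still approximates
        # the true policy value; maximize the returned scalar.
        return torch.min(unclipped, clipped).mean()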

Monte-Carlo Tree Search for Constrained POMDPs

no code implementations • NeurIPS 2018 • Jongmin Lee, Geon-Hyeong Kim, Pascal Poupart, Kee-Eung Kim

In this paper, we present CC-POMCP (Cost-Constrained POMCP), an online MCTS algorithm for large CPOMDPs that leverages the optimization of LP-induced parameters and only requires a black-box simulator of the environment.

Decision Making
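The LP-induced parameter acts like a Lagrange multiplier that folds cost into the tree search's action scores. An illustrative UCB selection rule in that spirit (the node fields and names are our assumptions, not the full CC-POMCP algorithm):

    import math

    def select_action(node, lam, c_explore=1.0):
        """UCB1 selection with cost scalarized by a Lagrange multiplier lam.

        Each child tracks reward value Qr, cost value Qc, and visit count n;
        lam would be adjusted online to keep expected cost within budget.
        """
        total = sum(ch.n for ch in node.children.values())
        def score(ch):
            bonus = c_explore * math.sqrt(math.log(total + 1) / (ch.n + 1e-8))
            return ch.Qr - lam * ch.Qc + bonus
        return max(node.children, key=lambda a: score(node.children[a]))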

A Bayesian Approach to Generative Adversarial Imitation Learning

no code implementations • NeurIPS 2018 • Wonseok Jeon, Seokin Seo, Kee-Eung Kim

Generative adversarial training for imitation learning has shown promising results on high-dimensional and continuous control tasks.

Continuous Control, Imitation Learning

Information-Theoretic Bounded Rationality

no code implementations • 21 Dec 2015 • Pedro A. Ortega, Daniel A. Braun, Justin Dyer, Kee-Eung Kim, Naftali Tishby

Bounded rationality, that is, decision-making and planning under resource limitations, is widely regarded as an important open problem in artificial intelligence, reinforcement learning, computational neuroscience and economics.

Decision Making
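The formal core of this framework is a free-energy trade-off: choose a policy maximizing expected utility minus a KL "deliberation cost" relative to a default policy, which has a closed-form Gibbs solution. A toy numpy illustration (the numbers are ours):

    import numpy as np

    U = np.array([1.0, 0.8, 0.1])       # utilities of three candidate actions
    p0 = np.ones(3) / 3                 # default (prior) policy
    beta = 2.0                          # resource parameter: rationality level

    # argmax_p  E_p[U] - (1/beta) * KL(p || p0)  has the Gibbs closed form:
    p = p0 * np.exp(beta * U)
    p /= p.sum()
    # beta -> infinity recovers the fully rational argmax; beta -> 0 keeps p0.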

Cost-Sensitive Exploration in Bayesian Reinforcement Learning

no code implementations • NeurIPS 2012 • Dongho Kim, Kee-Eung Kim, Pascal Poupart

In this paper, we consider Bayesian reinforcement learning (BRL) where actions incur costs in addition to rewards, and thus exploration has to be constrained in terms of the expected total cost while learning to maximize the expected long-term total reward.

reinforcement-learning, Reinforcement Learning (RL)

MAP Inference for Bayesian Inverse Reinforcement Learning

no code implementations • NeurIPS 2011 • Jaedeug Choi, Kee-Eung Kim

The difficulty in inverse reinforcement learning (IRL) lies in choosing the best reward function, since there are typically infinitely many reward functions that yield the given behaviour data as optimal.

reinforcement-learning, Reinforcement Learning (RL)
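MAP inference resolves that ambiguity by scoring each candidate reward with a posterior: a Boltzmann likelihood of the demonstrations under the optimal Q-values for that reward, times a prior. A toy illustration of that scoring (the chain MDP, Gaussian prior, and one-hot candidate set are our assumptions):

    import numpy as np

    nS, nA, gamma, beta = 5, 2, 0.95, 2.0

    def q_star(reward):
        """Optimal Q for a 5-state chain: action 0 steps left, action 1 right."""
        V = np.zeros(nS)
        for _ in range(500):                      # value iteration
            Q = np.empty((nS, nA))
            for s in range(nS):
                Q[s, 0] = reward[s] + gamma * V[max(s - 1, 0)]
                Q[s, 1] = reward[s] + gamma * V[min(s + 1, nS - 1)]
            V = Q.max(axis=1)
        return Q

    def log_posterior(reward, demos):
        Q = q_star(reward)
        # Boltzmann likelihood of the demonstrated (state, action) pairs ...
        ll = sum(beta * Q[s, a] - np.log(np.exp(beta * Q[s]).sum())
                 for s, a in demos)
        return ll - 0.5 * np.sum(reward ** 2)     # ... times a Gaussian prior

    demos = [(0, 1), (1, 1), (2, 1), (3, 1)]      # expert always moves right
    candidates = [np.eye(nS)[g] for g in range(nS)]   # one-hot reward hypotheses
    r_map = max(candidates, key=lambda r: log_posterior(r, demos))  # MAP reward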
