no code implementations • ICML 2020 • Geon-Hyeong Kim, Youngsoo Jang, Hongseok Yang, Kee-Eung Kim
The estimated future likelihoods form the core of our new low-variance gradient estimator.
no code implementations • Findings (NAACL) 2022 • Haeju Lee, Oh Joon Kwon, Yunseon Choi, Minho Park, Ran Han, Yoonhyung Kim, Jinhyeon Kim, Youngjune Lee, Haebin Shin, Kangwook Lee, Kee-Eung Kim
The Situated Interactive Multi-Modal Conversations (SIMMC) 2.0 challenge aims to create virtual shopping assistants that can accept complex multi-modal inputs, i.e., visual appearances of objects and user utterances.
Ranked #2 on Response Generation on SIMMC2.0
no code implementations • ICML 2020 • Byung-Jun Lee, Jongmin Lee, Peter Vrancx, Dongho Kim, Kee-Eung Kim
We consider the batch reinforcement learning problem where the agent needs to learn only from a fixed batch of data, without further interaction with the environment.
1 code implementation • 5 Dec 2024 • Jungwoo Park, Young Jin Ahn, Kee-Eung Kim, Jaewoo Kang
Understanding the internal computations of large language models (LLMs) is crucial for aligning them with human values and preventing undesirable behaviors like toxic content generation.
no code implementations • 19 Oct 2024 • Oh Joon Kwon, Daiki E. Matsunaga, Kee-Eung Kim
A critical component of the current generation of language models is preference alignment, which aims to precisely control the model's behavior to meet human needs and values.
no code implementations • 28 Sep 2024 • Seongmin Lee, Jaewook Shin, Youngjin Ahn, Seokin Seo, Ohjoon Kwon, Kee-Eung Kim
Recent advances in large language models (LLMs) have significantly impacted the domain of multi-hop question answering (MHQA), where systems are required to aggregate information and infer answers from disparate pieces of text.
1 code implementation • 20 Jul 2024 • Yunseon Choi, Sangmin Bae, Seonghyun Ban, Minchan Jeong, Chuheng Zhang, Lei Song, Li Zhao, Jiang Bian, Kee-Eung Kim
With the advent of foundation models, prompt tuning has positioned itself as an important technique for directing model behaviors and eliciting desired responses.
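As background on the mechanism, prompt tuning prepends a small set of trainable embedding vectors to the frozen model's input embeddings, and only those vectors are updated. The sketch below is a minimal illustration with made-up dimensions and a stand-in for the frozen backbone, not any specific paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_prompt = 16, 4      # made-up sizes; real models use d_model >= 768

# The soft prompt is the ONLY trainable tensor; the backbone stays frozen.
prompt = rng.normal(scale=0.02, size=(n_prompt, d_model))

def with_soft_prompt(token_embeddings, prompt):
    """Prepend learned prompt vectors to a sequence of token embeddings
    before it enters the frozen model."""
    return np.concatenate([prompt, token_embeddings], axis=0)

tokens = rng.normal(size=(7, d_model))   # embeddings of a 7-token input
seq = with_soft_prompt(tokens, prompt)   # shape (n_prompt + 7, d_model)
```

During tuning, gradients flow only into `prompt`, which is why the technique scales to very large frozen models.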
1 code implementation • 18 Jun 2024 • Young Jin Ahn, Jungwoo Park, Sangha Park, Jonghyun Choi, Kee-Eung Kim
Visual Speech Recognition (VSR) stands at the intersection of computer vision and speech recognition, aiming to interpret spoken content from visual cues.
Ranked #1 on Landmark-based Lipreading on LRS2
1 code implementation • 29 May 2024 • Haanvid Lee, Tri Wahyu Guntara, Jongmin Lee, Yung-Kyun Noh, Kee-Eung Kim
To address this limitation, we propose to relax the deterministic target policy using a kernel, and to learn kernel metrics that minimize the overall mean squared error of the estimated temporal-difference update vector of the action value function used for policy evaluation.
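The kernel relaxation described above can be sketched as follows: a deterministic target action μ(s) is smoothed into a Gaussian density whose precision matrix plays the role of the learned metric, which makes importance weights against a stochastic behavior policy well defined. This is an illustrative sketch only, not the authors' implementation; the function names, the uniform behavior density, and the identity metric are all assumptions.

```python
import numpy as np

def kernel_relaxed_density(a, mu_s, metric):
    """Gaussian relaxation of the deterministic target action mu(s).
    `metric` is a positive-definite precision matrix standing in for
    the learned kernel metric."""
    d = a - mu_s
    k = metric.shape[0]
    norm = np.sqrt(np.linalg.det(metric) / (2.0 * np.pi) ** k)
    return norm * np.exp(-0.5 * d @ metric @ d)

# Importance weight of a logged action under the relaxed target policy,
# assuming (hypothetically) a uniform behavior policy on [-1, 1].
a_logged, mu = np.array([0.1]), np.array([0.0])
behavior_density = 0.5
w = kernel_relaxed_density(a_logged, mu, np.eye(1)) / behavior_density
```

Without the relaxation, the deterministic target policy assigns probability zero to almost every logged action, so no finite importance weight exists.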
1 code implementation • 13 Feb 2024 • Haeju Lee, Minchan Jeong, Se-Young Yun, Kee-Eung Kim
We argue that when we extract knowledge from source tasks via training source prompts, we need to consider this correlation among source tasks for better transfer to target tasks.
1 code implementation • 11 Feb 2024 • Sungyoon Kim, Yunseon Choi, Daiki E. Matsunaga, Kee-Eung Kim
In this paper, we propose SSD (Sub-trajectory Stitching with Diffusion), a model-based offline GCRL method that leverages the conditional diffusion model to address these limitations.
1 code implementation • NeurIPS 2023 • Daiki E. Matsunaga, Jongmin Lee, Jaeseok Yoon, Stefanos Leonardos, Pieter Abbeel, Kee-Eung Kim
To this end, we introduce AlberDICE, an offline MARL algorithm that alternately performs centralized training of individual agents based on stationary distribution optimization.
no code implementations • 29 Aug 2023 • Jaeseok Yoon, Seunghyun Hwang, Ran Han, Jeonguk Bang, Kee-Eung Kim
Although there have been remarkable advances in dialogue systems through the Dialog System Technology Challenge (DSTC), building a robust task-oriented dialogue system with a speech interface remains one of the key challenges.
1 code implementation • 24 Oct 2022 • Haanvid Lee, Jongmin Lee, Yunseon Choi, Wonseok Jeon, Byung-Jun Lee, Yung-Kyun Noh, Kee-Eung Kim
We consider local kernel metric learning for off-policy evaluation (OPE) of deterministic policies in contextual bandits with continuous action spaces.
no code implementations • 12 Jun 2022 • Sanghoon Myung, In Huh, Wonik Jang, Jae Myung Choe, Jisu Ryu, Dae Sin Kim, Kee-Eung Kim, Changwook Jeong
Inductive transfer learning aims to learn from a small amount of training data for the target task by utilizing a pre-trained model from the source task.
1 code implementation • ICLR 2022 • Jongmin Lee, Cosmin Paduraru, Daniel J. Mankowitz, Nicolas Heess, Doina Precup, Kee-Eung Kim, Arthur Guez
We consider the offline constrained reinforcement learning (RL) problem, in which the agent aims to compute a policy that maximizes expected return while satisfying given cost constraints, learning only from a pre-collected dataset.
2 code implementations • 28 Feb 2022 • Geon-Hyeong Kim, Jongmin Lee, Youngsoo Jang, Hongseok Yang, Kee-Eung Kim
We consider the problem of learning from observation (LfO), in which the agent aims to mimic the expert's behavior from state-only expert demonstrations.
no code implementations • 7 Dec 2021 • Youngjune Lee, Oh Joon Kwon, Haeju Lee, Joonyoung Kim, Kangwook Lee, Kee-Eung Kim
For this reason, data-centric approaches are crucial for automating the machine learning operations pipeline.
no code implementations • NeurIPS 2021 • HyeongJoo Hwang, Geon-Hyeong Kim, Seunghoon Hong, Kee-Eung Kim
Multi-View Representation Learning (MVRL) aims to discover a shared representation of observations from different views with complex underlying correlations.
no code implementations • ICLR 2022 • Geon-Hyeong Kim, Seokin Seo, Jongmin Lee, Wonseok Jeon, HyeongJoo Hwang, Hongseok Yang, Kee-Eung Kim
We consider offline imitation learning (IL), which aims to mimic the expert's behavior from its demonstration without further interaction with the environment.
no code implementations • ICLR 2022 • Sunghoon Hong, Deunsol Yoon, Kee-Eung Kim
We empirically show that the morphological information is crucial for modular reinforcement learning, substantially outperforming prior state-of-the-art methods on multi-task learning as well as transfer learning settings with different state and action space dimensions.
no code implementations • ICLR 2022 • Youngsoo Jang, Jongmin Lee, Kee-Eung Kim
GPT-Critic is essentially free from the issue of diverging from human language, since it learns from sentences sampled from the pre-trained language model.
1 code implementation • 8 Sep 2021 • Youngjune Lee, Kee-Eung Kim
Knowledge Distillation (KD), which transfers the knowledge of a well-trained large model (teacher) to a small model (student), has become an important area of research for practical deployment of recommender systems.
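As background, the classic soft-label distillation loss of Hinton et al. (2015), on which KD methods for recommender systems typically build, fits in a few lines. This is a generic illustration rather than the paper's method, and the temperature value is an arbitrary assumption:

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable tempered softmax."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, T=4.0):
    """Soft-label distillation: cross-entropy between the teacher's tempered
    distribution and the student's, scaled by T^2 as in Hinton et al."""
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T))
    return -(T ** 2) * (p_teacher * log_p_student).sum()

# A student matching the teacher exactly still pays the teacher's entropy;
# training minimizes this (usually plus a hard-label term, omitted here).
loss = kd_loss([2.0, 0.5, -1.0], [1.5, 0.7, -0.8])
```

The temperature T softens the teacher's distribution so the student also learns from the relative probabilities of non-top items, which is what carries the "dark knowledge" in ranking tasks.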
1 code implementation • 21 Jun 2021 • Jongmin Lee, Wonseok Jeon, Byung-Jun Lee, Joelle Pineau, Kee-Eung Kim
We consider the offline reinforcement learning (RL) setting where the agent aims to optimize the policy solely from the data without further environment interactions.
no code implementations • ICLR 2021 • Byung-Jun Lee, Jongmin Lee, Kee-Eung Kim
We present a new objective for model learning motivated by recent advances in the estimation of stationary distribution corrections.
no code implementations • ICLR 2021 • Deunsol Yoon, Sunghoon Hong, Byung-Jun Lee, Kee-Eung Kim
Safe and reliable electricity transmission in power grids is crucial for modern society.
no code implementations • ICLR 2021 • Youngsoo Jang, Seokin Seo, Jongmin Lee, Kee-Eung Kim
Interactive Fiction (IF) games provide a useful testbed for language-based reinforcement learning agents, posing significant challenges of natural language understanding, commonsense reasoning, and non-myopic planning in the combinatorial search space.
2 code implementations • NeurIPS 2020 • HyeongJoo Hwang, Geon-Hyeong Kim, Seunghoon Hong, Kee-Eung Kim
Grounded in information theory, we cast the simultaneous learning of domain-invariant and domain-specific representations as a joint objective of multiple information constraints, which does not require adversarial training or gradient reversal layers.
no code implementations • NeurIPS 2020 • Jongmin Lee, Byung-Jun Lee, Kee-Eung Kim
Many real-world sequential decision problems involve multiple action variables with different control frequencies, so that actions take effect over different time periods.
no code implementations • ACL 2020 • Donghoon Ham, Jeong-Gwan Lee, Youngsoo Jang, Kee-Eung Kim
A goal-oriented dialogue system needs to be optimized for tracking the dialogue flow and carrying out an effective conversation under various situations to meet the user's goal.
no code implementations • IJCNLP 2019 • Youngsoo Jang, Jongmin Lee, Jaeyoung Park, Kyeng-Hun Lee, Pierre Lison, Kee-Eung Kim
We present PyOpenDial, a Python-based domain-independent, open-source toolkit for spoken dialogue systems.
1 code implementation • 9 Oct 2019 • Marcin B. Tomczak, Dongho Kim, Peter Vrancx, Kee-Eung Kim
These proxy objectives allow stable and low variance policy learning, but require small policy updates to ensure that the proxy objective remains an accurate approximation of the target policy value.
no code implementations • NeurIPS 2018 • Wonseok Jeon, Seokin Seo, Kee-Eung Kim
Generative adversarial training for imitation learning has shown promising results on high-dimensional and continuous control tasks.
no code implementations • NeurIPS 2018 • Jongmin Lee, Geon-Hyeong Kim, Pascal Poupart, Kee-Eung Kim
In this paper, we present CC-POMCP (Cost-Constrained POMCP), an online MCTS algorithm for large CPOMDPs that leverages the optimization of LP-induced parameters and only requires a black-box simulator of the environment.
no code implementations • NeurIPS 2017 • Yung-Kyun Noh, Masashi Sugiyama, Kee-Eung Kim, Frank Park, Daniel D. Lee
This paper shows how metric learning can be used with Nadaraya-Watson (NW) kernel regression.
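The combination described above can be sketched directly: Nadaraya-Watson regression predicts a kernel-weighted average of training targets, and metric learning replaces the Euclidean distance in the kernel with a Mahalanobis distance M = LᵀL. A minimal sketch with a hand-picked (hypothetical) map L, not the paper's learned metric:

```python
import numpy as np

def nw_predict(x, X, y, L, h=1.0):
    """Nadaraya-Watson regression at query x: a Gaussian-kernel-weighted
    average of targets y, with distances measured under the linear map L
    (i.e. Mahalanobis metric M = L^T L)."""
    d = (X - x) @ L.T                            # metric-transformed differences
    w = np.exp(-0.5 * (d ** 2).sum(axis=1) / h ** 2)
    return (w @ y) / w.sum()

# Toy data: targets depend only on the first coordinate, so a good metric
# stretches axis 0 and shrinks the irrelevant axis 1.
X = np.array([[0.0, 0.0], [1.0, 3.0], [2.0, -1.0]])
y = X[:, 0]
L = np.diag([2.0, 0.1])                          # hypothetical learned map
pred = nw_predict(np.array([1.0, 0.0]), X, y, L)
```

With the isotropic metric L = I, the irrelevant second coordinate would dominate the kernel weights; the anisotropic map suppresses it, which is the effect metric learning aims to produce automatically.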
no code implementations • 21 Dec 2015 • Pedro A. Ortega, Daniel A. Braun, Justin Dyer, Kee-Eung Kim, Naftali Tishby
Bounded rationality, that is, decision-making and planning under resource limitations, is widely regarded as an important open problem in artificial intelligence, reinforcement learning, computational neuroscience and economics.
no code implementations • 7 Aug 2014 • Leonid Peshkin, Kee-Eung Kim, Nicolas Meuleau, Leslie Pack Kaelbling
Cooperative games are those in which both agents share the same payoff structure.
no code implementations • NeurIPS 2012 • Jaedeug Choi, Kee-Eung Kim
We present a nonparametric Bayesian approach to inverse reinforcement learning (IRL) for multiple reward functions.
no code implementations • NeurIPS 2012 • Dongho Kim, Kee-Eung Kim, Pascal Poupart
In this paper, we consider Bayesian reinforcement learning (BRL) where actions incur costs in addition to rewards, and thus exploration has to be constrained in terms of the expected total cost while learning to maximize the expected long-term total reward.
no code implementations • NeurIPS 2011 • Jaedeug Choi, Kee-Eung Kim
The difficulty in inverse reinforcement learning (IRL) arises in choosing the best reward function since there are typically an infinite number of reward functions that yield the given behaviour data as optimal.