no code implementations • ICML 2020 • Geon-Hyeong Kim, Youngsoo Jang, Hongseok Yang, Kee-Eung Kim
The estimated future likelihoods form the core of our new low-variance gradient estimator.
no code implementations • 21 Mar 2024 • Kyungjae Lee, Dasol Hwang, Sunghyun Park, Youngsoo Jang, Moontae Lee
Despite the promise of RLHF in aligning LLMs with human preferences, it often leads to superficial alignment, prioritizing stylistic changes over improving downstream performance of LLMs.
2 code implementations • 28 Feb 2022 • Geon-Hyeong Kim, Jongmin Lee, Youngsoo Jang, Hongseok Yang, Kee-Eung Kim
We consider the problem of learning from observation (LfO), in which the agent aims to mimic the expert's behavior from state-only expert demonstrations.
no code implementations • ICLR 2022 • Youngsoo Jang, Jongmin Lee, Kee-Eung Kim
GPT-Critic is essentially free from the issue of diverging from human language since it learns from sentences sampled from the pre-trained language model.
no code implementations • ICLR 2021 • Youngsoo Jang, Seokin Seo, Jongmin Lee, Kee-Eung Kim
Interactive Fiction (IF) games provide a useful testbed for language-based reinforcement learning agents, posing significant challenges of natural language understanding, commonsense reasoning, and non-myopic planning in the combinatorial search space.
no code implementations • ACL 2020 • Donghoon Ham, Jeong-Gwan Lee, Youngsoo Jang, Kee-Eung Kim
The goal-oriented dialogue system needs to be optimized for tracking the dialogue flow and carrying out an effective conversation under various situations to meet the user goal.
no code implementations • IJCNLP 2019 • Youngsoo Jang, Jongmin Lee, Jaeyoung Park, Kyeng-Hun Lee, Pierre Lison, Kee-Eung Kim
We present PyOpenDial, a Python-based, domain-independent, open-source toolkit for spoken dialogue systems.