Search Results for author: Youngsoo Jang

Found 7 papers, 1 paper with code

Reinforcement Learning from Reflective Feedback (RLRF): Aligning and Improving LLMs via Fine-Grained Self-Reflection

no code implementations 21 Mar 2024 Kyungjae Lee, Dasol Hwang, Sunghyun Park, Youngsoo Jang, Moontae Lee

Despite the promise of RLHF in aligning LLMs with human preferences, it often leads to superficial alignment, prioritizing stylistic changes over improving downstream performance of LLMs.

Mathematical Reasoning

LobsDICE: Offline Learning from Observation via Stationary Distribution Correction Estimation

2 code implementations 28 Feb 2022 Geon-Hyeong Kim, Jongmin Lee, Youngsoo Jang, Hongseok Yang, Kee-Eung Kim

We consider the problem of learning from observation (LfO), in which the agent aims to mimic the expert's behavior from the state-only demonstrations by experts.

Imitation Learning

Offline Reinforcement Learning for Large Scale Language Action Spaces

no code implementations ICLR 2022 Youngsoo Jang, Jongmin Lee, Kee-Eung Kim

GPT-Critic is essentially free from the issue of diverging from human language since it learns from the sentences sampled from the pre-trained language model.

Language Modelling, Offline RL, +2

Monte-Carlo Planning and Learning with Language Action Value Estimates

no code implementations ICLR 2021 Youngsoo Jang, Seokin Seo, Jongmin Lee, Kee-Eung Kim

Interactive Fiction (IF) games provide a useful testbed for language-based reinforcement learning agents, posing significant challenges of natural language understanding, commonsense reasoning, and non-myopic planning in the combinatorial search space.

Natural Language Understanding, reinforcement-learning, +1

End-to-End Neural Pipeline for Goal-Oriented Dialogue Systems using GPT-2

no code implementations ACL 2020 Donghoon Ham, Jeong-Gwan Lee, Youngsoo Jang, Kee-Eung Kim

The goal-oriented dialogue system needs to be optimized for tracking the dialogue flow and carrying out an effective conversation under various situations to meet the user goal.

Goal-Oriented Dialogue Systems
