no code implementations • 3 Feb 2024 • Jeonghye Kim, Suyoung Lee, Woojun Kim, Youngchul Sung
Offline reinforcement learning (RL) has seen notable advancements through return-conditioned supervised learning (RCSL) and value-based methods, yet each approach comes with its own set of practical challenges.
no code implementations • NeurIPS 2023 • Sungho Choi, Seungyul Han, Woojun Kim, Jongseong Chae, Whiyoung Jung, Youngchul Sung
In this paper, we consider domain-adaptive imitation learning with visual observation, where an agent in a target domain learns to perform a task by observing expert demonstrations in a source domain.
1 code implementation • 5 Oct 2023 • Woojun Kim, Jeonghye Kim, Youngchul Sung
In this paper, a unified framework for exploration in reinforcement learning (RL) is proposed based on an option-critic model.
no code implementations • 4 Oct 2023 • Jeonghye Kim, Suyoung Lee, Woojun Kim, Youngchul Sung
However, we discovered that the attention module of the Decision Transformer (DT) is not well suited to capturing the inherent local dependence patterns in RL trajectories modeled as a Markov decision process.
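As a minimal illustration of the Markovian locality the sentence refers to (not the paper's actual architectural change, which is not described here), a causal attention mask can be restricted so each token attends only to a short window of recent tokens:

```python
import numpy as np

def local_causal_mask(seq_len, window):
    """Boolean mask where position i may attend only to the `window`
    most recent positions j with j <= i (a locality-restricted causal
    mask). `window` is an illustrative parameter, not from the paper."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j > i - window)

mask = local_causal_mask(4, 2)
```

With `window=2`, position 3 can attend only to positions 2 and 3, while a standard causal mask would also allow positions 0 and 1.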
no code implementations • 2 Mar 2023 • Woojun Kim, Youngchul Sung
Scalability is one of the essential issues in applying multi-agent reinforcement learning (MARL) algorithms to real-world problems, which typically involve a massive number of agents.
no code implementations • 1 Mar 2023 • Woojun Kim, Whiyoung Jung, Myungsik Cho, Youngchul Sung
In this paper, we propose a new mutual information framework for multi-agent reinforcement learning to enable multiple agents to learn coordinated behaviors by regularizing the accumulated return with the simultaneous mutual information between multi-agent actions.
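To make the regularizer concrete: the framework adds a mutual-information (MI) term between agents' actions to the return being maximized. The plug-in estimator below is only an illustration of the MI quantity itself, assuming discrete actions; the paper regularizes the return with an MI term rather than computing MI from raw counts like this.

```python
import math
from collections import Counter

def empirical_mi(actions_a, actions_b):
    """Plug-in estimate (in nats) of the mutual information between two
    agents' discrete action sequences. Illustrative only."""
    n = len(actions_a)
    pa, pb = Counter(actions_a), Counter(actions_b)
    pab = Counter(zip(actions_a, actions_b))
    mi = 0.0
    for (a, b), c in pab.items():
        # p(a,b) / (p(a) p(b)) simplifies to c*n / (count_a * count_b).
        mi += (c / n) * math.log(c * n / (pa[a] * pb[b]))
    return mi

# Perfectly coordinated binary actions carry log 2 nats of MI.
mi = empirical_mi([0, 1, 0, 1], [0, 1, 0, 1])
```

Maximizing return plus a weighted MI term of this kind pushes agents toward statistically coordinated action choices.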
1 code implementation • 28 Nov 2022 • Whiyoung Jung, Myungsik Cho, Jongeui Park, Youngchul Sung
This paper proposes a framework, named Quantile Constrained RL (QCRL), that constrains the quantile of the distribution of the cumulative cost, which is a necessary and sufficient condition for satisfying the outage constraint.
1 code implementation • 20 Jun 2022 • Jeewon Jeon, Woojun Kim, Whiyoung Jung, Youngchul Sung
In this paper, we consider cooperative multi-agent reinforcement learning (MARL) with sparse reward.
1 code implementation • 19 Jun 2022 • Jongseong Chae, Seungyul Han, Whiyoung Jung, Myungsik Cho, Sungho Choi, Youngchul Sung
In this paper, we propose a robust imitation learning (IL) framework that improves the robustness of IL when environment dynamics are perturbed.
1 code implementation • 10 Dec 2021 • Giseung Park, Sungho Choi, Youngchul Sung
This paper proposes a new sequential model learning architecture to solve partially observable Markov decision problems.
1 code implementation • NeurIPS 2021 • Seungyul Han, Youngchul Sung
In this paper, we propose a max-min entropy framework for reinforcement learning (RL) to overcome a limitation of the soft actor-critic (SAC) algorithm, which implements maximum entropy RL in model-free, sample-based learning.
no code implementations • 1 Jan 2021 • Giseung Park, Whiyoung Jung, Sungho Choi, Youngchul Sung
In this paper, we consider intrinsic reward generation for sparse-reward reinforcement learning based on model prediction errors.
no code implementations • ICLR 2021 • Woojun Kim, Jongeui Park, Youngchul Sung
Communication is one of the core components for learning coordinated behavior in multi-agent systems.
no code implementations • 14 Dec 2020 • Sohee Bae, Seungyul Han, Youngchul Sung
A condition on the reward function of reinforcement learning (RL) that ensures queue stability is derived.
no code implementations • 4 Jun 2020 • Woojun Kim, Whiyoung Jung, Myungsik Cho, Youngchul Sung
In this paper, we propose a maximum mutual information (MMI) framework for multi-agent reinforcement learning (MARL) to enable multiple agents to learn coordinated behaviors by regularizing the accumulated return with the mutual information between actions.
1 code implementation • 2 Jun 2020 • Seungyul Han, Youngchul Sung
In this paper, sample-aware policy entropy regularization is proposed to enhance the conventional policy entropy regularization for better exploration.
no code implementations • 2 Jun 2020 • Sungho Choi, Seungyul Han, Woojun Kim, Youngchul Sung
In this paper, we consider cross-domain imitation learning (CDIL) in which an agent in a target domain learns a policy to perform well in the target domain by observing expert demonstrations in a source domain without accessing any reward function.
1 code implementation • ICLR 2020 • Whiyoung Jung, Giseung Park, Youngchul Sung
In the proposed scheme, multiple identical learners with their own value functions and policies share a common experience replay buffer and collaboratively search for a good policy, guided by the information of the best policy among them.
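A minimal sketch of the shared-buffer part of such a scheme (the best-policy guidance is omitted, and all names are hypothetical):

```python
import random
from collections import deque

class SharedReplayBuffer:
    """One replay buffer shared by several identical learners: every
    learner pushes its transitions into the common pool and samples
    mini-batches that mix all learners' experience."""
    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)

    def push(self, transition):
        self.storage.append(transition)

    def sample(self, batch_size, rng):
        return rng.sample(list(self.storage), batch_size)

buf = SharedReplayBuffer(capacity=100)
for learner_id in range(3):          # three learners...
    for step in range(5):            # ...each contributing five transitions
        buf.push((learner_id, step))
batch = buf.sample(4, random.Random(0))
```

Each sampled mini-batch can then contain transitions generated by any of the learners, which is what lets the learners profit from each other's exploration.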
no code implementations • 25 Sep 2019 • Giseung Park, Whiyoung Jung, Sungho Choi, Youngchul Sung
In this paper, a new intrinsic reward generation method for sparse-reward reinforcement learning is proposed based on an ensemble of dynamics models.
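A common way to turn an ensemble of dynamics models into an intrinsic reward is to reward disagreement among the members' next-state predictions; the sketch below is a generic version of that idea, not necessarily the paper's exact formulation.

```python
import numpy as np

def disagreement_reward(models, state, action):
    """Intrinsic reward = variance across the ensemble's next-state
    predictions, averaged over state dimensions. High disagreement
    marks poorly modeled (hence interesting) regions of the state space."""
    preds = np.stack([m(state, action) for m in models])
    return preds.var(axis=0).mean()

# Toy two-member ensemble whose predictions differ by +/- the action.
models = [lambda s, a: s + a, lambda s, a: s - a]
r = disagreement_reward(models, np.array([1.0, 1.0]), np.array([0.5, 0.5]))
```

Adding such a bonus to the sparse extrinsic reward drives the agent toward transitions the ensemble has not yet learned to predict consistently.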
1 code implementation • 7 May 2019 • Seungyul Han, Youngchul Sung
In importance sampling (IS)-based reinforcement learning algorithms such as Proximal Policy Optimization (PPO), IS weights are typically clipped to avoid large variance in learning.
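For reference, this is the standard PPO clipped surrogate that the sentence describes; the paper studies a refinement of this clipping, which is not reproduced here.

```python
import numpy as np

def ppo_clipped_objective(ratio, advantage, eps=0.2):
    """Standard PPO surrogate: the IS ratio pi_new/pi_old is clipped to
    [1 - eps, 1 + eps], and the element-wise minimum with the unclipped
    term gives a pessimistic bound that limits update variance."""
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return np.minimum(unclipped, clipped).mean()

# A ratio far above 1 + eps contributes only its clipped value.
obj = ppo_clipped_objective(np.array([3.0]), np.array([1.0]))
```

With `eps=0.2`, a ratio of 3.0 and a positive advantage of 1.0 contribute 1.2 rather than 3.0, which is precisely the variance-limiting effect of clipping.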
no code implementations • 18 Feb 2019 • Woojun Kim, Myungsik Cho, Youngchul Sung
In this paper, we propose a new learning technique named message-dropout to improve the performance of multi-agent deep reinforcement learning under two application scenarios: 1) classical multi-agent reinforcement learning with direct message communication among agents, and 2) centralized training with decentralized execution.
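A hedged sketch of the idea, assuming each received message is dropped as a whole block with inverted-dropout rescaling (parameter names are hypothetical, not the paper's code):

```python
import numpy as np

def message_dropout(messages, drop_prob, rng):
    """During training, drop each received message in its entirety with
    probability drop_prob and rescale survivors by 1/(1 - drop_prob),
    as in inverted dropout, so expected message magnitude is preserved."""
    keep = rng.random(len(messages)) >= drop_prob
    scale = 1.0 / (1.0 - drop_prob)
    return [m * scale if k else np.zeros_like(m)
            for m, k in zip(messages, keep)]

rng = np.random.default_rng(0)
out = message_dropout([np.ones(3), np.ones(3)], drop_prob=0.5, rng=rng)
```

Randomly blanking whole messages during training makes the policy robust to missing or unreliable communication at execution time.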
no code implementations • 27 Sep 2018 • Whiyoung Jung, Giseung Park, Youngchul Sung
In this paper, a new interactive parallel learning scheme is proposed to enhance the performance of off-policy continuous-action reinforcement learning.
no code implementations • 12 Oct 2017 • Seungyul Han, Youngchul Sung
In this paper, a new adaptive multi-batch experience replay scheme is proposed for proximal policy optimization (PPO) for continuous action control.