Multi-Critic Actor Learning: Teaching RL Policies to Act with Style

ICLR 2022 · Siddharth Mysore, George Cheng, Yunqi Zhao, Kate Saenko, Meng Wu ·

Using a single value function (critic) shared over multiple tasks in Actor-Critic multi-task reinforcement learning (MTRL) can result in negative interference between tasks, which can compromise learning performance. Multi-Critic Actor Learning (MultiCriticAL) proposes instead maintaining separate value-function estimators, i.e. critics, for each task being trained. This relaxes an assumption of continuity between task values and avoids interference between task-value estimates. Explicitly distinguishing between tasks also eliminates the need for critics to learn to do so. MultiCriticAL is tested in the context of multi-style learning, a special case of MTRL where agents are trained to behave with different distinct behavior styles, and yields up to 45% performance gains over the single-critic baselines and even successfully learns behavior styles in cases where single-critic approaches may simply fail to learn. As a further test of MultiCriticAL’s utility, it is tested on a simulation of EA’s UFC game, where our method enables a single policy function to learn and smoothly transition between multiple fighting styles.

PDF Abstract

Code

Add Remove Mark official

No code implementations yet. Submit your code now

Tasks

Add Remove

Datasets

Add Datasets introduced or used in this paper

Results from the Paper

Add Remove

Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods

Add Remove

Test

Edit Social Preview

Multi-Critic Actor Learning: Teaching RL Policies to Act with Style

Code Edit Add Remove Mark official

Tasks Edit Add Remove

Datasets Edit

Results from the Paper Edit Add Remove

Methods Edit Add Remove

Code

Add Remove Mark official

Tasks

Add Remove

Datasets

Results from the Paper

Add Remove

Methods

Add Remove