no code implementations • 16 Apr 2024 • Caroline Wang, Arrasy Rahman, Ishan Durugkar, Elad Liebman, Peter Stone
POAM is a policy-gradient, multi-agent reinforcement learning approach to the NAHT problem that enables adaptation to diverse teammate behaviors by learning representations of those behaviors.
no code implementations • 23 Jan 2024 • Zizhao Wang, Caroline Wang, Xuesu Xiao, Yuke Zhu, Peter Stone
Two desiderata of reinforcement learning (RL) algorithms are the ability to learn from relatively little experience and the ability to learn policies that generalize to a range of problem specifications.
no code implementations • 26 Oct 2022 • Caroline Wang, Garrett Warnell, Peter Stone
While combining imitation learning (IL) and reinforcement learning (RL) is a promising way to address poor sample efficiency in autonomous behavior acquisition, methods that do so typically assume that the requisite behavior demonstrations are provided by an expert that behaves optimally with respect to a task reward.
1 code implementation • 1 Jun 2022 • Caroline Wang, Ishan Durugkar, Elad Liebman, Peter Stone
The theoretical analysis shows that, under certain conditions, having each agent minimize its individual distribution mismatch leads to convergence to the joint policy that generated the target distribution.
1 code implementation • 8 May 2020 • Caroline Wang, Bin Han, Bhrij Patel, Cynthia Rudin
We compared predictive performance and fairness of these models against two methods that are currently used in the justice system to predict pretrial recidivism: the Arnold PSA and COMPAS.