Search Results for author: Caroline Wang

Found 5 papers, 2 papers with code

N-Agent Ad Hoc Teamwork

no code implementations · 16 Apr 2024 · Caroline Wang, Arrasy Rahman, Ishan Durugkar, Elad Liebman, Peter Stone

POAM is a policy-gradient, multi-agent reinforcement learning approach to the N-agent ad hoc teamwork (NAHT) problem that enables adaptation to diverse teammate behaviors by learning representations of those behaviors.

Autonomous Driving · Multi-agent Reinforcement Learning · +4

Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning

no code implementations · 23 Jan 2024 · Zizhao Wang, Caroline Wang, Xuesu Xiao, Yuke Zhu, Peter Stone

Two desiderata of reinforcement learning (RL) algorithms are the ability to learn from relatively little experience and the ability to learn policies that generalize to a range of problem specifications.

reinforcement-learning · Reinforcement Learning (RL)

D-Shape: Demonstration-Shaped Reinforcement Learning via Goal Conditioning

no code implementations · 26 Oct 2022 · Caroline Wang, Garrett Warnell, Peter Stone

While combining imitation learning (IL) and reinforcement learning (RL) is a promising way to address poor sample efficiency in autonomous behavior acquisition, methods that do so typically assume that the requisite behavior demonstrations are provided by an expert that behaves optimally with respect to a task reward.

Imitation Learning · reinforcement-learning · +1

DM$^2$: Decentralized Multi-Agent Reinforcement Learning for Distribution Matching

1 code implementation · 1 Jun 2022 · Caroline Wang, Ishan Durugkar, Elad Liebman, Peter Stone

The theoretical analysis shows that, under certain conditions, each agent minimizing its individual distribution mismatch allows convergence to the joint policy that generated the target distribution.

Multi-agent Reinforcement Learning · reinforcement-learning · +2

In Pursuit of Interpretable, Fair and Accurate Machine Learning for Criminal Recidivism Prediction

1 code implementation · 8 May 2020 · Caroline Wang, Bin Han, Bhrij Patel, Cynthia Rudin

We compared predictive performance and fairness of these models against two methods that are currently used in the justice system to predict pretrial recidivism: the Arnold PSA and COMPAS.

BIG-bench Machine Learning · Fairness · +1
