1 code implementation • NeurIPS 2023 • Revan MacQueen, James R. Wright
Self-play is a technique for machine learning in multi-agent systems where a learning algorithm learns by interacting with copies of itself.
no code implementations • 7 Jun 2023 • Greg d'Eon, Sophie Greenwood, Kevin Leyton-Brown, James R. Wright
Researchers building behavioral models, such as behavioral game theorists, use experimental data to evaluate predictive models of human behavior.
1 code implementation • 24 May 2022 • Dustin Morrill, Ryan D'Orazio, Marc Lanctot, James R. Wright, Michael Bowling, Amy R. Greenwald
Hindsight rationality is an approach to playing general-sum games that prescribes no-regret learning dynamics for individual agents with respect to a set of deviations, and further describes jointly rational behavior among multiple agents with mediated equilibria.
no code implementations • 15 Nov 2021 • Vincent Liu, James R. Wright, Martha White
Offline reinforcement learning -- learning a policy from a batch of data -- is known to be hard for general MDPs.
2 code implementations • 1 Jul 2021 • Greg d'Eon, Jason d'Eon, James R. Wright, Kevin Leyton-Brown
Supervised learning models often make systematic errors on rare subsets of the data.
no code implementations • 17 Jun 2021 • Shehroze Khan, James R. Wright
The spread of disinformation on social platforms is harmful to society.
1 code implementation • 13 Feb 2021 • Dustin Morrill, Ryan D'Orazio, Marc Lanctot, James R. Wright, Michael Bowling, Amy Greenwald
Hindsight rationality is an approach to playing general-sum games that prescribes no-regret learning dynamics for individual agents with respect to a set of deviations, and further describes jointly rational behavior among multiple agents with mediated equilibria.
1 code implementation • 10 Dec 2020 • Dustin Morrill, Ryan D'Orazio, Reca Sarfati, Marc Lanctot, James R. Wright, Amy Greenwald, Michael Bowling
This approach also leads to a game-theoretic analysis, but in the correlated play that arises from joint learning dynamics rather than factored agent behavior at equilibrium.
no code implementations • 6 Dec 2019 • Ryan D'Orazio, Dustin Morrill, James R. Wright, Michael Bowling
In contrast, the more conventional softmax parameterization is standard in the field of reinforcement learning and yields a regret bound with a better dependence on the number of actions.
no code implementations • 3 Oct 2019 • Ryan D'Orazio, Dustin Morrill, James R. Wright
A common approach to incorporating function approximation is to learn the inputs needed for a regret-minimizing algorithm, allowing for generalization across many regret minimization problems.
no code implementations • NeurIPS 2016 • Jason S. Hartford, James R. Wright, Kevin Leyton-Brown
Predicting the behavior of human participants in strategic settings is an important problem in many domains.