1 code implementation • 31 May 2024 • Andrew C. Li, Zizhao Chen, Toryn Q. Klassen, Pashootan Vaezipoor, Rodrigo Toro Icarte, Sheila A. McIlraith
While Reward Machines have been employed in both tabular and deep RL settings, they have typically relied on a ground-truth interpretation of the domain-specific vocabulary that forms the building blocks of the reward function.
1 code implementation • 8 Dec 2023 • Parand A. Alamdari, Toryn Q. Klassen, Elliot Creager, Sheila A. McIlraith
Fair decision making has largely been studied with respect to a single decision.
no code implementations • 20 Nov 2022 • Andrew C. Li, Zizhao Chen, Pashootan Vaezipoor, Toryn Q. Klassen, Rodrigo Toro Icarte, Sheila A. McIlraith
Natural and formal languages provide an effective mechanism for humans to specify instructions and reward functions.
1 code implementation • 8 Nov 2022 • Mathieu Tuli, Andrew C. Li, Pashootan Vaezipoor, Toryn Q. Klassen, Scott Sanner, Sheila A. McIlraith
Text-based games present a unique class of sequential decision-making problem in which agents interact with a partially observable, simulated environment via actions and observations conveyed through natural language.
no code implementations • 17 Dec 2021 • Rodrigo Toro Icarte, Ethan Waldie, Toryn Q. Klassen, Richard Valenzano, Margarita P. Castro, Sheila A. McIlraith
Here we show that RMs can be learned from experience, instead of being specified by the user, and that the resulting problem decomposition can be used to effectively solve partially observable RL problems.
no code implementations • 4 Jun 2021 • Parand Alizadeh Alamdari, Toryn Q. Klassen, Rodrigo Toro Icarte, Sheila A. McIlraith
We endow RL agents with the ability to contemplate such impact by augmenting their reward based on expectation of future return by others in the environment, providing different criteria for characterizing impact.
3 code implementations • 6 Oct 2020 • Rodrigo Toro Icarte, Toryn Q. Klassen, Richard Valenzano, Sheila A. McIlraith
First, we propose reward machines, a type of finite state machine that supports the specification of reward functions while exposing reward function structure.
no code implementations • 5 Oct 2020 • Rodrigo Toro Icarte, Richard Valenzano, Toryn Q. Klassen, Phillip Christoffersen, Amir-Massoud Farahmand, Sheila A. McIlraith
Learning memoryless policies is efficient and optimal in fully observable environments.
no code implementations • 6 May 2020 • Maayan Shvo, Toryn Q. Klassen, Sheila A. McIlraith
Theory of Mind is commonly defined as the ability to attribute mental states (e.g., beliefs, goals) to oneself, and to others.