no code implementations • 26 Oct 2023 • Yoshua Bengio, Geoffrey Hinton, Andrew Yao, Dawn Song, Pieter Abbeel, Trevor Darrell, Yuval Noah Harari, Ya-Qin Zhang, Lan Xue, Shai Shalev-Shwartz, Gillian Hadfield, Jeff Clune, Tegan Maharaj, Frank Hutter, Atılım Güneş Baydin, Sheila McIlraith, Qiqi Gao, Ashwin Acharya, David Krueger, Anca Dragan, Philip Torr, Stuart Russell, Daniel Kahneman, Jan Brauner, Sören Mindermann
Artificial Intelligence (AI) is progressing rapidly, and companies are shifting their focus to developing generalist AI systems that can autonomously act and pursue goals.
2 code implementations • NeurIPS 2023 • Shalev Lifshitz, Keiran Paster, Harris Chan, Jimmy Ba, Sheila McIlraith
Constructing AI models that respond to text instructions is challenging, especially for sequential decision-making tasks.
no code implementations • 30 Jan 2023 • Pouya Shati, Eldan Cohen, Sheila McIlraith
In this work, we present a novel SAT-based framework for interpretable clustering that supports clustering constraints and provides strong theoretical guarantees on solution quality.
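The paper's exact encoding and optimization are not reproduced here, but the core idea of casting constrained clustering as satisfiability can be sketched with the z3 solver: a Boolean variable x[i][k] asserts that point i belongs to cluster k, exactly-one constraints make the assignment a partition, and must-link / cannot-link clustering constraints become single clauses. The variable names and toy data below are illustrative assumptions, not the authors' encoding.

```python
# Hypothetical sketch of clustering-as-SAT, not the paper's encoding.
# Requires: pip install z3-solver
from z3 import And, AtMost, Bool, Not, Or, Solver, is_true, sat

n_points, n_clusters = 5, 2
must_link = [(0, 1)]      # these points must share a cluster
cannot_link = [(0, 4)]    # these points must be separated

# x[i][k] is true iff point i is assigned to cluster k.
x = [[Bool(f"x_{i}_{k}") for k in range(n_clusters)] for i in range(n_points)]

s = Solver()
for i in range(n_points):
    s.add(Or(x[i]))           # each point lies in at least one cluster...
    s.add(AtMost(*x[i], 1))   # ...and in at most one
for i, j in must_link:
    for k in range(n_clusters):
        s.add(x[i][k] == x[j][k])
for i, j in cannot_link:
    for k in range(n_clusters):
        s.add(Not(And(x[i][k], x[j][k])))

if s.check() == sat:
    m = s.model()
    labels = [next(k for k in range(n_clusters)
                   if is_true(m.evaluate(x[i][k], model_completion=True)))
              for i in range(n_points)]
    print(labels)   # e.g. [0, 0, 1, 1, 1]: 0 and 1 together, 0 and 4 apart
```

The framework in the paper goes beyond this feasibility check, tying the encoding to an interpretable cluster description and optimizing solution quality, which is where its guarantees come from.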
no code implementations • 31 May 2022 • Keiran Paster, Sheila McIlraith, Jimmy Ba
In all tested domains, ESPER achieves significantly better alignment between the target return and the achieved return than simply conditioning on realized returns.
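The failure mode at stake can be seen in a two-armed toy problem. The snippet below is a hypothetical illustration of the phenomenon, not the ESPER algorithm: conditioning on realized returns makes a policy chase lucky outcomes, while relabeling trajectories with expected returns keeps the advertised targets achievable.

```python
# Toy illustration of why conditioning on realized returns fails in
# stochastic environments; a sketch, not the ESPER implementation.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
arms = rng.integers(0, 2, size=n)            # arm 0: always pays 1.0
lucky = rng.random(n) < 0.1                  # arm 1: 10.0 w.p. 0.1, else 0.0
returns = np.where(arms == 0, 1.0, np.where(lucky, 10.0, 0.0))
true_expected = np.array([1.0, 1.0])         # both arms have mean return 1.0

# Naive return conditioning: pick the arm whose *realized* returns best
# match the target. Asking for 10 selects the lucky arm...
target = 10.0
naive_arm = min(range(2), key=lambda a: abs(returns[arms == a] - target).min())
print("naive arm:", naive_arm,
      "-> expected achieved return:", true_expected[naive_arm])
# naive arm: 1 -> expected achieved return: 1.0 (target 10, achieved ~1)

# ESPER-style relabeling: replace each trajectory's return with its arm's
# *expected* return before conditioning. A return of 10 never appears as a
# target anymore, so any target the model accepts is achievable on average.
relabeled = np.array([returns[arms == a].mean() for a in range(2)])[arms]
print("achievable targets after relabeling:", np.unique(relabeled.round(2)))
```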
no code implementations • 6 Oct 2021 • Christian Muise, Vaishak Belle, Paolo Felli, Sheila McIlraith, Tim Miller, Adrian R. Pearce, Liz Sonenberg
Many AI applications involve the interaction of multiple autonomous agents, requiring those agents to reason about their own beliefs, as well as those of other agents.
1 code implementation • 13 Feb 2021 • Pashootan Vaezipoor, Andrew Li, Rodrigo Toro Icarte, Sheila McIlraith
We address the problem of teaching a deep reinforcement learning (RL) agent to follow instructions in multi-task environments.
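In this work (the LTL2Action paper) the instructions are linear temporal logic (LTL) formulas, and a central operation is formula progression: after each environment step the formula is rewritten to whatever obligation remains, so the agent always conditions on the rest of the task. Below is a minimal progression routine for a small LTL fragment; the tuple encoding and helper names are illustrative assumptions, not the paper's implementation.

```python
# Minimal LTL progression over a tiny fragment; the encoding is assumed.
# Formulas: 'true', 'false', ('prop', p), ('and', f, g), ('or', f, g),
#           ('next', f), ('until', f, g), ('eventually', f)

def prog(f, sigma):
    """Progress formula f through one step in which the propositions in sigma hold."""
    if f in ('true', 'false'):
        return f
    op = f[0]
    if op == 'prop':
        return 'true' if f[1] in sigma else 'false'
    if op == 'and':
        return _and(prog(f[1], sigma), prog(f[2], sigma))
    if op == 'or':
        return _or(prog(f[1], sigma), prog(f[2], sigma))
    if op == 'next':
        return f[1]
    if op == 'until':       # f1 U f2  ->  prog(f2) or (prog(f1) and f1 U f2)
        return _or(prog(f[2], sigma), _and(prog(f[1], sigma), f))
    if op == 'eventually':  # F f1  ->  prog(f1) or F f1
        return _or(prog(f[1], sigma), f)
    raise ValueError(op)

def _and(a, b):
    if 'false' in (a, b): return 'false'
    if a == 'true': return b
    if b == 'true': return a
    return ('and', a, b)

def _or(a, b):
    if 'true' in (a, b): return 'true'
    if a == 'false': return b
    if b == 'false': return a
    return ('or', a, b)

# "Eventually reach the door, then eventually pick up the key."
task = ('eventually', ('and', ('prop', 'door'),
                       ('next', ('eventually', ('prop', 'key')))))
task = prog(task, {'door'})   # the obligation ('eventually', ('prop', 'key')) remains
print(task)
task = prog(task, {'key'})
print(task)                   # 'true': the instruction has been satisfied
```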
1 code implementation • NeurIPS 2019 • Rodrigo Toro Icarte, Ethan Waldie, Toryn Klassen, Rick Valenzano, Margarita Castro, Sheila McIlraith
Reward Machines (RMs), originally proposed for specifying problems in Reinforcement Learning (RL), provide a structured, automata-based representation of a reward function that allows an agent to decompose a problem into subproblems, each of which can be learned efficiently with off-policy methods.
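The decomposition works by keeping one Q-function per RM state: because the machine makes rewards and transitions explicit, a single environment transition can update every RM state's Q-function counterfactually, which is what enables off-policy reuse (the QRM algorithm from the companion ICML 2018 paper). The tabular sketch below uses hypothetical names and a toy coffee-delivery machine; it is a simplification, not the authors' code.

```python
# Simplified tabular QRM-style update; illustrative, not the paper's code.
from collections import defaultdict

GAMMA, ALPHA = 0.9, 0.1

# A toy reward machine as a dict: rm_delta[(u, event)] -> (u', reward).
rm_states = ['u0', 'u1']
rm_delta = {
    ('u0', 'coffee'): ('u1', 0.0),   # got coffee, no reward yet
    ('u1', 'office'): ('u0', 1.0),   # delivered coffee: reward, task resets
}

def rm_step(u, event):
    return rm_delta.get((u, event), (u, 0.0))   # self-loop on irrelevant events

# One Q-table per RM state: Q[u][s][a] -> value.
Q = {u: defaultdict(lambda: defaultdict(float)) for u in rm_states}

def qrm_update(s, a, s2, event, actions):
    """Update EVERY RM state's Q-function from one environment transition."""
    for u in rm_states:                 # counterfactual: as if we had been in u
        u2, r = rm_step(u, event)
        best_next = max(Q[u2][s2][a2] for a2 in actions) if actions else 0.0
        td_target = r + GAMMA * best_next
        Q[u][s][a] += ALPHA * (td_target - Q[u][s][a])

# e.g. the agent moved from 'hall' to 'kitchen' and observed the event 'coffee':
qrm_update('hall', 'go_kitchen', 'kitchen', 'coffee',
           actions=['go_kitchen', 'go_office'])
```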
Tasks: Partially Observable Reinforcement Learning, Problem Decomposition, +3 more
1 code implementation • ICML 2018 • Rodrigo Toro Icarte, Toryn Klassen, Richard Valenzano, Sheila McIlraith
In this paper we propose Reward Machines: a type of finite state machine that supports the specification of reward functions while exposing reward function structure to the learner and supporting decomposition.
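Concretely, a reward machine is a finite state machine whose transitions fire on high-level events and carry rewards. A minimal encoding might look as follows; the class shape and the key-then-door example are illustrative assumptions, not code from the paper.

```python
# Minimal reward machine: an FSM mapping (state, event) -> (next state, reward).
# A hypothetical sketch of the idea, not the authors' implementation.
class RewardMachine:
    def __init__(self, initial, transitions, terminal=()):
        self.delta = transitions          # {(u, event): (u', reward)}
        self.terminal = set(terminal)
        self.u = initial

    def step(self, event):
        """Advance on an observed event; unknown events self-loop with 0 reward."""
        self.u, reward = self.delta.get((self.u, event), (self.u, 0.0))
        return reward, self.u in self.terminal

# "Get the key, then open the door": reward only on completing the sequence.
rm = RewardMachine('start',
                   {('start', 'key'): ('has_key', 0.0),
                    ('has_key', 'door'): ('done', 1.0)},
                   terminal=('done',))
print(rm.step('door'))  # (0.0, False): door before key does nothing
print(rm.step('key'))   # (0.0, False)
print(rm.step('door'))  # (1.0, True)
```

Exposing this structure to the learner, rather than hiding the reward inside a black-box function, is what makes the decomposition sketched in the earlier entry possible.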