no code implementations • 18 Jan 2023 • Megan M. Baker, Alexander New, Mario Aguilar-Simon, Ziad Al-Halah, Sébastien M. R. Arnold, Ese Ben-Iwhiwhu, Andrew P. Brna, Ethan Brooks, Ryan C. Brown, Zachary Daniels, Anurag Daram, Fabien Delattre, Ryan Dellana, Eric Eaton, Haotian Fu, Kristen Grauman, Jesse Hostetler, Shariq Iqbal, Cassandra Kent, Nicholas Ketz, Soheil Kolouri, George Konidaris, Dhireesha Kudithipudi, Erik Learned-Miller, Seungwon Lee, Michael L. Littman, Sandeep Madireddy, Jorge A. Mendez, Eric Q. Nguyen, Christine D. Piatko, Praveen K. Pilly, Aswin Raghavan, Abrar Rahman, Santhosh Kumar Ramakrishnan, Neale Ratzlaff, Andrea Soltoggio, Peter Stone, Indranil Sur, Zhipeng Tang, Saket Tiwari, Kyle Vedder, Felix Wang, Zifan Xu, Angel Yanguas-Gil, Harel Yedidsion, Shangqun Yu, Gautam K. Vallabha
Despite the advancement of machine learning techniques in recent years, state-of-the-art systems lack robustness to "real world" events: the input distributions and tasks encountered by deployed systems will not be limited to the original training context, so systems will need to adapt to novel distributions and tasks while deployed.
1 code implementation • 7 Dec 2022 • Zhiyuan Zhou, Henry Sowerby, Michael L. Littman
Next, we propose an environment-independent tiered reward structure and show it is guaranteed to induce policies that are Pareto-optimal according to our preference relation.
1 code implementation • 26 Nov 2022 • Charles Lovering, Jessica Zosa Forde, George Konidaris, Ellie Pavlick, Michael L. Littman
AlphaZero, an approach to reinforcement learning that couples neural networks and Monte Carlo tree search (MCTS), has produced state-of-the-art strategies for traditional board games like chess, Go, shogi, and Hex.
no code implementations • 7 Nov 2022 • Lucas Lehnert, Michael J. Frank, Michael L. Littman
Recent advances in reinforcement-learning research have demonstrated impressive results in building algorithms that can outperform humans in complex tasks.
no code implementations • 27 Oct 2022 • Michael L. Littman, Ifeoma Ajunwa, Guy Berger, Craig Boutilier, Morgan Currie, Finale Doshi-Velez, Gillian Hadfield, Michael C. Horowitz, Charles Isbell, Hiroaki Kitano, Karen Levy, Terah Lyons, Melanie Mitchell, Julie Shah, Steven Sloman, Shannon Vallor, Toby Walsh
In September 2021, the "One Hundred Year Study on Artificial Intelligence" project (AI100) issued the second report of its planned long-term periodic assessment of artificial intelligence (AI) and its impact on society.
no code implementations • 30 May 2022 • Henry Sowerby, Zhiyuan Zhou, Michael L. Littman
To solve this optimization problem, we propose a linear-programming based algorithm that efficiently finds a reward function that maximizes action gap and minimizes subjective discount.
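The snippet above leaves the optimization implicit; the sketch below illustrates the linear-programming idea under heavy assumptions (a 3-state deterministic chain, a fixed horizon, a designer-chosen target policy, and state-based rewards bounded in [-1, 1]) rather than the paper's exact formulation. Because finite-horizon values under a fixed policy are linear in the reward vector, maximizing the action gap becomes a linear program.

```python
# Hypothetical sketch: pick state-based rewards r(s) in [-1, 1] for a
# 3-state deterministic chain so that the target policy "always go right"
# has the largest possible action gap at a fixed horizon H. This is an
# illustration of the LP idea, not the paper's formulation.
import numpy as np
from scipy.optimize import linprog

n_states, H = 3, 4
RIGHT = lambda s: min(s + 1, n_states - 1)   # target action's successor
LEFT = lambda s: max(s - 1, 0)               # alternative action's successor

def value_coeffs(s, h):
    """Coefficients c with V_h^target(s) = c @ r when following the target policy."""
    c = np.zeros(n_states)
    for _ in range(h):
        s = RIGHT(s)
        c[s] += 1.0          # reward is collected on entering each state
    return c

# Decision variables: x = [r(0), r(1), r(2), gap]; maximize gap.
A_ub, b_ub = [], []
for s in range(n_states):
    q_right = np.zeros(n_states); q_right[RIGHT(s)] += 1.0
    q_right += value_coeffs(RIGHT(s), H - 1)
    q_left = np.zeros(n_states); q_left[LEFT(s)] += 1.0
    q_left += value_coeffs(LEFT(s), H - 1)
    # Require Q(s, right) - Q(s, left) >= gap, i.e. gap - (q_right - q_left) @ r <= 0.
    A_ub.append(np.append(-(q_right - q_left), 1.0))
    b_ub.append(0.0)

c = np.zeros(n_states + 1); c[-1] = -1.0     # linprog minimizes, so negate the gap
bounds = [(-1, 1)] * n_states + [(None, None)]
res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub, bounds=bounds)
print("rewards:", res.x[:n_states], "action gap:", res.x[-1])
```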
no code implementations • 10 Dec 2021 • Kavosh Asadi, Rasool Fakoor, Omer Gottesman, Taesup Kim, Michael L. Littman, Alexander J. Smola
We employ Proximal Iteration for value-function optimization in deep reinforcement learning.
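As a rough, hypothetical illustration (not the paper's exact algorithm), Proximal Iteration-style training can be sketched as a standard TD loss plus a quadratic term that keeps the online parameters near a slowly-moving anchor; here the anchor is assumed to be the target network and the penalty weight c is arbitrary.

```python
# Minimal sketch (assumptions: DQN-style agent, target network used as the
# proximal anchor, penalty weight c chosen arbitrarily). The only change from
# a standard TD update is the quadratic proximal term added to the loss.
import torch
import torch.nn as nn

def proximal_td_loss(online_net, target_net, batch, gamma=0.99, c=1.0):
    s, a, r, s_next, done = batch  # tensors: states, actions, rewards, next states, done flags
    q = online_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next = target_net(s_next).max(dim=1).values
        target = r + gamma * (1.0 - done) * q_next
    td_loss = nn.functional.mse_loss(q, target)

    # Proximal term: keep online parameters close to the anchor parameters.
    prox = sum(((p - p_anchor.detach()) ** 2).sum()
               for p, p_anchor in zip(online_net.parameters(),
                                      target_net.parameters()))
    return td_loss + 0.5 * c * prox
```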
no code implementations • NeurIPS 2021 • David Abel, Will Dabney, Anna Harutyunyan, Mark K. Ho, Michael L. Littman, Doina Precup, Satinder Singh
We then provide a set of polynomial-time algorithms that construct a Markov reward function that allows an agent to optimize tasks of each of these three types, and correctly determine when no such reward function exists.
no code implementations • 7 Oct 2021 • David Abel, Cameron Allen, Dilip Arumugam, D. Ellis Hershkowitz, Michael L. Littman, Lawson L. S. Wong
We address this question by proposing a simple measure of reinforcement-learning hardness called the bad-policy density.
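One way to make such a measure concrete, purely as an illustration and under assumptions not stated in the snippet (a tiny tabular MDP, deterministic policies only, and "bad" meaning a start-state value below a fixed threshold), is to enumerate all policies and report the fraction that are bad:

```python
# Hypothetical illustration of a "bad-policy density"-style measure on a tiny
# MDP: enumerate all deterministic policies, evaluate each exactly, and report
# the fraction whose start-state value falls below a threshold.
import itertools
import numpy as np

n_s, n_a, gamma = 3, 2, 0.9
# P[a, s, s'] : transition probabilities; R[s, a] : expected reward.
P = np.zeros((n_a, n_s, n_s))
P[0] = [[1, 0, 0], [1, 0, 0], [0, 1, 0]]             # action 0: move "left"
P[1] = [[0, 1, 0], [0, 0, 1], [0, 0, 1]]             # action 1: move "right"
R = np.array([[0.0, 0.0], [0.0, 0.0], [0.0, 1.0]])   # reward for "right" in state 2

def evaluate(policy):
    """Exact policy evaluation: V = (I - gamma * P_pi)^(-1) r_pi."""
    P_pi = np.array([P[policy[s], s] for s in range(n_s)])
    r_pi = np.array([R[s, policy[s]] for s in range(n_s)])
    return np.linalg.solve(np.eye(n_s) - gamma * P_pi, r_pi)

threshold, start = 5.0, 0
policies = list(itertools.product(range(n_a), repeat=n_s))
bad = sum(1 for pi in policies if evaluate(pi)[start] < threshold)
print(f"bad-policy density: {bad}/{len(policies)} = {bad / len(policies):.2f}")
```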
no code implementations • 15 Sep 2021 • Ishaan Shah, David Halpern, Kavosh Asadi, Michael L. Littman
We propose a variant of COACH, episodic COACH (E-COACH), which we prove converges for all three types.
no code implementations • 10 Jun 2021 • Jeff Druce, James Niehaus, Vanessa Moody, David Jensen, Michael L. Littman
The advances in artificial intelligence enabled by deep learning architectures are undeniable.
no code implementations • 14 May 2021 • Mark K. Ho, David Abel, Carlos G. Correa, Michael L. Littman, Jonathan D. Cohen, Thomas L. Griffiths
We propose a computational account of this simplification process and, in a series of pre-registered behavioral experiments, show that it is subject to online cognitive control and that people optimally balance the complexity of a task representation and its utility for planning and acting.
1 code implementation • 7 Aug 2020 • Mingxuan Li, Michael L. Littman
We demonstrate the potential of graph neural networks to support sample-efficient learning by showing that the Deep Graph Value Network can outperform unstructured baselines by a large margin when solving Markov decision processes (MDPs).
no code implementations • WS 2020 • Zachary Horvitz, Nam Do, Michael L. Littman
While mysterious, humor likely hinges on an interplay of entities, their relationships, and cultural connotations.
no code implementations • 13 Feb 2020 • Mark K. Ho, David Abel, Jonathan D. Cohen, Michael L. Littman, Thomas L. Griffiths
Thus, people should plan their actions, but they should also be smart about how they deploy resources used for planning their actions.
2 code implementations • 8 Feb 2020 • Kavosh Asadi, David Abel, Michael L. Littman
In this work, we answer this question in the affirmative, where we take "simple learning algorithm" to be tabular Q-Learning, the "good representations" to be a learned state abstraction, and "challenging problems" to be continuous control tasks.
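A minimal sketch of the final stage of that pipeline, assuming the state abstraction has already been learned (here it is stubbed out as a simple discretizer named phi, a hypothetical stand-in): continuous observations are mapped to abstract states and ordinary tabular Q-learning runs on top. The env interface (reset/step returning observation, reward, done) is also an assumption.

```python
# Sketch of tabular Q-learning over a learned state abstraction phi.
# phi here is a stand-in discretizer; in the learned-abstraction setting it
# would be a trained mapping from continuous observations to abstract states.
import random
from collections import defaultdict

import numpy as np

def phi(observation, bins=10, low=-1.0, high=1.0):
    """Hypothetical abstraction: bucket each observation dimension."""
    clipped = np.clip(observation, low, high)
    idx = ((clipped - low) / (high - low) * (bins - 1)).astype(int)
    return tuple(idx)

def q_learning(env, n_actions, episodes=500, alpha=0.1, gamma=0.99, eps=0.1):
    Q = defaultdict(lambda: np.zeros(n_actions))
    for _ in range(episodes):
        s = phi(env.reset())
        done = False
        while not done:
            a = random.randrange(n_actions) if random.random() < eps else int(np.argmax(Q[s]))
            obs, r, done = env.step(a)          # assumed env interface
            s2 = phi(obs)
            Q[s][a] += alpha * (r + gamma * (0 if done else np.max(Q[s2])) - Q[s][a])
            s = s2
    return Q
```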
no code implementations • 5 Feb 2020 • Kavosh Asadi, Neev Parikh, Ronald E. Parr, George D. Konidaris, Michael L. Littman
We show that the maximum action-value with respect to a deep RBVF can be approximated easily and accurately.
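To see why such an approximation is plausible, the toy sketch below builds a normalized radial-basis action-value function over a 1-D action space (the parameterization is illustrative, not taken from the paper): with a narrow bandwidth, the maximum over actions is close to the largest centroid value.

```python
# Illustrative normalized-RBF action-value function over a 1-D action space:
# Q(a) is a softmax-weighted combination of per-centroid values v_i, so its
# maximum over actions is close to max_i v_i when the bandwidth is narrow.
import numpy as np

centroids = np.array([-0.8, 0.1, 0.7])   # centroid locations in action space
values = np.array([0.2, 1.0, 0.5])       # value associated with each centroid
beta = 25.0                              # RBF precision (narrow bandwidth)

def q(a):
    w = np.exp(-beta * (a - centroids) ** 2)
    return float(w @ values / w.sum())

grid = np.linspace(-1, 1, 2001)
q_max = max(q(a) for a in grid)
print(f"max_a Q(a) over a grid: {q_max:.3f}   max_i v_i: {values.max():.3f}")
```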
1 code implementation • 15 Jan 2020 • Erwan Lecarpentier, David Abel, Kavosh Asadi, Yuu Jinnai, Emmanuel Rachelson, Michael L. Littman
We consider the problem of knowledge transfer when an agent is facing a series of Reinforcement Learning (RL) tasks.
1 code implementation • 8 Dec 2019 • John R. Zech, Jessica Zosa Forde, Michael L. Littman
Averaging predictions from 10 models reduced variability by nearly 70% (mean coefficient of variation fell from 0.543 to 0.169; t-test statistic 15.96, p-value < 0.0001).
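As a synthetic illustration of the arithmetic (not the paper's data), averaging independent noisy predictions shrinks the coefficient of variation by roughly the square root of the ensemble size, which is consistent with a near-70% reduction for 10 models:

```python
# Synthetic illustration of how averaging an ensemble's predictions lowers the
# coefficient of variation (CV = std / mean) of the output for a fixed input.
import numpy as np

rng = np.random.default_rng(0)
n_models, n_trials = 10, 1000
# Each "model" produces a noisy probability for the same case.
single = rng.normal(loc=0.6, scale=0.15, size=n_trials).clip(0, 1)
ensemble = rng.normal(loc=0.6, scale=0.15, size=(n_trials, n_models)).clip(0, 1).mean(axis=1)

cv = lambda x: x.std() / x.mean()
print(f"single-model CV:  {cv(single):.3f}")
print(f"ensemble-mean CV: {cv(ensemble):.3f}  (roughly 1/sqrt({n_models}) smaller)")
```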
no code implementations • 23 Aug 2019 • Matt Cooper, Jun Ki Lee, Jacob Beck, Joshua D. Fishman, Michael Gillett, Zoë Papakipos, Aaron Zhang, Jerome Ramos, Aansh Shah, Michael L. Littman
This idea generalizes the concept of a Stackelberg equilibrium.
no code implementations • 19 Jul 2019 • Robert Loftin, Bei Peng, Matthew E. Taylor, Michael L. Littman, David L. Roberts
In order for robots and other artificial agents to efficiently learn to perform useful tasks defined by an end user, they must understand not only the goals of those tasks, but also the structure and dynamics of that user's environment.
no code implementations • 12 Feb 2019 • Dilip Arumugam, Jun Ki Lee, Sophie Saskin, Michael L. Littman
To widen their accessibility and increase their utility, intelligent agents must be able to learn complex behaviors as specified by (non-expert) human users.
no code implementations • 31 Jan 2019 • Lucas Lehnert, Michael L. Littman
A key question in reinforcement learning is how an intelligent agent can generalize knowledge across different inputs.
no code implementations • 18 Jan 2019 • Michael Shum, Max Kleiman-Weiner, Michael L. Littman, Joshua B. Tenenbaum
This representation is grounded in the formalism of stochastic games and multi-agent reinforcement learning.
no code implementations • 3 Dec 2018 • Dilip Arumugam, David Abel, Kavosh Asadi, Nakul Gopalan, Christopher Grimm, Jun Ki Lee, Lucas Lehnert, Michael L. Littman
An agent with an inaccurate model of its environment faces a difficult choice: it can ignore the errors in its model and act in the real world in whatever way it determines is optimal with respect to its model.
Tasks: Model-based Reinforcement Learning, reinforcement-learning, +1
no code implementations • 31 Oct 2018 • Kavosh Asadi, Evan Cater, Dipendra Misra, Michael L. Littman
When environmental interaction is expensive, model-based reinforcement learning offers a solution by planning ahead and avoiding costly mistakes.
Tasks: Model-based Reinforcement Learning, reinforcement-learning, +1
no code implementations • 4 Jul 2018 • Lucas Lehnert, Michael L. Littman
Further, we present a Successor Feature model which shows that learning Successor Features is equivalent to learning a Model-Reduction.
no code implementations • 1 Jun 2018 • Kavosh Asadi, Evan Cater, Dipendra Misra, Michael L. Littman
Learning a generative model is a key component of model-based reinforcement learning.
Tasks: Model-based Reinforcement Learning, reinforcement-learning, +1
1 code implementation • ICML 2018 • Kavosh Asadi, Dipendra Misra, Michael L. Littman
We go on to prove an error bound for the value-function estimate arising from Lipschitz models and show that the estimated value function is itself Lipschitz.
Tasks: Model-based Reinforcement Learning, reinforcement-learning, +1
1 code implementation • 26 Oct 2017 • Yuhang Song, Christopher Grimm, Xianming Wang, Michael L. Littman
We examine the problem of learning mappings from state to state, suitable for use in a model-based reinforcement-learning setting, that simultaneously generalize to novel states and can capture stochastic transitions.
Tasks: Model-based Reinforcement Learning, reinforcement-learning, +1
no code implementations • 19 Sep 2017 • Christopher Grimm, Yuhang Song, Michael L. Littman
Generative adversarial networks (GANs) are an exciting alternative to other algorithms for solving density estimation problems: using data to assess how likely samples are to be drawn from the same distribution.
no code implementations • 31 Jul 2017 • Lucas Lehnert, Stefanie Tellex, Michael L. Littman
One question central to reinforcement learning is how to learn a feature representation that supports algorithm scaling and the re-use of learned information across different tasks.
no code implementations • ICLR 2018 • Christopher Grimm, Dilip Arumugam, Siddharth Karamcheti, David Abel, Lawson L. S. Wong, Michael L. Littman
Deep neural networks are able to solve tasks across a variety of domains and modalities of data.
no code implementations • 14 Apr 2017 • Michael L. Littman, Ufuk Topcu, Jie Fu, Charles Isbell, Min Wen, James Macglashan
We propose a new task-specification language for Markov decision processes that is designed to be an improvement over reward functions by being environment independent.
no code implementations • ICML 2017 • James MacGlashan, Mark K. Ho, Robert Loftin, Bei Peng, Guan Wang, David Roberts, Matthew E. Taylor, Michael L. Littman
This paper investigates the problem of interactively learning behaviors communicated by a human teacher using positive and negative feedback.
1 code implementation • 15 Jan 2017 • David Abel, D. Ellis Hershkowitz, Michael L. Littman
The combinatorial explosion that plagues planning and reinforcement learning (RL) algorithms can be moderated using state abstraction.
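A minimal sketch of one common family of abstractions, used here purely as an illustration rather than as the paper's specific construction: ground states whose optimal action values agree within a tolerance eps are merged into a single abstract state.

```python
# Illustration of an approximate state abstraction: merge ground states whose
# optimal Q-values agree (within eps) for every action. Q_star is assumed to
# be given as a dict {state: np.array of action values}.
import numpy as np

def q_similarity_abstraction(Q_star, eps=0.1):
    """Map each ground state to an abstract-state id."""
    clusters = []          # list of (representative Q-vector, abstract id)
    phi = {}
    for s, q in Q_star.items():
        for rep_q, abstract_id in clusters:
            if np.max(np.abs(q - rep_q)) <= eps:
                phi[s] = abstract_id
                break
        else:
            clusters.append((q, len(clusters)))
            phi[s] = len(clusters) - 1
    return phi

# Example: four ground states collapse to two abstract states.
Q_star = {0: np.array([1.0, 0.0]), 1: np.array([1.05, 0.02]),
          2: np.array([0.0, 1.0]), 3: np.array([0.03, 0.98])}
print(q_similarity_abstraction(Q_star))   # e.g. {0: 0, 1: 0, 2: 1, 3: 1}
```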
1 code implementation • ICML 2017 • Kavosh Asadi, Michael L. Littman
A softmax operator applied to a set of values acts somewhat like the maximization function and somewhat like an average.
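A small numerical sketch of that interpolation, using a log-average-exp ("mellowmax"-style) operator; the exact parameterization below is illustrative. As the parameter omega grows the operator approaches the max, and as omega shrinks toward zero it approaches the mean.

```python
# Log-average-exp ("mellowmax"-style) operator: interpolates between the mean
# (omega -> 0) and the max (omega -> infinity) of a set of values.
import numpy as np
from scipy.special import logsumexp

def mellowmax(values, omega):
    values = np.asarray(values, dtype=float)
    return (logsumexp(omega * values) - np.log(len(values))) / omega

x = [1.0, 2.0, 5.0]
print(np.mean(x), mellowmax(x, 0.01), mellowmax(x, 50.0), np.max(x))
# ~2.67        ~2.68 (near mean)      ~5.0 (near max)     5.0
```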
no code implementations • 6 Feb 2013 • Anthony R. Cassandra, Michael L. Littman, Nevin Lianwen Zhang
Most exact algorithms for general partially observable Markov decision processes (POMDPs) use a form of dynamic programming in which a piecewise-linear and convex representation of one value function is transformed into another.
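A minimal sketch of that representation on a made-up two-state example: the value function is stored as a set of alpha-vectors, and the value of a belief is the maximum of their inner products with the belief, which is exactly a piecewise-linear and convex function of the belief.

```python
# Piecewise-linear convex value function over beliefs, represented by a set of
# alpha-vectors (toy two-state example; the vectors are made up for
# illustration). V(b) = max_alpha <alpha, b>.
import numpy as np

alpha_vectors = np.array([
    [0.0, 10.0],   # e.g. value of a plan that pays off in state 1
    [10.0, 0.0],   # e.g. value of a plan that pays off in state 0
    [6.0, 6.0],    # e.g. value of an information-gathering plan
])

def value(belief):
    belief = np.asarray(belief)
    return float(np.max(alpha_vectors @ belief))

for p in (0.0, 0.3, 0.5, 0.7, 1.0):
    b = np.array([p, 1.0 - p])        # belief over the two states
    print(f"b = [{p:.1f}, {1 - p:.1f}]  V(b) = {value(b):.2f}")
```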
no code implementations • 10 Jan 2013 • Michael Kearns, Michael L. Littman, Satinder Singh
The interpretation is that the payoff to player i is determined entirely by the actions of player i and his neighbors in the graph, and thus the payoff matrix to player i is indexed only by these players.
no code implementations • NeurIPS 2008 • Ali Nouri, Michael L. Littman
The essence of exploration is acting to try to decrease uncertainty.
1 code implementation • ICML '08: Proceedings of the 25th international conference on Machine learning 2008 • Carlos Diuk, Andre Cohen, Michael L. Littman
Rich representations in reinforcement learning have been studied for the purpose of enabling generalization and making learning feasible in large state spaces.
no code implementations • NeurIPS 2007 • Alexander L. Strehl, Michael L. Littman
We provide a provably efficient algorithm for learning Markov Decision Processes (MDPs) with continuous state and action spaces in the online setting.