no code implementations • NAACL (BEA) 2022 • Greg Keim, Michael Littman
We explore a novel approach to reading compliance, leveraging large language models to select inline challenges that discourage skipping during reading.
no code implementations • NAACL 2022 • Cynthia Sullivan, William Brackenbury, Andrew McNutt, Kevin Bryson, Kbyllofficial@gmail.com, Yuxin Chen, Michael Littman, Chenhao Tan, Blase Ur
In the context of data labeling, NLP researchers are increasingly interested in having humans select rationales, a subset of input tokens relevant to the chosen label.
no code implementations • 9 Mar 2023 • Cambridge Yang, Michael Littman, Michael Carbin
In particular, for the analysis that considers only sample complexity, we prove that if an objective given as an oracle is uniformly continuous, then it is PAC-learnable.
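As a hedged illustration of the kind of condition involved, the uniform-continuity requirement can be sketched in LaTeX as follows; the objective $J$ and the metric $d$ over environments are illustrative notation, not the paper's exact formalization.

```latex
% Uniform continuity of an objective J over environments (illustrative notation):
% environments that are close under d must have close objective values.
\[
\forall \varepsilon > 0,\ \exists \delta > 0:\quad
d(M, M') < \delta \;\Longrightarrow\; \bigl| J(M) - J(M') \bigr| < \varepsilon .
\]
```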
2 code implementations • 20 Oct 2022 • Haotian Fu, Shangqun Yu, Michael Littman, George Konidaris
We propose a model-based lifelong reinforcement-learning approach that estimates a hierarchical Bayesian posterior distilling the common structure shared across different tasks.
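A minimal sketch, assuming a simple Normal-Normal model, of what "a hierarchical Bayesian posterior distilling common structure" can look like in code; the class and update rules below are illustrative and are not the paper's algorithm.

```python
# Illustrative sketch (not the paper's method): a global Normal prior over task
# parameters is refined as tasks are solved, and each new task gets its own
# conjugate posterior that starts from the shared prior.
import numpy as np

class HierarchicalTaskPrior:
    def __init__(self, dim, prior_var=1.0, task_var=0.1):
        self.mu = np.zeros(dim)      # global mean over task parameters
        self.prior_var = prior_var   # variance of the shared prior
        self.task_var = task_var     # within-task observation variance
        self.task_means = []         # point estimates from previously solved tasks

    def task_posterior(self, observations):
        """Normal-Normal conjugate update for a single task's parameters."""
        n = len(observations)
        post_var = 1.0 / (1.0 / self.prior_var + n / self.task_var)
        post_mean = post_var * (self.mu / self.prior_var +
                                np.sum(observations, axis=0) / self.task_var)
        return post_mean, post_var

    def update_global(self, task_mean):
        """Distill shared structure by shifting the global prior toward solved tasks."""
        self.task_means.append(task_mean)
        self.mu = np.mean(self.task_means, axis=0)
```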
1 code implementation • 7 Jun 2022 • Haotian Fu, Shangqun Yu, Saket Tiwari, Michael Littman, George Konidaris
We propose a novel parameterized skill-learning algorithm that aims to learn transferable parameterized skills and synthesize them into a new action space that supports efficient learning in long-horizon tasks.
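The interface implied by the abstract can be sketched as follows, assuming a skill is a policy conditioned on a continuous parameter and the synthesized action space consists of (skill index, parameter) pairs; all names here are hypothetical.

```python
# Hedged sketch of a parameterized-skill interface; not the paper's implementation.
import numpy as np

class ParameterizedSkill:
    def __init__(self, policy_fn, param_dim):
        self.policy_fn = policy_fn   # maps (observation, parameter) -> low-level action
        self.param_dim = param_dim

    def act(self, observation, parameter):
        return self.policy_fn(observation, parameter)

def sample_skill_action(skills, rng):
    """An 'action' in the synthesized space: which skill to execute, with what parameter."""
    idx = rng.integers(len(skills))
    parameter = rng.standard_normal(skills[idx].param_dim)
    return idx, parameter
```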
no code implementations • 20 Mar 2022 • Bowen He, Sreehari Rammohan, Jessica Forde, Michael Littman
In this work, we study two self-play training schemes, Chainer and Pool, and show they lead to improved agent performance in Atari Pong compared to a standard DQN agent trained against the built-in Atari opponent.
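A small sketch of how the two schemes might differ mechanically, under my own assumptions about their internals (the abstract only names them): "Chainer" always plays the most recent frozen copy of the agent, while "Pool" samples an opponent from all past snapshots.

```python
# Hedged illustration of snapshot-based self-play opponent selection.
import copy
import random

class SelfPlayOpponents:
    def __init__(self, scheme="pool"):
        self.scheme = scheme
        self.snapshots = []

    def add_snapshot(self, agent):
        """Freeze a copy of the current agent for use as a future opponent."""
        self.snapshots.append(copy.deepcopy(agent))

    def sample_opponent(self):
        if not self.snapshots:
            return None  # fall back to the built-in opponent
        if self.scheme == "chainer":
            return self.snapshots[-1]          # always the latest frozen copy
        return random.choice(self.snapshots)   # uniform over all past snapshots
```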
no code implementations • 9 Dec 2021 • Yiheng Xie, Mingxuan Li, Shangqun Yu, Michael Littman
Though deep reinforcement learning agents have achieved unprecedented success in recent years, their learned policies can be brittle, failing to generalize to even slight modifications of their environments or unfamiliar situations.
no code implementations • 24 Nov 2021 • Cambridge Yang, Michael Littman, Michael Carbin
In recent years, researchers have made significant progress in devising reinforcement-learning algorithms for optimizing linear temporal logic (LTL) objectives and LTL-like objectives.
no code implementations • AAAI Workshop CLeaR 2022 • Cambridge Yang, Michael Littman, Michael Carbin
In recent years, researchers have made significant progress in devising reinforcement-learning algorithms for optimizing linear temporal logic (LTL) objectives and LTL-like objectives.
no code implementations • 7 Nov 2021 • Homer Walke, Daniel Ritter, Carl Trimbach, Michael Littman
Finite linear temporal logic ($\mathsf{LTL}_f$) is a powerful formal representation for modeling temporal sequences.
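For readers unfamiliar with finite-trace semantics, here is a toy illustration of two common temporal operators evaluated over a finite trace; it is a generic example, not the representation studied in the paper.

```python
# LTLf-style evaluation of "eventually" (F) and "globally" (G) over a finite trace.
def eventually(prop, trace):
    """F prop: prop holds at some step of the finite trace."""
    return any(prop(state) for state in trace)

def globally(prop, trace):
    """G prop: prop holds at every step of the finite trace."""
    return all(prop(state) for state in trace)

def is_open(state):
    return state == "open"

trace = ["closed", "closed", "open"]
print(eventually(is_open, trace))  # True: the door eventually opens
print(globally(is_open, trace))    # False: it is not open at every step
```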
no code implementations • 23 Oct 2021 • Omer Gottesman, Kavosh Asadi, Cameron Allen, Sam Lobel, George Konidaris, Michael Littman
We propose a new coarse-grained smoothness definition that generalizes the notion of Lipschitz continuity, is more widely applicable, and allows us to compute significantly tighter bounds on Q-functions, leading to improved learning.
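For context, the classical Lipschitz condition that the coarse-grained definition generalizes can be written as below; the metric $d$ and constant $L$ are the usual illustrative notation.

```latex
% Lipschitz continuity of a Q-function over a state-action metric space:
% nearby state-action pairs must have nearby values.
\[
\bigl| Q(s, a) - Q(s', a') \bigr| \;\le\; L \, d\bigl((s, a), (s', a')\bigr).
\]
```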
no code implementations • 29 Sep 2021 • Haotian Fu, Shangqun Yu, Michael Littman, George Konidaris
A central question in reinforcement learning (RL) is how to leverage prior knowledge to accelerate learning in new tasks.
no code implementations • 1 Apr 2021 • Jessica Zosa Forde, A. Feder Cooper, Kweku Kwegyir-Aggrey, Chris De Sa, Michael Littman
Algorithmic fairness has emphasized the role of biased data in automated decision outcomes.
no code implementations • 17 Oct 2020 • Michael Fishman, Nishanth Kumar, Cameron Allen, Natasha Danas, Michael Littman, Stefanie Tellex, George Konidaris
Unfortunately, planning to solve any specific task using an open-scope model is computationally intractable, even for state-of-the-art methods, due to the many states and actions that are necessarily present in the model but irrelevant to that problem.
no code implementations • 14 Mar 2019 • Carl Trimbach, Michael Littman
Like many problems in AI in their general form, supervised learning is computationally intractable.
no code implementations • ICLR 2019 • Jacob Beck, Zoe Papakipos, Michael Littman
Our framework learns continuous control from sub-optimal demonstration and evaluative feedback collected before training.
no code implementations • 7 Dec 2018 • Sam Witty, Jun Ki Lee, Emma Tosch, Akanksha Atrey, Michael Littman, David Jensen
We re-examine what is meant by generalization in RL, and propose several definitions based on an agent's performance in on-policy, off-policy, and unreachable states.
no code implementations • 16 Oct 2018 • Yuu Jinnai, David Abel, D. Ellis Hershkowitz, Michael Littman, George Konidaris
We formalize the problem of selecting the optimal set of options for planning as that of computing the smallest set of options so that planning converges in less than a given maximum of value-iteration passes.
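The quantity being minimized can be sketched as follows, assuming each option is summarized by a discounted transition model and an expected reward; the representation and function below are illustrative, not the paper's formulation.

```python
# Hedged sketch: count how many value-iteration sweeps a given option set needs
# before the value function converges.
import numpy as np

def vi_passes(options, n_states, tol=1e-6, max_passes=10_000):
    """`options` is a list of (discounted_transition_matrix, expected_reward) pairs."""
    v = np.zeros(n_states)
    for sweep in range(1, max_passes + 1):
        backups = [r_o + p_o @ v for (p_o, r_o) in options]  # one backup per option
        v_new = np.max(np.stack(backups), axis=0)            # greedy over options
        if np.max(np.abs(v_new - v)) < tol:
            return sweep
        v = v_new
    return max_passes
```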
no code implementations • 24 Sep 2018 • Sam Saarinen, Evan Cater, Michael Littman
Tailoring the presentation of information to the needs of individual students leads to massive gains in student outcomes (Bloom, 1984).
no code implementations • ICML 2018 • David Abel, Dilip Arumugam, Lucas Lehnert, Michael Littman
We introduce two new classes of abstractions: (1) transitive state abstractions, whose optimal form can be computed efficiently, and (2) PAC state abstractions, which are guaranteed to hold with respect to a distribution of tasks.
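As background, state abstractions in this line of work are typically defined by a predicate on pairs of ground states; one standard approximate predicate is sketched below, as an illustration of the shape such definitions take rather than the paper's exact classes.

```latex
% Example approximate-abstraction criterion: ground states mapped to the same
% abstract state must have near-identical optimal action values.
\[
\phi(s_1) = \phi(s_2) \;\Longrightarrow\;
\max_{a \in A} \bigl| Q^{*}(s_1, a) - Q^{*}(s_2, a) \bigr| \le \varepsilon .
\]
```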
no code implementations • ICML 2018 • David Abel, Yuu Jinnai, Sophie Yue Guo, George Konidaris, Michael Littman
We consider the problem of how best to use prior experience to bootstrap lifelong learning, where an agent faces a series of task instances drawn from some task distribution.
2 code implementations • 1 Sep 2017 • Cameron Allen, Kavosh Asadi, Melrose Roderick, Abdel-rahman Mohamed, George Konidaris, Michael Littman
We propose a new algorithm, Mean Actor-Critic (MAC), for discrete-action continuous-state reinforcement learning.
Ranked #1 on Continuous Control on Cart Pole (OpenAI Gym)
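A compact sketch of the core idea as described in the abstract: the actor update averages the critic over all discrete actions, weighted by the policy, rather than relying on a single sampled action. The loss below is an illustrative PyTorch rendering with assumed tensor shapes, not the authors' released code.

```python
# Hedged sketch of a MAC-style actor loss for discrete actions.
import torch

def mac_actor_loss(policy_logits, q_values):
    """policy_logits, q_values: tensors of shape (batch, n_actions)."""
    probs = torch.softmax(policy_logits, dim=-1)
    # Expectation of Q over all actions under the current policy;
    # gradients flow through the policy probabilities only.
    expected_q = (probs * q_values.detach()).sum(dim=-1)
    return -expected_q.mean()  # minimize negative expected action value
```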
no code implementations • NeurIPS 2016 • Mark K. Ho, Michael Littman, James MacGlashan, Fiery Cushman, Joseph L. Austerweil
Stark differences arise when demonstrators are intentionally teaching a task versus simply performing a task.