no code implementations • 24 Dec 2022 • Dilip Arumugam, Shi Dong, Benjamin Van Roy
Prevailing methods for assessing and comparing generative AI systems incentivize responses that serve a hypothetical representative individual.
no code implementations • 30 Oct 2022 • Dilip Arumugam, Satinder Singh
The Bayes-Adaptive Markov Decision Process (BAMDP) formalism pursues the Bayes-optimal solution to the exploration-exploitation trade-off in reinforcement learning.
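The Bayes-optimal solution is defined over hyperstates that pair the environment state with the agent's posterior over the unknown dynamics. A minimal sketch of that belief-state bookkeeping, assuming a tabular MDP with Dirichlet priors over transitions (not the paper's algorithm; all names are illustrative):

```python
import numpy as np

class DirichletBelief:
    """Posterior over a tabular MDP's transition dynamics.

    In a BAMDP, the agent's (hyper)state is the pair (environment state,
    belief); acting optimally in that augmented process is Bayes-optimal.
    """

    def __init__(self, n_states: int, n_actions: int, prior: float = 1.0):
        # One Dirichlet over next states per (state, action) pair.
        self.counts = np.full((n_states, n_actions, n_states), prior)

    def update(self, s: int, a: int, s_next: int) -> None:
        # Conjugate update: observing (s, a, s') increments a single count.
        self.counts[s, a, s_next] += 1.0

    def mean_transition(self, s: int, a: int) -> np.ndarray:
        # Posterior-mean transition probabilities P(s' | s, a).
        c = self.counts[s, a]
        return c / c.sum()

    def sample_transition(self, s: int, a: int) -> np.ndarray:
        # A posterior sample, e.g. for Thompson-sampling-style planning.
        return np.random.dirichlet(self.counts[s, a])
```

Exact planning over these (state, belief) hyperstates is what makes the Bayes-optimal policy generally intractable to compute.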
no code implementations • 30 Oct 2022 • Dilip Arumugam, Mark K. Ho, Noah D. Goodman, Benjamin Van Roy
Throughout the cognitive-science literature, there is widespread agreement that decision-making agents operating in the real world do so under limited information-processing capabilities and without access to unbounded cognitive or computational resources.
no code implementations • 4 Jun 2022 • Dilip Arumugam, Benjamin Van Roy
The quintessential model-based reinforcement-learning agent iteratively refines its estimates or prior beliefs about the true underlying model of the environment.
no code implementations • 4 Jun 2022 • Dilip Arumugam, Benjamin Van Roy
To address this problem, we introduce an algorithm that, using rate-distortion theory, iteratively computes an approximately-value-equivalent, lossy compression of the environment which an agent may feasibly target in lieu of the true model.
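The paper's algorithm is not reproduced here, but the iterative computation it invokes is in the spirit of the classic Blahut-Arimoto procedure from rate-distortion theory. A minimal sketch under assumed inputs: a finite set of candidate models, a prior over them, and a distortion matrix standing in for the value-equivalence gap (all names are illustrative):

```python
import numpy as np

def blahut_arimoto(p_x: np.ndarray, distortion: np.ndarray,
                   beta: float, n_iters: int = 200) -> np.ndarray:
    """Blahut-Arimoto iteration for a rate-distortion trade-off.

    p_x:        prior over source symbols (e.g. candidate environment models).
    distortion: distortion[i, j] = cost of representing model i by compressed
                model j (here, imagine a value-equivalence gap).
    beta:       Lagrange multiplier trading off rate against distortion.
    Returns the conditional encoder q(compressed model | model).
    """
    n, m = distortion.shape
    q_z = np.full(m, 1.0 / m)                  # marginal over compressed models
    for _ in range(n_iters):
        # Optimal channel given the current marginal.
        logits = np.log(q_z)[None, :] - beta * distortion
        q_z_given_x = np.exp(logits - logits.max(axis=1, keepdims=True))
        q_z_given_x /= q_z_given_x.sum(axis=1, keepdims=True)
        # Marginal induced by that channel.
        q_z = p_x @ q_z_given_x
    return q_z_given_x
```

Larger beta preserves more detail about the original model (higher rate, lower distortion); smaller beta yields a coarser, cheaper-to-learn target.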
no code implementations • NeurIPS 2021 • Dilip Arumugam, Benjamin Van Roy
All sequential decision-making agents explore so as to acquire knowledge about a particular target.
no code implementations • 7 Oct 2021 • David Abel, Cameron Allen, Dilip Arumugam, D. Ellis Hershkowitz, Michael L. Littman, Lawson L. S. Wong
We address this question by proposing a simple measure of reinforcement-learning hardness called the bad-policy density.
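Read plainly, the bad-policy density asks what fraction of policies are meaningfully suboptimal. A brute-force sketch for a tiny tabular MDP, assuming "bad" means more than epsilon below the optimal start-state value (the paper's formal definition may differ in its details):

```python
import itertools
import numpy as np

def bad_policy_density(P: np.ndarray, R: np.ndarray, gamma: float,
                       start: int, epsilon: float) -> float:
    """Fraction of deterministic policies more than epsilon below optimal.

    P: transition tensor of shape (S, A, S);  R: reward matrix of shape (S, A).
    Each policy is evaluated exactly via the Bellman equations, so this
    enumeration (A ** S policies) only makes sense for very small MDPs.
    """
    S, A, _ = P.shape
    values = []
    for policy in itertools.product(range(A), repeat=S):
        P_pi = np.array([P[s, policy[s]] for s in range(S)])   # (S, S)
        r_pi = np.array([R[s, policy[s]] for s in range(S)])   # (S,)
        v = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)    # exact evaluation
        values.append(v[start])
    values = np.array(values)
    return float(np.mean(values < values.max() - epsilon))
```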
no code implementations • 10 Mar 2021 • Dilip Arumugam, Peter Henderson, Pierre-Luc Bacon
How do we formalize the challenge of credit assignment in reinforcement learning?
no code implementations • 15 Jan 2021 • Dilip Arumugam, Benjamin Van Roy
Agents that learn to select optimal actions represent a prominent focus of the sequential decision-making literature.
no code implementations • 5 Oct 2020 • Dilip Arumugam, Benjamin Van Roy
State abstraction has been an essential tool for dramatically improving the sample efficiency of reinforcement-learning algorithms.
no code implementations • 18 Jun 2020 • Dilip Arumugam, Debadeepta Dey, Alekh Agarwal, Asli Celikyilmaz, Elnaz Nouri, Bill Dolan
While recent state-of-the-art results for adversarial imitation-learning algorithms are encouraging, recent works exploring the imitation learning from observation (ILO) setting, where trajectories only contain expert observations, have not been met with the same success.
no code implementations • ICML 2020 • Aidan Curtis, Minjian Xin, Dilip Arumugam, Kevin Feigelis, Daniel Yamins
In contrast, deep reinforcement learning (DRL) methods use flexible neural-network-based function approximators to discover policies that generalize naturally to unseen circumstances.
no code implementations • 12 Feb 2019 • Dilip Arumugam, Jun Ki Lee, Sophie Saskin, Michael L. Littman
To widen their accessibility and increase their utility, intelligent agents must be able to learn complex behaviors as specified by (non-expert) human users.
no code implementations • 3 Dec 2018 • Dilip Arumugam, David Abel, Kavosh Asadi, Nakul Gopalan, Christopher Grimm, Jun Ki Lee, Lucas Lehnert, Michael L. Littman
An agent with an inaccurate model of its environment faces a difficult choice: it can ignore the errors in its model and act in the real world in whatever way it determines is optimal with respect to its model.
no code implementations • ICML 2018 • David Abel, Dilip Arumugam, Lucas Lehnert, Michael Littman
We introduce two new classes of abstractions: (1) transitive state abstractions, whose optimal form can be computed efficiently, and (2) PAC state abstractions, which are guaranteed to hold with respect to a distribution of tasks.
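As a hedged illustration of the flavor of a transitive abstraction (not the paper's exact construction), bucketing optimal Q-values into epsilon-wide bins makes the "similar enough" predicate transitive, so states can be aggregated in a single pass:

```python
import numpy as np

def transitive_q_abstraction(q_star: np.ndarray, epsilon: float) -> np.ndarray:
    """Aggregate states whose optimal Q-values fall in the same epsilon-wide bins.

    q_star: optimal action values, shape (S, A).
    Returns one integer cluster label per state; states sharing a label are
    collapsed into a single abstract state.
    """
    buckets = np.floor(q_star / epsilon).astype(int)   # (S, A) bin indices
    labels = {}
    return np.array([labels.setdefault(tuple(row), len(labels)) for row in buckets])
```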
1 code implementation • WS 2017 • Siddharth Karamcheti, Edward C. Williams, Dilip Arumugam, Mina Rhee, Nakul Gopalan, Lawson L. S. Wong, Stefanie Tellex
Robots operating alongside humans in diverse, stochastic environments must be able to accurately interpret natural language commands.
no code implementations • ICLR 2018 • Christopher Grimm, Dilip Arumugam, Siddharth Karamcheti, David Abel, Lawson L. S. Wong, Michael L. Littman
Deep neural networks are able to solve tasks across a variety of domains and modalities of data.
1 code implementation • 21 Apr 2017 • Dilip Arumugam, Siddharth Karamcheti, Nakul Gopalan, Lawson L. S. Wong, Stefanie Tellex
In this work, by grounding commands to all the tasks or subtasks available in a hierarchical planning framework, we arrive at a model capable of interpreting language at multiple levels of specificity, ranging from coarse to fine-grained.