Search Results for author: Dilip Arumugam

Found 22 papers, 4 papers with code

Social Contract AI: Aligning AI Assistants with Implicit Group Norms

1 code implementation • 26 Oct 2023 • Jan-Philipp Fränken, Sam Kwok, Peixuan Ye, Kanishk Gandhi, Dilip Arumugam, Jared Moore, Alex Tamkin, Tobias Gerstenberg, Noah D. Goodman

We explore the idea of aligning an AI assistant by inverting a model of users' (unknown) preferences from observed interactions.
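
As a minimal sketch of what such preference inversion can look like (illustrative only: the candidate norms, the utility function, and the Boltzmann-rational choice likelihood below are assumptions for exposition, not the paper's method), one can maintain a posterior over candidate group norms and update it from each observed user choice:

    import math

    def posterior_over_norms(interactions, norms, utility, beta=1.0):
        # interactions: list of (context, chosen_option, available_options)
        # norms: candidate group norms (hypotheses about what users prefer)
        # utility(norm, context, option) -> float, a hypothetical scoring function
        log_post = {n: 0.0 for n in norms}  # uniform prior, kept in log space
        for context, chosen, options in interactions:
            for n in norms:
                scores = [beta * utility(n, context, o) for o in options]
                log_z = math.log(sum(math.exp(s) for s in scores))
                # Boltzmann-rational likelihood of the observed choice under norm n
                log_post[n] += beta * utility(n, context, chosen) - log_z
        z = sum(math.exp(v) for v in log_post.values())
        return {n: math.exp(v) / z for n, v in log_post.items()}

The norm with the highest posterior mass can then stand in for the group's implicit preferences when shaping the assistant's behavior.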

Hindsight-DICE: Stable Credit Assignment for Deep Reinforcement Learning

1 code implementation • 21 Jul 2023 • Akash Velu, Skanda Vaidyanath, Dilip Arumugam

Environments for sequential decision-making problems often provide only sparse evaluative feedback to guide reinforcement-learning agents.

Decision Making • Off-policy evaluation • +2

Shattering the Agent-Environment Interface for Fine-Tuning Inclusive Language Models

no code implementations • 19 May 2023 • Wanqiao Xu, Shi Dong, Dilip Arumugam, Benjamin Van Roy

In this work, we adopt a novel perspective wherein a pre-trained language model is itself simultaneously a policy, reward function, and transition function.

Efficient Exploration • Language Modelling • +2
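
A rough sketch of this single-model framing, under stated assumptions (the `lm.logits(tokens)` interface and the log-probability reward below are hypothetical placeholders, not the paper's construction), might look like:

    import numpy as np

    def softmax(x):
        z = np.exp(x - np.max(x))
        return z / z.sum()

    class LMAsMDP:
        """One pre-trained language model playing three MDP roles (illustrative only)."""
        def __init__(self, lm):
            self.lm = lm  # hypothetical object with lm.logits(tokens) -> np.ndarray

        def policy(self, tokens):
            # Policy: the LM's next-token distribution given the current context.
            return softmax(self.lm.logits(tokens))

        def transition(self, tokens, token):
            # Transition: deterministically append the chosen token to the context.
            return tokens + [token]

        def reward(self, tokens, token):
            # Reward: here, the LM's own log-probability of the chosen token;
            # a real system would substitute a task-specific scoring rule.
            return float(np.log(self.policy(tokens)[token]))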

Bayesian Reinforcement Learning with Limited Cognitive Load

no code implementations • 5 May 2023 • Dilip Arumugam, Mark K. Ho, Noah D. Goodman, Benjamin Van Roy

All biological and artificial agents must learn and make decisions given limits on their ability to process information.

Decision Making • reinforcement-learning

Inclusive Artificial Intelligence

no code implementations • 24 Dec 2022 • Dilip Arumugam, Shi Dong, Benjamin Van Roy

Prevailing methods for assessing and comparing generative AIs incentivize responses that serve a hypothetical representative individual.

On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning

no code implementations • 30 Oct 2022 • Dilip Arumugam, Mark K. Ho, Noah D. Goodman, Benjamin Van Roy

Throughout the cognitive-science literature, there is widespread agreement that decision-making agents operating in the real world do so under limited information-processing capabilities and without access to unbounded cognitive or computational resources.

Decision Making • reinforcement-learning • +1

Planning to the Information Horizon of BAMDPs via Epistemic State Abstraction

no code implementations • 30 Oct 2022 • Dilip Arumugam, Satinder Singh

The Bayes-Adaptive Markov Decision Process (BAMDP) formalism pursues the Bayes-optimal solution to the exploration-exploitation trade-off in reinforcement learning.

Efficient Exploration • reinforcement-learning • +1
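
For reference, the standard BAMDP construction the abstract refers to augments the environment state with the agent's posterior over candidate MDPs; in the usual (illustrative) notation, the Bayes-optimal value function over hyperstates (s, b) satisfies

    V^*(s, b) = \max_{a \in \mathcal{A}} \; \mathbb{E}_{M \sim b}\!\left[ r_M(s, a)
        + \gamma \, \mathbb{E}_{s' \sim P_M(\cdot \mid s, a)}\!\left[ V^*(s', b') \right] \right],
    \qquad b'(M) \propto P_M(s' \mid s, a) \, b(M).

The hyperstate space grows rapidly with the belief component b, which is what makes abstraction over that epistemic component attractive for planning.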

Between Rate-Distortion Theory & Value Equivalence in Model-Based Reinforcement Learning

no code implementations • 4 Jun 2022 • Dilip Arumugam, Benjamin Van Roy

The quintessential model-based reinforcement-learning agent iteratively refines its estimates or prior beliefs about the true underlying model of the environment.

Decision Making • Model-based Reinforcement Learning • +2

Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning

no code implementations • 4 Jun 2022 • Dilip Arumugam, Benjamin Van Roy

To address this problem, we introduce an algorithm that, using rate-distortion theory, iteratively computes an approximately-value-equivalent, lossy compression of the environment which an agent may feasibly target in lieu of the true model.

Decision Making • Model-based Reinforcement Learning • +2
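
The rate-distortion computation described above can be illustrated with the textbook Blahut-Arimoto iteration; the prior over candidate models, the distortion matrix (e.g., some value-equivalence gap), and the trade-off parameter below are placeholders rather than the paper's exact formulation:

    import numpy as np

    def blahut_arimoto(p_m, distortion, beta, iters=200):
        # p_m: prior over candidate environment models, shape (n_models,)
        # distortion[i, j]: cost of representing model i by compressed model j
        # beta: Lagrange multiplier trading rate (bits) against expected distortion
        n_models, n_compressed = distortion.shape
        r = np.full(n_compressed, 1.0 / n_compressed)  # marginal over compressed models
        for _ in range(iters):
            q = r * np.exp(-beta * distortion)          # unnormalized channel q(m_hat | m)
            q /= q.sum(axis=1, keepdims=True)
            r = p_m @ q                                 # updated marginal over m_hat
        return q, r

Sampling a compressed model from the resulting channel q then plays the role of the lossy, approximately-value-equivalent target that the agent pursues in lieu of the true model.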

The Value of Information When Deciding What to Learn

no code implementations • NeurIPS 2021 • Dilip Arumugam, Benjamin Van Roy

All sequential decision-making agents explore so as to acquire knowledge about a particular target.

Decision Making

Deciding What to Learn: A Rate-Distortion Approach

no code implementations • 15 Jan 2021 • Dilip Arumugam, Benjamin Van Roy

Agents that learn to select optimal actions represent a prominent focus of the sequential decision-making literature.

Decision Making • Thompson Sampling
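
In the generic rate-distortion template that this line of work instantiates (notation here is illustrative, not the paper's), an agent facing environment \mathcal{E} chooses a learning target \chi by solving

    \mathcal{R}(D) \;=\; \min_{p(\chi \mid \mathcal{E}) \,:\; \mathbb{E}[d(\mathcal{E}, \chi)] \le D} I(\mathcal{E}; \chi),

i.e., it seeks the lowest-information target whose expected distortion from the true environment stays below D; loosely, Thompson-sampling-style exploration can then be directed at \chi rather than at the full environment.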

Randomized Value Functions via Posterior State-Abstraction Sampling

no code implementations • 5 Oct 2020 • Dilip Arumugam, Benjamin Van Roy

State abstraction has been an essential tool for dramatically improving the sample efficiency of reinforcement-learning algorithms.

Reparameterized Variational Divergence Minimization for Stable Imitation

no code implementations • 18 Jun 2020 • Dilip Arumugam, Debadeepta Dey, Alekh Agarwal, Asli Celikyilmaz, Elnaz Nouri, Bill Dolan

While recent state-of-the-art results for adversarial imitation-learning algorithms are encouraging, recent works exploring the imitation learning from observation (ILO) setting, where trajectories contain only expert observations, have not been met with the same success.

Continuous Control • Imitation Learning
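
For context, adversarial imitation methods of this kind are typically derived from the variational (lower-bound) form of an f-divergence between the expert's and the agent's state-action occupancy measures; in standard (illustrative) notation,

    D_f\!\left(\rho_E \,\|\, \rho_\pi\right) \;\ge\; \sup_{T}\;
        \mathbb{E}_{(s,a) \sim \rho_E}\!\left[ T(s, a) \right]
        - \mathbb{E}_{(s,a) \sim \rho_\pi}\!\left[ f^{*}\!\left( T(s, a) \right) \right],

where f^{*} is the convex conjugate of f. A discriminator maximizes this bound while the imitation policy minimizes it, and different choices of f (for example, the Jensen-Shannon case underlying GAIL) recover different adversarial imitation objectives.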

Flexible and Efficient Long-Range Planning Through Curious Exploration

no code implementations • ICML 2020 • Aidan Curtis, Minjian Xin, Dilip Arumugam, Kevin Feigelis, Daniel Yamins

In contrast, deep reinforcement learning (DRL) methods use flexible neural-network-based function approximators to discover policies that generalize naturally to unseen circumstances.

Imitation Learning • Model-based Reinforcement Learning • +4

Deep Reinforcement Learning from Policy-Dependent Human Feedback

no code implementations • 12 Feb 2019 • Dilip Arumugam, Jun Ki Lee, Sophie Saskin, Michael L. Littman

To widen their accessibility and increase their utility, intelligent agents must be able to learn complex behaviors as specified by (non-expert) human users.

reinforcement-learning • Reinforcement Learning (RL)

Mitigating Planner Overfitting in Model-Based Reinforcement Learning

no code implementations • 3 Dec 2018 • Dilip Arumugam, David Abel, Kavosh Asadi, Nakul Gopalan, Christopher Grimm, Jun Ki Lee, Lucas Lehnert, Michael L. Littman

An agent with an inaccurate model of its environment faces a difficult choice: it can ignore the errors in its model and act in the real world in whatever way it determines is optimal with respect to its model.

Model-based Reinforcement Learning • Position • +2

State Abstractions for Lifelong Reinforcement Learning

no code implementations • ICML 2018 • David Abel, Dilip Arumugam, Lucas Lehnert, Michael Littman

We introduce two new classes of abstractions: (1) transitive state abstractions, whose optimal form can be computed efficiently, and (2) PAC state abstractions, which are guaranteed to hold with respect to a distribution of tasks.

reinforcement-learning • Reinforcement Learning (RL)
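
As a concrete example of the kind of object studied in this literature (illustrative; not necessarily one of the two new classes introduced here), an approximate Q^*-irrelevance abstraction is a map \phi from ground states to abstract states such that

    \phi(s_1) = \phi(s_2) \;\implies\; \left| Q^{*}(s_1, a) - Q^{*}(s_2, a) \right| \le \varepsilon
        \quad \text{for all } a \in \mathcal{A}.

Roughly, the PAC variant named in the abstract relaxes the requirement that such a condition hold exactly for every task, asking instead that it hold with high probability over a distribution of tasks.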

Accurately and Efficiently Interpreting Human-Robot Instructions of Varying Granularities

1 code implementation • 21 Apr 2017 • Dilip Arumugam, Siddharth Karamcheti, Nakul Gopalan, Lawson L. S. Wong, Stefanie Tellex

In this work, by grounding commands to all the tasks or subtasks available in a hierarchical planning framework, we arrive at a model capable of interpreting language at multiple levels of specificity, ranging from coarse to fine-grained.

Specificity
