
no code implementations • 11 May 2022 • Zhiyuan Zhou, Cameron Allen, Kavosh Asadi, George Konidaris

We study the action generalization ability of deep Q-learning in discrete action spaces.

no code implementations • 10 Dec 2021 • Kavosh Asadi, Rasool Fakoor, Omer Gottesman, Taesup Kim, Michael L. Littman, Alexander J. Smola

We employ Proximal Iteration for value-function optimization in deep reinforcement learning.
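The core idea of a proximal update for value functions can be sketched as a regularized gradient step: minimize the usual TD loss plus a penalty that pulls the new parameters toward the previous iterate. A minimal NumPy sketch under illustrative assumptions (the proximal coefficient `c`, the learning rate, and the toy linear value function are stand-ins, not the paper's exact formulation):

```python
import numpy as np

def proximal_value_update(w, w_prev, grad_td, c=1.0, lr=0.1):
    """One gradient step on the proximal objective:
    TD loss + (c/2) * ||w - w_prev||^2.
    The second term anchors the update to the previous iterate."""
    grad = grad_td + c * (w - w_prev)
    return w - lr * grad

# Toy linear value function V(s) = w @ s on a single transition.
w_prev = np.zeros(2)
w = np.array([0.5, -0.2])
s, r, gamma, s_next = np.array([1.0, 0.0]), 1.0, 0.9, np.array([0.0, 1.0])
td_error = w @ s - (r + gamma * w_prev @ s_next)  # target uses previous params
grad_td = td_error * s
w_new = proximal_value_update(w, w_prev, grad_td)
```

The proximal term damps each step, which is one way such methods trade per-step progress for stability of the optimization.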

no code implementations • 23 Oct 2021 • Omer Gottesman, Kavosh Asadi, Cameron Allen, Sam Lobel, George Konidaris, Michael Littman

We propose a new coarse-grained smoothness definition that generalizes the notion of Lipschitz continuity, is more widely applicable, and allows us to compute significantly tighter bounds on Q-functions, leading to improved learning.
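Ordinary Lipschitz continuity is the baseline such bounds generalize: if Q is L-Lipschitz, any known value bounds the value at nearby points. A minimal sketch of that baseline bound (the paper's coarse-grained definition refines this; the constants and distances below are illustrative):

```python
import numpy as np

def lipschitz_upper_bound(q_known, dists, L):
    """Upper-bound Q at a query point from known values q_known at points
    whose distances to the query are dists, assuming Q is L-Lipschitz:
    Q(x) <= min_i ( Q(x_i) + L * d(x, x_i) )."""
    return np.min(np.asarray(q_known) + L * np.asarray(dists))

# A smaller effective smoothness constant yields a tighter bound from the
# same data, which is the payoff of a less conservative smoothness notion.
loose = lipschitz_upper_bound([1.0, 2.0], [0.5, 0.1], L=10.0)  # 3.0
tight = lipschitz_upper_bound([1.0, 2.0], [0.5, 0.1], L=1.0)   # 1.5
```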

no code implementations • 15 Sep 2021 • Ishaan Shah, David Halpern, Kavosh Asadi, Michael L. Littman

We propose a variant of COACH, episodic COACH (E-COACH), which we prove converges for all three types.

1 code implementation • NeurIPS 2021 • Rasool Fakoor, Jonas Mueller, Kavosh Asadi, Pratik Chaudhari, Alexander J. Smola

Reliant on too many experiments to learn good actions, current Reinforcement Learning (RL) algorithms have limited applicability in real-world settings, which can be too expensive to allow exploration.

2 code implementations • 8 Feb 2020 • Kavosh Asadi, David Abel, Michael L. Littman

In this work, we answer this question in the affirmative, where we take "simple learning algorithm" to be tabular Q-Learning, the "good representations" to be a learned state abstraction, and "challenging problems" to be continuous control tasks.
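The pipeline described composes two simple pieces: a mapping phi from raw (possibly continuous) states to discrete abstract states, and ordinary tabular Q-learning on top of it. A minimal sketch, where the binning abstraction and hyperparameters are illustrative stand-ins for a learned abstraction:

```python
import numpy as np

def q_learning_step(Q, phi, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Tabular Q-learning applied to abstract states phi(s).
    phi maps a continuous state to a discrete table index."""
    z, z_next = phi(s), phi(s_next)
    td_target = r + gamma * Q[z_next].max()
    Q[z, a] += alpha * (td_target - Q[z, a])
    return Q

# Toy abstraction: bucket a scalar state in [0, 1) into 10 bins.
phi = lambda s: min(int(s * 10), 9)
Q = np.zeros((10, 2))
Q = q_learning_step(Q, phi, s=0.42, a=1, r=1.0, s_next=0.55)
```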

no code implementations • 5 Feb 2020 • Kavosh Asadi, Neev Parikh, Ronald E. Parr, George D. Konidaris, Michael L. Littman

We show that the maximum action-value with respect to a deep RBVF can be approximated easily and accurately.
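One common form of a radial-basis value function, and a plausible reading of why its maximum is easy to approximate, is a normalized RBF interpolation over a set of action centers: with a sharp kernel, the maximum over actions is roughly the largest center value. A sketch under that assumption (the architecture and constants here are illustrative, not necessarily the paper's exact construction):

```python
import numpy as np

def rbf_q(a, centers, values, beta=10.0):
    """Q(s, a) as a normalized RBF interpolation over action centers,
    with per-center values (precomputed here for a single state)."""
    w = np.exp(-beta * np.linalg.norm(centers - a, axis=1))
    return w @ values / w.sum()

# For a sharp kernel (large beta), max_a Q(s, a) ~= max_i v_i(s),
# so the maximum can be read off the center values directly.
centers = np.array([[0.0], [0.5], [1.0]])
values = np.array([0.2, 1.0, 0.4])
q_at_best_center = rbf_q(np.array([0.5]), centers, values)
approx_max = values.max()
```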

1 code implementation • 15 Jan 2020 • Erwan Lecarpentier, David Abel, Kavosh Asadi, Yuu Jinnai, Emmanuel Rachelson, Michael L. Littman

We consider the problem of knowledge transfer when an agent is facing a series of Reinforcement Learning (RL) tasks.

no code implementations • 30 May 2019 • Kavosh Asadi, Dipendra Misra, Seungchan Kim, Michael L. Littman

In this paper, we address the compounding-error problem by introducing a multi-step model that directly outputs the outcome of executing a sequence of actions.
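The compounding-error problem is easy to see numerically: composing a one-step model n times feeds each prediction error back into the next input, so a small per-step bias grows with horizon, while a model that predicts the outcome of the whole action sequence in one call pays that error only once. A toy sketch (the linear dynamics and the fixed per-call error are illustrative assumptions):

```python
# True dynamics: s' = s + a. Each model call adds a small systematic error.
def one_step_model(s, a, err=0.01):
    return s + a + err

def compose(model, s, actions):
    """Roll a one-step model forward by repeated composition."""
    for a in actions:
        s = model(s, a)
    return s

def multi_step_model(s, actions, err=0.01):
    """Predict the outcome of the whole sequence in one call."""
    return s + sum(actions) + err

actions = [1.0] * 10
true_outcome = sum(actions)                        # 10.0
composed = compose(one_step_model, 0.0, actions)   # error compounds to 0.1
direct = multi_step_model(0.0, actions)            # error stays at 0.01
```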

no code implementations • 3 Dec 2018 • Dilip Arumugam, David Abel, Kavosh Asadi, Nakul Gopalan, Christopher Grimm, Jun Ki Lee, Lucas Lehnert, Michael L. Littman

An agent with an inaccurate model of its environment faces a difficult choice: it can ignore the errors in its model and act in the real world in whatever way it determines is optimal with respect to its model.

no code implementations • 31 Oct 2018 • Kavosh Asadi, Evan Cater, Dipendra Misra, Michael L. Littman

When environmental interaction is expensive, model-based reinforcement learning offers a solution by planning ahead and avoiding costly mistakes.

no code implementations • 1 Jun 2018 • Kavosh Asadi, Evan Cater, Dipendra Misra, Michael L. Littman

Learning a generative model is a key component of model-based reinforcement learning.

1 code implementation • ICML 2018 • Kavosh Asadi, Dipendra Misra, Michael L. Littman

We go on to prove an error bound for the value-function estimate arising from Lipschitz models and show that the estimated value function is itself Lipschitz.

2 code implementations • 1 Sep 2017 • Cameron Allen, Kavosh Asadi, Melrose Roderick, Abdel-rahman Mohamed, George Konidaris, Michael Littman

We propose a new algorithm, Mean Actor-Critic (MAC), for discrete-action continuous-state reinforcement learning.
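MAC's departure from standard actor-critic is that the policy-gradient term at each state is averaged over all discrete actions, weighted by the policy, instead of estimated from a single sampled action. A sketch of that gradient at one state (array shapes and values are illustrative):

```python
import numpy as np

def mac_gradient(pi, q, d_log_pi):
    """Mean Actor-Critic gradient at one state:
    sum over ALL actions of pi(a|s) * grad log pi(a|s) * Q(s, a),
    rather than a single sampled action's score-function term.
    pi: (A,) action probabilities; q: (A,) critic values;
    d_log_pi: (A, P) score vectors, one row per action."""
    return (pi[:, None] * d_log_pi * q[:, None]).sum(axis=0)

pi = np.array([0.5, 0.3, 0.2])
q = np.array([1.0, 0.0, -1.0])
grad = mac_gradient(pi, q, np.eye(3))
```

Averaging over actions removes the sampling variance of the action choice from the gradient estimate, which is the usual motivation for this style of estimator.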

Ranked #1 on Continuous Control on Cart Pole (OpenAI Gym)

2 code implementations • ACL 2017 • Jason D. Williams, Kavosh Asadi, Geoffrey Zweig

End-to-end learning of recurrent neural networks (RNNs) is an attractive solution for dialog systems; however, current techniques are data-intensive and require thousands of dialogs to learn simple behaviors.

no code implementations • 18 Dec 2016 • Kavosh Asadi, Jason D. Williams

Representing a dialog policy as a recurrent neural network (RNN) is attractive because it handles partial observability, infers a latent representation of state, and can be optimized with supervised learning (SL) or reinforcement learning (RL).

1 code implementation • ICML 2017 • Kavosh Asadi, Michael L. Littman

A softmax operator applied to a set of values acts somewhat like the maximization function and somewhat like an average.
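One operator with exactly this behavior is log-mean-exp with an inverse temperature: it approaches the maximum as the temperature parameter grows and the mean as it shrinks. A sketch of that interpolation (the parameter values below are illustrative):

```python
import numpy as np

def mellowmax(x, omega=5.0):
    """Log-mean-exp with inverse temperature omega.
    As omega -> inf this approaches max(x); as omega -> 0+, mean(x)."""
    x = np.asarray(x, dtype=float)
    m = x.max()  # shift by the max for numerical stability
    return m + np.log(np.mean(np.exp(omega * (x - m)))) / omega

vals = [1.0, 2.0, 3.0]
near_max = mellowmax(vals, omega=100.0)   # close to max = 3.0
near_mean = mellowmax(vals, omega=0.001)  # close to mean = 2.0
```

Unlike the Boltzmann-weighted average, log-mean-exp is a non-expansion, which is the property that matters for convergent value iteration.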


Papers With Code is a free resource with all data licensed under CC-BY-SA.