no code implementations • 14 Dec 2023 • Kate Baumli, Satinder Baveja, Feryal Behbahani, Harris Chan, Gheorghe Comanici, Sebastian Flennerhag, Maxime Gazeau, Kristian Holsheimer, Dan Horgan, Michael Laskin, Clare Lyle, Hussain Masoom, Kay McKinney, Volodymyr Mnih, Alexander Neitz, Dmitry Nikulin, Fabio Pardo, Jack Parker-Holder, John Quan, Tim Rocktäschel, Himanshu Sahni, Tom Schaul, Yannick Schroecker, Stephen Spencer, Richie Steigerwald, Luyu Wang, Lei Zhang
Building generalist agents that can accomplish many goals in rich open-ended environments is one of the research frontiers for reinforcement learning.
1 code implementation • 25 Oct 2022 • Michael Laskin, Luyu Wang, Junhyuk Oh, Emilio Parisotto, Stephen Spencer, Richie Steigerwald, DJ Strouse, Steven Hansen, Angelos Filos, Ethan Brooks, Maxime Gazeau, Himanshu Sahni, Satinder Singh, Volodymyr Mnih
We propose Algorithm Distillation (AD), a method for distilling reinforcement learning (RL) algorithms into neural networks by modeling their training histories with a causal sequence model.
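A minimal sketch of the idea (illustrative, not the authors' code): a causal transformer is trained to predict the next action given a long cross-episode history of (observation, action, reward) tokens produced by an RL algorithm's training run, so that policy improvement is captured in-context. All sizes, names, and the tokenization below are assumptions.

```python
import torch
import torch.nn as nn

class ADModel(nn.Module):
    """Causal sequence model over an RL training history (hypothetical sketch)."""
    def __init__(self, obs_dim=8, n_actions=4, d_model=64, n_layers=2, max_len=256):
        super().__init__()
        # each timestep token packs (observation, one-hot previous action, previous reward)
        self.embed = nn.Linear(obs_dim + n_actions + 1, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_actions)

    def forward(self, tokens):                       # tokens: (B, T, obs+act+1)
        T = tokens.shape[1]
        x = self.embed(tokens) + self.pos(torch.arange(T, device=tokens.device))
        # causal mask: each position may only attend to earlier positions
        causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1).to(tokens.device)
        h = self.encoder(x, mask=causal)
        return self.head(h)                          # next-action logits at every step

# Usage with placeholder data: a batch of 2 histories, 16 steps each.
model = ADModel()
logits = model(torch.randn(2, 16, 8 + 4 + 1))
```

Training would minimize cross-entropy between these logits and the actions the source RL algorithm actually took, over histories long enough to span many episodes.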
no code implementations • 10 Mar 2021 • Himanshu Sahni, Charles Isbell
We also show that the agent's internal representation of the surroundings, a live mental map, can be used for control in two partially observable reinforcement learning tasks.
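A rough sketch of what a "live mental map" could look like (an assumption for illustration, not the paper's architecture): glimpses of a partially observable environment are written into a persistent map at their locations, and a downstream policy can condition on that map.

```python
import numpy as np

env = np.random.randint(0, 2, size=(16, 16))       # hidden full environment (toy)
mental_map = np.full((16, 16), -1.0)               # -1 marks unobserved cells

def glimpse_and_update(y, x, size=4):
    """Observe a small window and integrate it into the persistent mental map."""
    patch = env[y:y + size, x:x + size]             # partial observation
    mental_map[y:y + size, x:x + size] = patch      # update the live map in place
    return patch

# a short sequence of glimpses gradually fills in the agent's map
for y, x in [(0, 0), (4, 4), (8, 8)]:
    glimpse_and_update(y, x)
```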
1 code implementation • ICML 2020 • Ashley D. Edwards, Himanshu Sahni, Rosanne Liu, Jane Hung, Ankit Jain, Rui Wang, Adrien Ecoffet, Thomas Miconi, Charles Isbell, Jason Yosinski
In this paper, we introduce a novel form of value function, $Q(s, s')$, that expresses the utility of transitioning from a state $s$ to a neighboring state $s'$ and then acting optimally thereafter.
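This definition implies a Bellman-style recursion, $Q(s, s') = r(s, s') + \gamma \max_{s''} Q(s', s'')$, where $s''$ ranges over the neighbors of $s'$. A minimal tabular sketch of that recursion on a toy deterministic chain (illustrative only, not the authors' implementation):

```python
import numpy as np

n_states = 5
gamma = 0.9

# neighbors[s] lists the states reachable from s in one step (a simple chain).
neighbors = {s: [max(s - 1, 0), min(s + 1, n_states - 1)] for s in range(n_states)}

def reward(s, s_next):
    return 1.0 if s_next == n_states - 1 else 0.0   # reward for reaching the last state

# Value iteration over state-to-neighbor pairs Q(s, s').
Q = np.zeros((n_states, n_states))
for _ in range(100):
    for s in range(n_states):
        for s_next in neighbors[s]:
            Q[s, s_next] = reward(s, s_next) + gamma * max(Q[s_next, s2] for s2 in neighbors[s_next])

# The greedy *next state* from state 0 is the argmax over its neighbors; in an
# action-free setting the action itself would be recovered separately, e.g. by
# an inverse dynamics model.
print(np.argmax(Q[0, :]))   # -> 1, i.e. step toward the goal
```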
2 code implementations • NeurIPS 2019 • Himanshu Sahni, Toby Buckley, Pieter Abbeel, Ilya Kuzovkin
In this work, we show how visual trajectories can be hallucinated to appear successful by altering agent observations using a generative model trained on relatively few snapshots of the goal.
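A rough sketch of hindsight relabeling with a generative model (not the authors' implementation): frames of a failed trajectory are rewritten so the trajectory appears to end at the goal, and the altered frames are stored with a success reward. The `hallucinate` function below is a trivial stand-in (an alpha-blend with a goal snapshot); in the paper this role is played by a learned generative model.

```python
import numpy as np

def hallucinate(frame, goal_snapshot, alpha=0.5):
    # stand-in for a learned generator that paints the goal into the observation
    return (1 - alpha) * frame + alpha * goal_snapshot

def relabel_trajectory(frames, goal_snapshot):
    """Return a hallucinated copy of a failed trajectory plus hindsight rewards."""
    fake_frames = [hallucinate(f, goal_snapshot) for f in frames]
    rewards = [0.0] * (len(fake_frames) - 1) + [1.0]   # success at the final frame
    return fake_frames, rewards

# usage with random placeholder frames
frames = [np.random.rand(64, 64, 3) for _ in range(10)]
goal = np.random.rand(64, 64, 3)
fake_frames, rewards = relabel_trajectory(frames, goal)
```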
2 code implementations • 21 May 2018 • Ashley D. Edwards, Himanshu Sahni, Yannick Schroecker, Charles L. Isbell
In this paper, we describe a novel approach to imitation learning that infers latent policies directly from state observations.
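A simplified sketch in the spirit of inferring a latent policy from state-only demonstrations (not the authors' code): a latent dynamics model predicts the next state from the current state and a discrete *latent* action, and the latent policy is trained so its policy-weighted prediction matches the observed transitions. Dimensions and names here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LatentPolicyModel(nn.Module):
    def __init__(self, state_dim=4, n_latent_actions=3, hidden=64):
        super().__init__()
        self.n = n_latent_actions
        self.policy = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                                    nn.Linear(hidden, n_latent_actions))
        # one next-state predictor per latent action, produced jointly
        self.dynamics = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                                      nn.Linear(hidden, state_dim * n_latent_actions))

    def forward(self, s):
        probs = self.policy(s).softmax(-1)                        # (B, n) latent action probs
        preds = self.dynamics(s).view(-1, self.n, s.shape[-1])    # (B, n, state_dim)
        expected_next = (probs.unsqueeze(-1) * preds).sum(1)      # policy-weighted prediction
        return expected_next, preds, probs

model = LatentPolicyModel()
expected_next, preds, probs = model(torch.randn(5, 4))
```

Training (sketch) would minimize the prediction error against demonstration pairs $(s, s')$; a small amount of real environment interaction can then map latent actions onto the agent's true action space.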
no code implementations • 30 Nov 2017 • Himanshu Sahni, Saurabh Kumar, Farhan Tejani, Charles Isbell
We present a differentiable framework capable of learning a wide variety of compositions of simple policies that we call skills.
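A hedged sketch of one way such differentiable composition can be set up (assumed for illustration, not the authors' framework): a composition network produces state-dependent soft weights over pretrained skill policies, and the composed action distribution is their weighted mixture, so gradients flow through the composition weights.

```python
import torch
import torch.nn as nn

class ComposedPolicy(nn.Module):
    def __init__(self, skills, state_dim=8):
        super().__init__()
        self.skills = nn.ModuleList(skills)                # pretrained (possibly frozen) skills
        self.compose = nn.Linear(state_dim, len(skills))   # state-dependent mixture weights

    def forward(self, s):
        weights = self.compose(s).softmax(-1)              # (B, K) soft weights over skills
        skill_logits = torch.stack([skill(s) for skill in self.skills], dim=1)  # (B, K, A)
        return (weights.unsqueeze(-1) * skill_logits).sum(1)  # composed action logits

# example: two toy skills over 4 discrete actions
skills = [nn.Linear(8, 4), nn.Linear(8, 4)]
policy = ComposedPolicy(skills)
logits = policy(torch.randn(2, 8))
```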
no code implementations • 24 May 2017 • Himanshu Sahni, Saurabh Kumar, Farhan Tejani, Yannick Schroecker, Charles Isbell
To address this issue, we develop a framework through which a deep RL agent learns to generalize policies from smaller, simpler domains to more complex ones using a recurrent attention mechanism.
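A rough sketch of a recurrent attention loop (assumptions only, not the paper's architecture): a recurrent state selects a small crop of a larger observation at each step, processes it, and updates itself, so a policy trained on small domains can be run on larger ones by attending to one region at a time.

```python
import torch
import torch.nn as nn

class RecurrentAttention(nn.Module):
    def __init__(self, crop=8, hidden=32):
        super().__init__()
        self.crop = crop
        self.rnn = nn.GRUCell(crop * crop, hidden)
        self.where = nn.Linear(hidden, 2)   # proposes the next crop location from the state

    def step(self, image, h):
        # pick a crop location from the recurrent state (hard crop, so this toy
        # version is not end-to-end differentiable through the location choice)
        loc = torch.sigmoid(self.where(h)) * (image.shape[-1] - self.crop)
        y, x = int(loc[0, 0].item()), int(loc[0, 1].item())
        patch = image[:, y:y + self.crop, x:x + self.crop].reshape(1, -1)
        return self.rnn(patch, h)

image = torch.rand(1, 32, 32)               # a domain larger than the crop size
h = torch.zeros(1, 32)
model = RecurrentAttention()
for _ in range(4):                          # a short sequence of glimpses
    h = model.step(image, h)
```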