no code implementations • 5 Dec 2023 • Zhengyao Jiang, Yingchen Xu, Nolan Wagener, Yicheng Luo, Michael Janner, Edward Grefenstette, Tim Rocktäschel, Yuandong Tian
However, the extensive collection of human motion-capture data and the derived datasets of humanoid trajectories, such as MoCapAct, pave the way to tackling these challenges.
no code implementations • 15 Jun 2023 • Michael Janner
Deep model-based reinforcement learning methods offer a conceptually simple approach to the decision-making and control problem: use learning for the purpose of estimating an approximate dynamics model, and offload the rest of the work to classical trajectory optimization.
2 code implementations • 22 May 2023 • Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, Sergey Levine
However, most use cases of diffusion models are not concerned with likelihoods, but instead with downstream objectives such as human-perceived image quality or drug effectiveness.
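One simple way to illustrate steering a generative model toward a downstream objective rather than likelihood is reward-weighted training. The sketch below is a deliberately simplified stand-in, not the paper's algorithm (which operates on the denoising process itself); `reward_weighted_loss` and its inputs are hypothetical names:

```python
def reward_weighted_loss(nlls, rewards):
    """Reweight per-sample losses by a downstream reward.

    nlls: per-sample negative log-likelihoods under the model;
    rewards: downstream scores (e.g. human-perceived quality).
    Subtracting the mean reward as a baseline means above-average
    samples are reinforced and below-average samples are suppressed.
    """
    baseline = sum(rewards) / len(rewards)
    return sum(nll * (r - baseline) for nll, r in zip(nlls, rewards)) / len(nlls)
```

With equal likelihoods and symmetric rewards the weighted terms cancel, e.g. `reward_weighted_loss([1.0, 1.0], [2.0, 0.0])` returns `0.0`.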
1 code implementation • 20 Apr 2023 • Philippe Hansen-Estruch, Ilya Kostrikov, Michael Janner, Jakub Grudzien Kuba, Sergey Levine
In this paper, we reinterpret IQL as an actor-critic method by generalizing the critic objective and connecting it to a behavior-regularized implicit actor.
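The generalized critic objective here builds on expectile regression, which can be sketched in a few lines; the function name and the `tau` default below are illustrative:

```python
import numpy as np

def expectile_loss(residual, tau=0.7):
    """Asymmetric squared loss from expectile regression.

    With residual = Q(s, a) - V(s), a tau > 0.5 penalizes positive
    residuals more than negative ones, pushing V(s) toward an upper
    expectile of the Q-value distribution without ever querying
    out-of-distribution actions.
    """
    residual = np.asarray(residual, dtype=float)
    weight = np.where(residual > 0, tau, 1 - tau)
    return weight * residual ** 2
```

For example, `expectile_loss([1.0, -1.0], tau=0.7)` yields `[0.7, 0.3]`: the positive residual is weighted more heavily.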
1 code implementation • 22 Aug 2022 • Zhengyao Jiang, Tianjun Zhang, Michael Janner, Yueying Li, Tim Rocktäschel, Edward Grefenstette, Yuandong Tian
Planning-based reinforcement learning has shown strong performance in tasks in discrete and low-dimensional continuous action spaces.
no code implementations • 21 Jun 2022 • Katie Kang, Paula Gradu, Jason Choi, Michael Janner, Claire Tomlin, Sergey Levine
Learned models and policies can generalize effectively when evaluated within the distribution of the training data, but can produce unpredictable and erroneous outputs on out-of-distribution inputs.
2 code implementations • 20 May 2022 • Michael Janner, Yilun Du, Joshua B. Tenenbaum, Sergey Levine
Model-based reinforcement learning methods often use learning only for the purpose of estimating an approximate dynamics model, offloading the rest of the decision-making work to classical trajectory optimizers.
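The conventional recipe described here, a learned model handed off to a classical optimizer, can be sketched with random-shooting trajectory optimization; `dynamics` and `reward` are hypothetical stand-ins for the learned components:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_shooting(dynamics, reward, s0, horizon=10, n_candidates=100):
    """Classical trajectory optimization on top of a learned model:
    sample candidate action sequences, roll each through the (learned)
    dynamics, and keep the sequence with the highest total reward.
    """
    best_actions, best_return = None, -np.inf
    for _ in range(n_candidates):
        actions = rng.uniform(-1, 1, size=horizon)
        s, total = s0, 0.0
        for a in actions:
            s = dynamics(s, a)
            total += reward(s, a)
        if total > best_return:
            best_actions, best_return = actions, total
    return best_actions
```

Random shooting is only the simplest such optimizer; the same interface accommodates cross-entropy-method or gradient-based planners.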
1 code implementation • ICML Workshop URL 2021 • Michael Janner, Qiyang Li, Sergey Levine
However, we can also view RL as a sequence modeling problem, where the goal is to predict a sequence of actions that leads to a sequence of high rewards.
2 code implementations • NeurIPS 2021 • Michael Janner, Qiyang Li, Sergey Levine
Reinforcement learning (RL) is typically concerned with estimating stationary policies or single-step models, leveraging the Markov property to factorize problems in time.
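The sequence-modeling view can be sketched by discretizing transitions into a single token stream that a standard autoregressive model could consume; the uniform binning and function name below are illustrative, not the paper's exact tokenizer:

```python
import numpy as np

def flatten_trajectory(states, actions, rewards, n_bins=32):
    """Discretize (state, action, reward) transitions into one token
    stream, so a sequence model can predict actions the way a language
    model predicts words. Assumes values lie in [0, 1] for simplicity.
    """
    def tokenize(x):
        return np.clip((np.asarray(x) * n_bins).astype(int), 0, n_bins - 1)

    tokens = []
    for s, a, r in zip(states, actions, rewards):
        tokens.extend(tokenize(s))       # state dimensions
        tokens.extend(tokenize(a))       # action dimensions
        tokens.append(int(tokenize(r)))  # scalar reward
    return tokens
```

The point of the flattening is that no Markov factorization is imposed: the model may condition on the entire token history.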
1 code implementation • NeurIPS 2020 • Michael Janner, Igor Mordatch, Sergey Levine
We introduce the gamma-model, a predictive model of environment dynamics with an infinite probabilistic horizon.
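One way to read the infinite probabilistic horizon: a sample from the gamma-discounted state occupancy can be drawn by iterating a one-step model for a geometrically distributed number of steps, and the gamma-model amortizes this into a single prediction. A sketch under that interpretation, with `step_fn` as a hypothetical one-step dynamics function:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_discounted_state(step_fn, s0, gamma=0.99):
    """Sample a state from the gamma-discounted occupancy by rolling a
    one-step model for a Geometric(1 - gamma) number of steps, i.e.
    P(H = k) = gamma**(k - 1) * (1 - gamma). As gamma -> 0 the horizon
    collapses to one step, recovering a single-step model.
    """
    horizon = rng.geometric(1 - gamma)
    s = s0
    for _ in range(horizon):
        s = step_fn(s)
    return s
```

The gamma-model replaces this iterated rollout with a learned model that produces such samples directly.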
1 code implementation • 28 Oct 2019 • Rishi Veerapaneni, John D. Co-Reyes, Michael Chang, Michael Janner, Chelsea Finn, Jiajun Wu, Joshua B. Tenenbaum, Sergey Levine
This paper tests the hypothesis that modeling a scene in terms of entities and their local interactions, as opposed to modeling the scene globally, provides a significant benefit in generalizing to physical tasks in a combinatorial space the learner has not encountered before.
11 code implementations • NeurIPS 2019 • Michael Janner, Justin Fu, Marvin Zhang, Sergey Levine
Designing effective model-based reinforcement learning algorithms is difficult because the ease of data generation must be weighed against the bias of model-generated data.
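This trade-off motivates generating synthetic data with short model rollouts branched from real states, so that model bias compounds for only a few steps; a minimal sketch, where `model_step` and `policy` are hypothetical stand-ins for a learned dynamics model and the current policy:

```python
def branched_rollouts(model_step, policy, real_states, k=5):
    """Generate synthetic transitions by branching k-step model
    rollouts off states drawn from the real replay buffer, so model
    error accumulates over at most k steps rather than a full episode.
    """
    synthetic = []
    for s in real_states:
        for _ in range(k):
            a = policy(s)
            s_next = model_step(s, a)
            synthetic.append((s, a, s_next))
            s = s_next
    return synthetic
```

Short branch lengths keep the cheap model-generated data while bounding how far rollouts drift from the real state distribution.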
no code implementations • ICLR 2019 • Michael Janner, Sergey Levine, William T. Freeman, Joshua B. Tenenbaum, Chelsea Finn, Jiajun Wu
Object-based factorizations provide a useful level of abstraction for interacting with the world.
no code implementations • NeurIPS 2017 • Michael Janner, Jiajun Wu, Tejas D. Kulkarni, Ilker Yildirim, Joshua B. Tenenbaum
Intrinsic decomposition from a single image is a highly challenging task, due to its inherent ambiguity and the scarcity of training data.
1 code implementation • TACL 2018 • Michael Janner, Karthik Narasimhan, Regina Barzilay
The interpretation of spatial references is highly contextual, requiring joint inference over both language and the environment.