Search Results for author: Konrad Czechowski

Found 7 papers, 6 papers with code

Subgoal Search For Complex Reasoning Tasks

1 code implementation NeurIPS 2021 Konrad Czechowski, Tomasz Odrzygóźdź, Marek Zbysiński, Michał Zawalski, Krzysztof Olejnik, Yuhuai Wu, Łukasz Kuciński, Piotr Miłoś

In this paper, we implement kSubS using a transformer-based subgoal module coupled with the classical best-first search framework.

Model Based Reinforcement Learning for Atari

no code implementations ICLR 2020 Łukasz Kaiser, Mohammad Babaeizadeh, Piotr Miłos, Błażej Osiński, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski

We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting.

Atari Games Model-based Reinforcement Learning +1

Uncertainty-sensitive Learning and Planning with Ensembles

1 code implementation19 Dec 2019 Piotr Miłoś, Łukasz Kuciński, Konrad Czechowski, Piotr Kozakowski, Maciek Klimek

The former manifests itself through the use of value function, while the latter is powered by a tree search planner.

Montezuma's Revenge

Uncertainty - sensitive learning and planning with ensembles

1 code implementation25 Sep 2019 Piotr Miłoś, Łukasz Kuciński, Konrad Czechowski, Piotr Kozakowski, Maciej Klimek

Notably, our method performs well in environments with sparse rewards where standard $TD(1)$ backups fail.

Montezuma's Revenge

Model-Based Reinforcement Learning for Atari

4 code implementations1 Mar 2019 Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski

We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting.

Atari Games Atari Games 100k +2

Cannot find the paper you are looking for? You can Submit a new open access paper.