Search Results for author: Konrad Zolna

Found 23 papers, 9 papers with code

Critic Regularized Regression

5 code implementations NeurIPS 2020 Ziyu Wang, Alexander Novikov, Konrad Zolna, Jost Tobias Springenberg, Scott Reed, Bobak Shahriari, Noah Siegel, Josh Merel, Caglar Gulcehre, Nicolas Heess, Nando de Freitas

Offline reinforcement learning (RL), also known as batch RL, offers the prospect of policy optimization from large pre-recorded datasets without online environment interaction.

Offline RL, regression, +1

RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning

2 code implementations 24 Jun 2020 Caglar Gulcehre, Ziyu Wang, Alexander Novikov, Tom Le Paine, Sergio Gomez Colmenarejo, Konrad Zolna, Rishabh Agarwal, Josh Merel, Daniel Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi, Matt Hoffman, Ofir Nachum, George Tucker, Nicolas Heess, Nando de Freitas

We hope that our suite of benchmarks will increase the reproducibility of experiments and make it possible to study challenging tasks with a limited computational budget, thus making RL research both more systematic and more accessible across the community.

Atari Games, DQN Replay Dataset, +3

RL Unplugged: A Collection of Benchmarks for Offline Reinforcement Learning

1 code implementation NeurIPS 2020 Caglar Gulcehre, Ziyu Wang, Alexander Novikov, Thomas Paine, Sergio Gómez, Konrad Zolna, Rishabh Agarwal, Josh S. Merel, Daniel J. Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi, Matthew Hoffman, Nicolas Heess, Nando de Freitas

We hope that our suite of benchmarks will increase the reproducibility of experiments and make it possible to study challenging tasks with a limited computational budget, thus making RL research both more systematic and more accessible across the community.

Offline RL, reinforcement-learning, +1

Classifier-agnostic saliency map extraction

1 code implementation ICLR 2019 Konrad Zolna, Krzysztof J. Geras, Kyunghyun Cho

To address this problem, we propose classifier-agnostic saliency map extraction, which finds all parts of the image that any classifier could use, not just one given in advance.

General Classification

Fraternal Dropout

1 code implementation ICLR 2018 Konrad Zolna, Devansh Arpit, Dendi Suhubdy, Yoshua Bengio

We show that our regularization term is upper bounded by the expectation-linear dropout objective which has been shown to address the gap due to the difference between the train and inference phases of dropout.

Image Captioning, Language Modelling

Improving the Performance of Neural Networks in Regression Tasks Using Drawering

no code implementations 5 Dec 2016 Konrad Zolna

The method presented extends a given regression neural network to make its performance improve.

regression

The Dynamics of Handwriting Improves the Automated Diagnosis of Dysgraphia

no code implementations 12 Jun 2019 Konrad Zolna, Thibault Asselborn, Caroline Jolly, Laurence Casteran, Marie-Ange Nguyen-Morel, Wafa Johal, Pierre Dillenbourg

We show that incorporating the dynamic information available by the use of tablet is highly beneficial to our digital test to discriminate between typically-developing and dysgraphic children.

Task-Relevant Adversarial Imitation Learning

no code implementations 2 Oct 2019 Konrad Zolna, Scott Reed, Alexander Novikov, Sergio Gomez Colmenarejo, David Budden, Serkan Cabi, Misha Denil, Nando de Freitas, Ziyu Wang

We show that a critical vulnerability in adversarial imitation is the tendency of discriminator networks to learn spurious associations between visual features and expert labels.

Imitation Learning

Reinforced Imitation Learning from Observations

no code implementations ICLR 2019 Konrad Zolna, Negar Rostamzadeh, Yoshua Bengio, Sungjin Ahn, Pedro O. Pinheiro

Imitation learning is an effective alternative approach to learn a policy when the reward function is sparse.

Imitation Learning

Combating False Negatives in Adversarial Imitation Learning

no code implementations 2 Feb 2020 Konrad Zolna, Chitwan Saharia, Leonard Boussioux, David Yu-Tung Hui, Maxime Chevalier-Boisvert, Dzmitry Bahdanau, Yoshua Bengio

In adversarial imitation learning, a discriminator is trained to differentiate agent episodes from expert demonstrations representing the desired behavior.

Imitation Learning

Hyperparameter Selection for Offline Reinforcement Learning

no code implementations 17 Jul 2020 Tom Le Paine, Cosmin Paduraru, Andrea Michi, Caglar Gulcehre, Konrad Zolna, Alexander Novikov, Ziyu Wang, Nando de Freitas

Therefore, in this work, we focus on offline hyperparameter selection, i.e., methods for choosing the best policy from a set of many policies trained using different hyperparameters, given only logged data.

Offline RL, reinforcement-learning, +1

Addressing Extrapolation Error in Deep Offline Reinforcement Learning

no code implementations 1 Jan 2021 Caglar Gulcehre, Sergio Gómez Colmenarejo, Ziyu Wang, Jakub Sygnowski, Thomas Paine, Konrad Zolna, Yutian Chen, Matthew Hoffman, Razvan Pascanu, Nando de Freitas

These errors can be compounded by bootstrapping when the function approximator overestimates, leading the value function to grow unbounded, thereby crippling learning.

Offline RL, reinforcement-learning, +1

Offline Learning from Demonstrations and Unlabeled Experience

no code implementations 27 Nov 2020 Konrad Zolna, Alexander Novikov, Ksenia Konyushkova, Caglar Gulcehre, Ziyu Wang, Yusuf Aytar, Misha Denil, Nando de Freitas, Scott Reed

Behavior cloning (BC) is often practical for robot learning because it allows a policy to be trained offline without rewards, by supervised learning on expert demonstrations.

Continuous Control, Imitation Learning

Regularized Behavior Value Estimation

no code implementations 17 Mar 2021 Caglar Gulcehre, Sergio Gómez Colmenarejo, Ziyu Wang, Jakub Sygnowski, Thomas Paine, Konrad Zolna, Yutian Chen, Matthew Hoffman, Razvan Pascanu, Nando de Freitas

Due to bootstrapping, these errors get amplified during training and can lead to divergence, thereby crippling learning.

Offline RL

GATS: Gather-Attend-Scatter

no code implementations 16 Jan 2024 Konrad Zolna, Serkan Cabi, Yutian Chen, Eric Lau, Claudio Fantacci, Jurgis Pasukonis, Jost Tobias Springenberg, Sergio Gomez Colmenarejo

As the AI community increasingly adopts large-scale models, it is crucial to develop general and flexible tools to integrate them.
