Search Results for author: Marcin Moczulski

Found 10 papers, 4 papers with code

Self-Imitation Learning via Trajectory-Conditioned Policy for Hard-Exploration Tasks

no code implementations • 25 Sep 2019 • Yijie Guo, Jongwook Choi, Marcin Moczulski, Samy Bengio, Mohammad Norouzi, Honglak Lee

We propose a new method of learning a trajectory-conditioned policy to imitate diverse trajectories from the agent's own past experiences and show that such self-imitation helps avoid myopic behavior and increases the chance of finding a globally optimal solution for hard-exploration tasks, especially when there are misleading rewards.

Imitation Learning

Memory Based Trajectory-conditioned Policies for Learning from Sparse Rewards

no code implementations • NeurIPS 2020 • Yijie Guo, Jongwook Choi, Marcin Moczulski, Shengyu Feng, Samy Bengio, Mohammad Norouzi, Honglak Lee

Reinforcement learning with sparse rewards is challenging because an agent can rarely obtain non-zero rewards and hence, gradient-based optimization of parameterized policies can be incremental and slow.

Efficient Exploration, Imitation Learning, +1

Contingency-Aware Exploration in Reinforcement Learning

no code implementations • ICLR 2019 • Jongwook Choi, Yijie Guo, Marcin Moczulski, Junhyuk Oh, Neal Wu, Mohammad Norouzi, Honglak Lee

This paper investigates whether learning contingency-awareness and controllable aspects of an environment can lead to better exploration in reinforcement learning.

Montezuma's Revenge, reinforcement-learning

A Robust Adaptive Stochastic Gradient Method for Deep Learning

1 code implementation • 2 Mar 2017 • Caglar Gulcehre, Jose Sotelo, Marcin Moczulski, Yoshua Bengio

The information about the element-wise curvature of the loss function is estimated from the local statistics of the stochastic first order gradients.

Mollifying Networks

no code implementations • 17 Aug 2016 • Caglar Gulcehre, Marcin Moczulski, Francesco Visin, Yoshua Bengio

The optimization of deep neural networks can be more challenging than traditional convex optimization problems due to the highly non-convex nature of the loss function; e.g., it can involve pathological landscapes such as saddle-surfaces that can be difficult to escape for algorithms based on simple gradient descent.

Noisy Activation Functions

1 code implementation • 1 Mar 2016 • Caglar Gulcehre, Marcin Moczulski, Misha Denil, Yoshua Bengio

Common nonlinear activation functions used in neural networks can cause training difficulties due to the saturation behavior of the activation function, which may hide dependencies that are not visible to vanilla-SGD (using first order gradients only).

A Controller-Recognizer Framework: How necessary is recognition for control?

no code implementations • 19 Nov 2015 • Marcin Moczulski, Kelvin Xu, Aaron Courville, Kyunghyun Cho

Recently there has been growing interest in building active visual object recognizers, as opposed to the usual passive recognizers, which classify a given static image into a predefined set of object categories.

ACDC: A Structured Efficient Linear Layer

2 code implementations • 18 Nov 2015 • Marcin Moczulski, Misha Denil, Jeremy Appleyard, Nando de Freitas

Finally, this paper also provides a connection between structured linear transforms used in deep learning and the field of Fourier optics, illustrating how ACDC could in principle be implemented with lenses and diffractive elements.

ADASECANT: Robust Adaptive Secant Method for Stochastic Gradient

no code implementations • 23 Dec 2014 • Caglar Gulcehre, Marcin Moczulski, Yoshua Bengio

The convergence of SGD depends on the careful choice of learning rate and the amount of the noise in stochastic estimates of the gradients.

Deep Fried Convnets

1 code implementation • ICCV 2015 • Zichao Yang, Marcin Moczulski, Misha Denil, Nando de Freitas, Alex Smola, Le Song, Ziyu Wang

The fully connected layers of a deep convolutional neural network typically contain over 90% of the network parameters, and consume the majority of the memory required to store the network parameters.

Image Classification
