Search Results for author: Valentin Dalibard

Found 8 papers, 2 papers with code

Rapid training of deep neural networks without skip connections or normalization layers using Deep Kernel Shaping

1 code implementation • 5 Oct 2021 • James Martens, Andy Ballard, Guillaume Desjardins, Grzegorz Swirszcz, Valentin Dalibard, Jascha Sohl-Dickstein, Samuel S. Schoenholz

Using an extended and formalized version of the Q/C map analysis of Poole et al. (2016), along with Neural Tangent Kernel theory, we identify the main pathologies present in deep networks that prevent them from training fast and generalizing to unseen data, and show how these can be avoided by carefully controlling the "shape" of the network's initialization-time kernel function.

Faster Improvement Rate Population Based Training

no code implementations • 28 Sep 2021 • Valentin Dalibard, Max Jaderberg

Our experiments show that FIRE PBT is able to outperform PBT on the ImageNet benchmark and match the performance of networks that were trained with a hand-tuned learning rate schedule.

Perception-Prediction-Reaction Agents for Deep Reinforcement Learning

no code implementations • 26 Jun 2020 • Adam Stooke, Valentin Dalibard, Siddhant M. Jayakumar, Wojciech M. Czarnecki, Max Jaderberg

We employ a temporal hierarchy, using a slow-ticking recurrent core to allow information to flow more easily over long time spans, and three fast-ticking recurrent cores with connections designed to create an information asymmetry.
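The slow-ticking/fast-ticking arrangement described above could be sketched roughly as follows. This is a hypothetical illustration, assuming simple tanh recurrences: the state sizes, update rules, and tick period are illustrative choices, not details taken from the paper.

```python
import numpy as np

class TemporalHierarchy:
    """Illustrative sketch: one slow recurrent core that ticks every
    `period` steps, plus several fast cores that tick every step and
    read the slow core's latest state (the information asymmetry is
    that fast cores see the slow state, not vice versa per step)."""

    def __init__(self, dim=8, period=4, n_fast=3, seed=0):
        rng = np.random.default_rng(seed)
        self.period = period
        self.W_slow = rng.normal(scale=0.1, size=(dim, dim))
        self.W_fast = [rng.normal(scale=0.1, size=(dim, dim))
                       for _ in range(n_fast)]
        self.slow_state = np.zeros(dim)
        self.fast_states = [np.zeros(dim) for _ in range(n_fast)]
        self.t = 0

    def step(self, x):
        if self.t % self.period == 0:
            # slow core ticks infrequently, so gradients/information
            # flow over long time spans in few recurrent steps
            self.slow_state = np.tanh(self.W_slow @ (self.slow_state + x))
        # fast cores tick every step, conditioned on the slow state
        self.fast_states = [
            np.tanh(W @ (h + x + self.slow_state))
            for W, h in zip(self.W_fast, self.fast_states)
        ]
        self.t += 1
        return np.concatenate(self.fast_states)
```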


A Generalized Framework for Population Based Training

no code implementations • 5 Feb 2019 • Ang Li, Ola Spyra, Sagi Perel, Valentin Dalibard, Max Jaderberg, Chenjie Gu, David Budden, Tim Harley, Pramod Gupta

Population Based Training (PBT) is a recent approach that jointly optimizes neural network weights and hyperparameters, periodically copying the weights of the best performers and mutating hyperparameters during training.
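The exploit-and-explore loop described in that summary can be sketched as follows. This is a minimal hypothetical sketch, not the paper's implementation: the quartile cutoff, worker representation, and the user-supplied `train`/`evaluate`/`mutate` callables are illustrative assumptions.

```python
import random

def pbt_step(population, train, evaluate, mutate):
    """One PBT iteration over a list of workers.

    Each worker is a dict with 'weights' and 'hparams'.
    train/evaluate/mutate are user-supplied callables.
    """
    for worker in population:
        train(worker)                      # partial training
        worker["score"] = evaluate(worker)
    population.sort(key=lambda w: w["score"], reverse=True)
    cutoff = max(1, len(population) // 4)  # bottom quartile exploits the top
    for loser in population[-cutoff:]:
        winner = random.choice(population[:cutoff])
        loser["weights"] = dict(winner["weights"])    # exploit: copy weights
        loser["hparams"] = mutate(winner["hparams"])  # explore: perturb hparams
    return population
```

In a real setting `train` would run a fixed number of SGD steps and `evaluate` would measure validation performance; here they are stubs.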

Population Based Training of Neural Networks

6 code implementations • 27 Nov 2017 • Max Jaderberg, Valentin Dalibard, Simon Osindero, Wojciech M. Czarnecki, Jeff Donahue, Ali Razavi, Oriol Vinyals, Tim Green, Iain Dunning, Karen Simonyan, Chrisantha Fernando, Koray Kavukcuoglu

Neural networks dominate the modern machine learning landscape, but their training and success still suffer from sensitivity to empirical choices of hyperparameters such as model architecture, loss function, and optimisation algorithm.

Machine Translation Model Selection

Tuning the Scheduling of Distributed Stochastic Gradient Descent with Bayesian Optimization

no code implementations • 1 Dec 2016 • Valentin Dalibard, Michael Schaarschmidt, Eiko Yoneki

We present an optimizer which uses Bayesian optimization to tune the system parameters of distributed stochastic gradient descent (SGD).
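A Bayesian-optimization loop of the kind described above might look like the following sketch. This is a generic illustration, not the paper's optimizer: the Gaussian-process surrogate with an RBF kernel, the expected-improvement acquisition, and the one-dimensional tuned parameter are all simplifying assumptions.

```python
import numpy as np
from math import erf

def rbf(a, b, ls=1.0):
    """Squared-exponential kernel between two 1-D point sets."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    """GP posterior mean and stddev at candidate points Xs."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    mu = Ks.T @ np.linalg.solve(K, y)
    cov = rbf(Xs, Xs) - Ks.T @ np.linalg.solve(K, Ks)
    return mu, np.sqrt(np.clip(np.diag(cov), 1e-12, None))

def expected_improvement(mu, sigma, best):
    z = (mu - best) / sigma
    phi = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)      # standard normal pdf
    Phi = 0.5 * (1 + np.vectorize(erf)(z / np.sqrt(2)))   # standard normal cdf
    return (mu - best) * Phi + sigma * phi

def bayes_opt(objective, candidates, n_init=3, n_iter=10, seed=0):
    """Maximize `objective` over a finite candidate set."""
    rng = np.random.default_rng(seed)
    X = list(rng.choice(candidates, size=n_init, replace=False))
    y = [objective(x) for x in X]
    for _ in range(n_iter):
        mu, sigma = gp_posterior(X, y, candidates)
        x_next = candidates[int(np.argmax(expected_improvement(mu, sigma, max(y))))]
        X.append(x_next)
        y.append(objective(x_next))
    return X[int(np.argmax(y))]
```

In the paper's setting the objective would be an expensive measurement (e.g. throughput of a distributed SGD run under a given system configuration); here a cheap synthetic function stands in for it.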

Learning Runtime Parameters in Computer Systems with Delayed Experience Injection

no code implementations • 31 Oct 2016 • Michael Schaarschmidt, Felix Gessert, Valentin Dalibard, Eiko Yoneki

This paper investigates the use of deep reinforcement learning for tuning the runtime parameters of cloud databases under latency constraints.

