Search Results for author: Grzegorz Swirszcz

Found 10 papers, 3 papers with code

Stepping on the Edge: Curvature Aware Learning Rate Tuners

no code implementations 8 Jul 2024 Vincent Roulet, Atish Agarwala, Jean-Bastien Grill, Grzegorz Swirszcz, Mathieu Blondel, Fabian Pedregosa

These models break the stabilization of the sharpness, which we explain using a simplified model of the joint dynamics of the learning rate and the curvature.

Rapid training of deep neural networks without skip connections or normalization layers using Deep Kernel Shaping

2 code implementations 5 Oct 2021 James Martens, Andy Ballard, Guillaume Desjardins, Grzegorz Swirszcz, Valentin Dalibard, Jascha Sohl-Dickstein, Samuel S. Schoenholz

Using an extended and formalized version of the Q/C map analysis of Poole et al. (2016), along with Neural Tangent Kernel theory, we identify the main pathologies present in deep networks that prevent them from training fast and generalizing to unseen data, and show how these can be avoided by carefully controlling the "shape" of the network's initialization-time kernel function.

Verification of Non-Linear Specifications for Neural Networks

no code implementations ICLR 2019 Chongli Qin, Krishnamurthy Dvijotham, Brendan O'Donoghue, Rudy Bunel, Robert Stanforth, Sven Gowal, Jonathan Uesato, Grzegorz Swirszcz, Pushmeet Kohli

We show that a number of important properties of interest can be modeled within this class, including conservation of energy in a learned dynamics model of a physical system; semantic consistency of a classifier's output labels under adversarial perturbations; and bounding errors in a system that predicts the summation of handwritten digits.

Distilling Policy Distillation

no code implementations 6 Feb 2019 Wojciech Marian Czarnecki, Razvan Pascanu, Simon Osindero, Siddhant M. Jayakumar, Grzegorz Swirszcz, Max Jaderberg

The transfer of knowledge from one policy to another is an important tool in Deep Reinforcement Learning.

Deep Reinforcement Learning

Strength in Numbers: Trading-off Robustness and Computation via Adversarially-Trained Ensembles

no code implementations ICLR 2019 Edward Grefenstette, Robert Stanforth, Brendan O'Donoghue, Jonathan Uesato, Grzegorz Swirszcz, Pushmeet Kohli

We show that increasing the number of parameters in adversarially-trained models increases their robustness, and in particular that ensembling smaller models while adversarially training the entire ensemble as a single model is a more efficient way of spending a given parameter budget than simply using a larger single model.

Self-Driving Cars

Local minima in training of neural networks

1 code implementation 19 Nov 2016 Grzegorz Swirszcz, Wojciech Marian Czarnecki, Razvan Pascanu

Given that deep networks are highly nonlinear systems optimized by local gradient methods, why do they not seem to be affected by bad local minima?

Grouped Orthogonal Matching Pursuit for Variable Selection and Prediction

no code implementations NeurIPS 2009 Grzegorz Swirszcz, Naoki Abe, Aurelie C. Lozano

We consider the problem of variable group selection for least squares regression, namely, that of selecting groups of variables for best regression performance, leveraging and adhering to a natural grouping structure within the explanatory variables.

feature selection Prediction +2
