Search Results for author: Ilya Loshchilov

Found 15 papers, 7 papers with code

SGDR: Stochastic Gradient Descent with Warm Restarts

17 code implementations • 13 Aug 2016 • Ilya Loshchilov, Frank Hutter

Restart techniques are common in gradient-free optimization to deal with multimodal functions. Partial warm restarts are also gaining popularity in gradient-based optimization to improve the rate of convergence in accelerated gradient schemes to deal with ill-conditioned functions.

EEG • Stochastic Optimization
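The schedule itself is simple enough to state in a few lines. A minimal sketch of SGDR-style cosine annealing with warm restarts, assuming restart periods that grow by a fixed factor; the constants are illustrative, not the paper's experimental settings.

```python
import math

def sgdr_lr(step, eta_min=1e-5, eta_max=0.1, t_0=10, t_mult=2):
    """Cosine-annealed learning rate with warm restarts (SGDR-style).

    step   -- global epoch (or iteration) counter, starting at 0
    t_0    -- length of the first restart period
    t_mult -- factor by which each period grows after a restart
    """
    t_i, t_cur = t_0, step
    # Walk through completed periods to locate the position in the current one.
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= t_mult
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t_cur / t_i))

# The learning rate resets to eta_max at epochs 0, 10, 30, 70, ...
schedule = [sgdr_lr(e) for e in range(100)]
```

PyTorch ships an equivalent schedule as torch.optim.lr_scheduler.CosineAnnealingWarmRestarts.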

Online Batch Selection for Faster Training of Neural Networks

1 code implementation • 19 Nov 2015 • Ilya Loshchilov, Frank Hutter

We investigate online batch selection strategies for two state-of-the-art methods of stochastic gradient-based optimization, AdaDelta and Adam.
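A minimal sketch of rank-based online batch selection, assuming sampling probabilities that decay exponentially with an example's loss rank; the exact decay schedule and loss bookkeeping in the paper may differ.

```python
import numpy as np

def select_batch(latest_losses, batch_size, s_e=100.0):
    """Sample a batch, favouring examples with a high recorded loss.

    latest_losses -- last known loss per training example
    s_e           -- selection pressure: ratio between the sampling probability
                     of the highest- and the lowest-loss example
    """
    n = len(latest_losses)
    order = np.argsort(-latest_losses)        # indices, highest loss first
    ranks = np.empty(n, dtype=int)
    ranks[order] = np.arange(n)               # rank 0 = highest loss
    probs = np.exp(-ranks * np.log(s_e) / n)  # exponential decay over ranks
    probs /= probs.sum()
    return np.random.choice(n, size=batch_size, replace=False, p=probs)
```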

Decoupled Weight Decay Regularization

20 code implementations • ICLR 2019 • Ilya Loshchilov, Frank Hutter

L$_2$ regularization and weight decay regularization are equivalent for standard stochastic gradient descent (when rescaled by the learning rate), but as we demonstrate this is \emph{not} the case for adaptive gradient algorithms, such as Adam.

Image Classification
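A minimal numpy sketch of the difference, assuming standard Adam moment updates: with L$_2$ regularization the decay term is folded into the gradient and rescaled by the adaptive denominator, whereas the decoupled variant (AdamW) shrinks the weights directly. Scaling the decoupled term by the learning rate follows common implementations, not necessarily the paper's exact parameterization.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8,
              wd=1e-2, decoupled=True):
    """One Adam step with either L2-style or decoupled weight decay."""
    if not decoupled:
        grad = grad + wd * theta            # L2: decay enters the moment estimates
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad**2
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    if decoupled:
        theta = theta - lr * wd * theta     # AdamW: decay applied to the weights directly
    return theta, m, v
```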

Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari

1 code implementation • 24 Feb 2018 • Patryk Chrabaszcz, Ilya Loshchilov, Frank Hutter

Evolution Strategies (ES) have recently been demonstrated to be a viable alternative to reinforcement learning (RL) algorithms on a set of challenging deep RL problems, including Atari games and MuJoCo humanoid locomotion benchmarks.

Atari Games • Benchmarking • +1
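A minimal sketch of the kind of canonical (mu, lambda) evolution strategy benchmarked in the paper, assuming log-linear recombination weights and a fixed mutation step size; the Atari-specific machinery (policy network parameterization, virtual batch normalization, parallel evaluation) is omitted.

```python
import numpy as np

def canonical_es(f, x0, sigma=0.05, lam=40, mu=10, iters=200, seed=0):
    """Maximise f with a simple (mu, lambda) evolution strategy."""
    rng = np.random.default_rng(seed)
    mean = np.array(x0, dtype=float)
    # Log-linear recombination weights over the mu best offspring.
    w = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
    w /= w.sum()
    for _ in range(iters):
        eps = rng.standard_normal((lam, mean.size))
        candidates = mean + sigma * eps
        fitness = np.array([f(c) for c in candidates])
        best = np.argsort(-fitness)[:mu]          # indices of the mu best offspring
        mean = mean + sigma * (w @ eps[best])     # weighted recombination step
    return mean
```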

Self-Adaptive Surrogate-Assisted Covariance Matrix Adaptation Evolution Strategy

1 code implementation • 11 Apr 2012 • Ilya Loshchilov, Marc Schoenauer, Michèle Sebag

The resulting algorithm, saACM-ES, adjusts online the lifelength of the current surrogate model (the number of CMA-ES generations before learning a new surrogate) and the surrogate hyper-parameters.
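A minimal sketch of the lifelength-control idea, assuming surrogate quality is scored by the rank correlation between surrogate and true fitness on freshly evaluated points; the paper's actual error measure and update rule may differ.

```python
import numpy as np

def adjust_lifelength(surrogate_values, true_values, n_max=20):
    """Adapt the number of CMA-ES generations run on the surrogate alone."""
    # Spearman-style rank correlation between surrogate and true fitness.
    r_s = np.argsort(np.argsort(surrogate_values)).astype(float)
    r_t = np.argsort(np.argsort(true_values)).astype(float)
    quality = max(np.corrcoef(r_s, r_t)[0, 1], 0.0)
    # Trust a good surrogate for more generations before re-learning it.
    return int(max(1, min(n_max, round(quality * n_max))))
```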

Limited-Memory Matrix Adaptation for Large Scale Black-box Optimization

2 code implementations • 18 May 2017 • Ilya Loshchilov, Tobias Glasmachers, Hans-Georg Beyer

The Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is a popular method to deal with nonconvex and/or stochastic optimization problems when the gradient information is not available.

Stochastic Optimization
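A heavily simplified sketch of the covariance-adaptation idea, assuming only a rank-mu update of the covariance from the best offspring; full CMA-ES (and the paper's limited-memory matrix adaptation variant) additionally uses evolution paths and step-size control.

```python
import numpy as np

def simple_cma(f, x0, sigma=0.5, lam=20, mu=5, iters=100, lr_c=0.2, seed=0):
    """Minimise f with a toy rank-mu covariance-adapting evolution strategy."""
    rng = np.random.default_rng(seed)
    mean = np.array(x0, dtype=float)
    n = mean.size
    C = np.eye(n)                                  # covariance of the search distribution
    for _ in range(iters):
        A = np.linalg.cholesky(C)
        z = rng.standard_normal((lam, n))
        y = z @ A.T                                # correlated steps with Cov(y) = C
        candidates = mean + sigma * y
        fitness = np.array([f(c) for c in candidates])
        best = np.argsort(fitness)[:mu]
        y_best = y[best]
        mean = mean + sigma * y_best.mean(axis=0)  # move the mean towards good steps
        C_mu = (y_best[:, :, None] * y_best[:, None, :]).mean(axis=0)
        C = (1 - lr_c) * C + lr_c * C_mu           # rank-mu covariance update
    return mean
```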

Anytime Bi-Objective Optimization with a Hybrid Multi-Objective CMA-ES (HMO-CMA-ES)

no code implementations • 9 May 2016 • Ilya Loshchilov, Tobias Glasmachers

We propose a multi-objective optimization algorithm aimed at achieving good anytime performance over a wide range of problems.

Benchmarking

CMA-ES for Hyperparameter Optimization of Deep Neural Networks

no code implementations • 25 Apr 2016 • Ilya Loshchilov, Frank Hutter

Hyperparameters of deep neural networks are often optimized by grid search, random search or Bayesian optimization.

Bayesian Optimization • Hyperparameter Optimization
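A minimal sketch of CMA-ES-driven hyperparameter search, assuming the third-party pycma package (`pip install cma`) and a hypothetical train_and_validate function; hyperparameters are encoded on log scales, which is common practice but not necessarily the paper's exact setup.

```python
import numpy as np
import cma  # third-party pycma package

def objective(x):
    """Map a continuous vector to hyperparameters and return the validation error."""
    lr = 10 ** x[0]                       # learning rate, log10 scale
    weight_decay = 10 ** x[1]             # weight decay, log10 scale
    batch_size = int(round(2 ** x[2]))    # batch size, log2 scale
    # train_and_validate is a hypothetical user-supplied training routine.
    return train_and_validate(lr=lr, weight_decay=weight_decay, batch_size=batch_size)

es = cma.CMAEvolutionStrategy([-3.0, -4.0, 6.0], 1.0)   # initial point, initial step size
while not es.stop():
    solutions = es.ask()
    es.tell(solutions, [objective(np.asarray(s)) for s in solutions])
best_vector = es.result.xbest
```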

LM-CMA: an Alternative to L-BFGS for Large Scale Black-box Optimization

no code implementations • 1 Nov 2015 • Ilya Loshchilov

The invariance properties of the algorithm do not prevent it from achieving performance comparable to L-BFGS on non-trivial large-scale smooth and nonsmooth optimization problems.
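The invariance argument rests on the algorithm being comparison-based: only the ranking of fitness values enters the update, so any strictly increasing transformation of the objective leaves the search unchanged. A tiny illustration, with an arbitrary monotone transform chosen for the example:

```python
import numpy as np

f_values = np.array([3.2, -1.0, 7.5, 0.4])
g_values = np.exp(f_values) + 10.0   # strictly increasing transform of f

# Rank-based selection sees exactly the same ordering in both cases.
assert np.array_equal(np.argsort(f_values), np.argsort(g_values))
```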

Maximum Likelihood-based Online Adaptation of Hyper-parameters in CMA-ES

no code implementations • 10 Jun 2014 • Ilya Loshchilov, Marc Schoenauer, Michèle Sebag, Nikolaus Hansen

The Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is widely accepted as a robust derivative-free continuous optimization algorithm for non-linear and non-convex optimization problems.

A Computationally Efficient Limited Memory CMA-ES for Large Scale Optimization

no code implementations • 21 Apr 2014 • Ilya Loshchilov

We propose a computationally efficient limited memory Covariance Matrix Adaptation Evolution Strategy for large scale optimization, which we call the LM-CMA-ES.
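The limited-memory idea can be illustrated by storing the covariance implicitly as the identity plus a few rank-one terms, so memory and per-sample cost grow with m·n rather than n². The sketch below makes that assumption and is not the paper's actual Cholesky-factor reconstruction.

```python
import numpy as np

def sample_limited_memory(mean, sigma, directions, betas, rng):
    """Draw x ~ N(mean, sigma^2 * C) with C = I + sum_j betas[j] * p_j p_j^T,
    without ever forming the n x n matrix C.

    directions -- array of shape (m, n): the m stored direction vectors p_j
    betas      -- m non-negative weights
    """
    n = mean.size
    m = len(betas)
    y = rng.standard_normal(n)                        # isotropic part
    coeffs = np.sqrt(betas) * rng.standard_normal(m)  # one scalar per stored direction
    y = y + coeffs @ directions                       # add the low-rank part
    return mean + sigma * y
```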

KL-based Control of the Learning Schedule for Surrogate Black-Box Optimization

no code implementations • 12 Aug 2013 • Ilya Loshchilov, Marc Schoenauer, Michèle Sebag

This weakness is commonly addressed through surrogate optimization: learning an estimate of the objective function, a.k.a. a surrogate model.
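A minimal sketch of the surrogate loop: fit a cheap regression model to already-evaluated points and spend true objective evaluations only on the candidates it ranks best. The model choice and screening rule here are illustrative; work in this line typically uses rank-based surrogates.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def surrogate_prescreen(f, archive_x, archive_y, candidates, n_true_evals=3):
    """Evaluate only the candidates the surrogate ranks best on the true f.

    archive_x, archive_y -- points evaluated so far and their true objective values
    candidates           -- array of new candidate points, shape (k, n)
    """
    model = RandomForestRegressor(n_estimators=100).fit(archive_x, archive_y)
    predicted = model.predict(candidates)
    keep = np.argsort(predicted)[:n_true_evals]   # lowest predicted value = most promising
    new_y = np.array([f(c) for c in candidates[keep]])
    # Grow the archive so the surrogate improves over time.
    archive_x = np.vstack([archive_x, candidates[keep]])
    archive_y = np.concatenate([archive_y, new_y])
    return archive_x, archive_y
```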

Fixing Weight Decay Regularization in Adam

no code implementations • ICLR 2018 • Ilya Loshchilov, Frank Hutter

We note that common implementations of adaptive gradient algorithms, such as Adam, limit the potential benefit of weight decay regularization, because the weights do not decay multiplicatively (as would be expected for standard weight decay) but by an additive constant factor.

Image Classification

Weight Norm Control

no code implementations • 19 Nov 2023 • Ilya Loshchilov

We note that decoupled weight decay regularization is a particular case of weight norm control where the target norm of weights is set to 0.
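One reading of that statement: instead of shrinking weights toward zero, decay them toward a rescaled copy with a prescribed target norm, so that a target norm of 0 recovers ordinary decoupled weight decay. A minimal sketch under that assumption; the paper's exact update may differ.

```python
import numpy as np

def weight_norm_control_step(theta, lr, wd, target_norm=0.0, eps=1e-12):
    """Decay weights toward the closest vector with the target norm.

    With target_norm == 0 this reduces to theta <- (1 - lr * wd) * theta,
    i.e. ordinary decoupled weight decay.
    """
    direction = theta / (np.linalg.norm(theta) + eps)
    target = target_norm * direction
    return theta - lr * wd * (theta - target)
```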
