29 papers with code • 0 benchmarks • 0 datasets
Use second-order statistics to process data.
Most implemented papers
Second-Order Stochastic Optimization for Machine Learning in Linear Time
First-order stochastic methods are the state-of-the-art in large-scale machine learning optimization owing to efficient per-iteration complexity.
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
We introduce ADAHESSIAN, a second order stochastic optimization algorithm which dynamically incorporates the curvature of the loss function via ADAptive estimates of the HESSIAN.
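One common way to obtain curvature estimates of this kind without ever forming the Hessian is Hutchinson's randomized estimator of the Hessian diagonal, which needs only Hessian-vector products. The sketch below is a minimal illustration on a small fixed matrix, not the paper's implementation: the function name `hutchinson_diag`, the toy matrix `H`, and the sample count are assumptions made for the example.

```python
import numpy as np

def hutchinson_diag(hvp, dim, n_samples=100, rng=None):
    """Estimate diag(H) from Hessian-vector products only.

    Uses E[z * (H z)] = diag(H) for Rademacher probes z (entries +1 or -1).
    """
    rng = np.random.default_rng() if rng is None else rng
    est = np.zeros(dim)
    for _ in range(n_samples):
        z = rng.choice([-1.0, 1.0], size=dim)  # Rademacher probe vector
        est += z * hvp(z)                      # elementwise product z * (H z)
    return est / n_samples

# Toy symmetric "Hessian" (hypothetical); in practice hvp would come from autodiff.
H = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 0.5],
              [0.0, 0.5, 2.0]])
hvp = lambda v: H @ v

diag_est = hutchinson_diag(hvp, dim=3, n_samples=20000,
                           rng=np.random.default_rng(0))
```

With enough probes, `diag_est` approaches the true diagonal of `H`. An adaptive optimizer can then use such an estimate in place of the squared-gradient statistics that first-order adaptive methods rely on.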
Newtonian Monte Carlo: single-site MCMC meets second-order gradient methods
NMC is similar to the Newton-Raphson update in optimization where the second order gradient is used to automatically scale the step size in each dimension.
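To make the per-dimension scaling concrete, here is a minimal sketch of a single Newton-Raphson step on a badly scaled quadratic, compared with a plain gradient step. It illustrates only the optimization update the snippet refers to, not the NMC sampler itself. The quadratic, the step size 0.01, and the helper name `newton_step` are assumptions chosen for the example.

```python
import numpy as np

def newton_step(x, grad, hess):
    """One Newton-Raphson update: x - H(x)^{-1} g(x)."""
    return x - np.linalg.solve(hess(x), grad(x))

# f(x) = 0.5 * x^T A x - b^T x, with curvature 100 in one axis and 1 in the other.
A = np.diag([100.0, 1.0])
b = np.array([1.0, 1.0])
grad = lambda x: A @ x - b
hess = lambda x: A

x0 = np.zeros(2)
x_newton = newton_step(x0, grad, hess)  # lands on the minimizer A^{-1} b
x_gd = x0 - 0.01 * grad(x0)             # one fixed step size for both axes
```

Here `x_newton` equals `[0.01, 1.0]`, the exact minimizer, because the Hessian rescales each coordinate by its own curvature. The gradient step yields `[0.01, 0.01]`: a step size small enough for the steep axis barely moves the flat one.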
Low Rank Saddle Free Newton: A Scalable Method for Stochastic Nonconvex Optimization
In this work we motivate the extension of Newton methods to the stochastic approximation (SA) regime, and argue for the use of the scalable low rank saddle free Newton (LRSFN) method, which avoids forming the Hessian in favor of making a low rank approximation.
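The idea of a low rank, saddle free step can be sketched as follows: build a low rank eigen-approximation of the Hessian from Hessian-vector products alone, replace the eigenvalues with their absolute values so that negative curvature directions become descent directions, and apply the inverse through the Woodbury identity. This is a simplified illustration, not the paper's algorithm. The randomized range finder, the `damping` parameter, and the function name `low_rank_sfn_step` are assumptions made for the example.

```python
import numpy as np

def low_rank_sfn_step(grad, hvp, dim, rank, damping=1.0, rng=None):
    """Step p = -(V |L| V^T + damping * I)^{-1} grad, with H ~= V L V^T.

    Only Hessian-vector products (hvp) are used; H is never formed.
    """
    rng = np.random.default_rng() if rng is None else rng
    omega = rng.standard_normal((dim, rank))
    # Randomized range finder: sample the range of H, then orthonormalize.
    Y = np.column_stack([hvp(omega[:, j]) for j in range(rank)])
    Q, _ = np.linalg.qr(Y)
    # Project H onto the basis and solve the small eigenproblem.
    HQ = np.column_stack([hvp(Q[:, j]) for j in range(rank)])
    T = Q.T @ HQ
    lam, U = np.linalg.eigh(0.5 * (T + T.T))
    V = Q @ U                       # approximate eigenvectors of H
    d = np.abs(lam)                 # saddle free: use |eigenvalues|
    # Woodbury identity for (damping * I + V diag(d) V^T)^{-1} grad.
    coeff = V.T @ grad
    return -(grad - V @ (d / (d + damping) * coeff)) / damping

# Toy Hessian with one positive and one negative eigenvalue (a saddle).
dim = 6
H = np.zeros((dim, dim))
H[0, 0], H[1, 1] = 5.0, -3.0
g = np.ones(dim)
p = low_rank_sfn_step(g, lambda v: H @ v, dim, rank=2,
                      rng=np.random.default_rng(0))
```

Because the modified matrix is positive definite, the resulting `p` is always a descent direction (`g @ p < 0`), including along the negative curvature axis, where a plain Newton step would move toward the saddle.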
Near out-of-distribution detection for low-resolution radar micro-Doppler signatures
We emphasize the relevance of OODD and its specific supervision requirements for the detection of a multimodal, diverse target class among other similar radar targets and clutter in real-life critical systems.
Optimization Methods for Supervised Machine Learning: From Linear Models to Deep Learning
We then discuss some of the distinctive features of these optimization problems, focusing on the examples of logistic regression and the training of deep neural networks.
Online Second Order Methods for Non-Convex Stochastic Optimizations
This paper proposes a family of online second order methods for possibly non-convex stochastic optimization based on the theory of preconditioned stochastic gradient descent (PSGD), which can be regarded as an enhanced stochastic Newton method with the ability to handle gradient noise and non-convexity simultaneously.
Large batch size training of neural networks with adversarial training and second-order information
Our method exceeds the performance of existing solutions in terms of both accuracy and the number of SGD iterations (up to 1% and 5×, respectively).
Stochastic Trust Region Inexact Newton Method for Large-scale Machine Learning
Nowadays, stochastic approximation methods are one of the major research directions for dealing with large-scale machine learning problems.
Improving SGD convergence by online linear regression of gradients in multiple statistically relevant directions
Deep neural networks are usually trained with stochastic gradient descent (SGD), which minimizes the objective function using very rough approximations of the gradient that only average to the true gradient.
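The statement that minibatch gradients are rough individually but average to the true gradient can be checked numerically. The sketch below uses a synthetic least-squares problem; the data, the batch size of 10, and the helper name `grad` are assumptions made for the example and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
w_true = np.arange(1.0, 6.0)
y = X @ w_true + 0.1 * rng.standard_normal(1000)

def grad(w, Xb, yb):
    """Gradient of 0.5 * mean((Xb @ w - yb)**2) with respect to w."""
    return Xb.T @ (Xb @ w - yb) / len(yb)

w = np.zeros(5)
full = grad(w, X, y)                              # full-batch gradient

# Split the data into 100 disjoint minibatches of 10 samples each.
batches = np.split(rng.permutation(1000), 100)
batch_grads = np.array([grad(w, X[idx], y[idx]) for idx in batches])

mean_of_batches = batch_grads.mean(axis=0)        # matches `full`
max_batch_error = np.abs(batch_grads - full).max()  # single batches are noisy
```

Averaged over all equal-sized disjoint batches, the minibatch gradients reproduce the full gradient up to floating-point error, while `max_batch_error` shows how far a single batch can deviate from it.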