Search Results for author: Aaron Defazio

Found 24 papers, 11 papers with code

Stochastic Polyak Stepsize with a Moving Target

no code implementations 22 Jun 2021 Robert M. Gower, Aaron Defazio, Michael Rabbat

MOTAPS can be seen as a variant of the Stochastic Polyak (SP) stepsize, which also uses loss values to adjust the stepsize.

Image Classification · Translation
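
The classic SP rule that MOTAPS builds on is short enough to sketch. The following is a minimal illustration of the standard Stochastic Polyak stepsize (not the MOTAPS variant itself); the `gamma_max` cap and the quadratic test function are illustrative choices, not details from the paper:

```python
import numpy as np

def sp_stepsize(loss, grad, loss_min=0.0, gamma_max=1.0):
    """Classic Stochastic Polyak (SP) stepsize: the loss gap over the
    squared gradient norm, capped at gamma_max for stability."""
    g2 = np.dot(grad, grad)
    if g2 == 0.0:
        return 0.0
    return min((loss - loss_min) / g2, gamma_max)

# One SGD step on f(x) = 0.5 * ||x||^2, whose minimum loss is 0.
x = np.array([3.0, 4.0])
loss = 0.5 * np.dot(x, x)   # 12.5
grad = x                    # gradient of f at x
x = x - sp_stepsize(loss, grad) * grad
```

Here the stepsize evaluates to (12.5 - 0) / 25 = 0.5, so the step moves the iterate halfway to the minimizer.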

Dual Averaging is Surprisingly Effective for Deep Learning Optimization

no code implementations 20 Oct 2020 Samy Jelassi, Aaron Defazio

First-order stochastic optimization methods are currently the most widely used class of methods for training deep neural networks.

Stochastic Optimization
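
As a rough sketch of the dual-averaging idea the paper studies, the following implements a Nesterov-style dual-averaging loop on a toy quadratic: the iterate is driven by the running sum of all past gradients rather than only the latest one. The η/√t schedule and the test function are illustrative assumptions, not the paper's recipe:

```python
import numpy as np

def dual_averaging(grad_fn, x0, steps=100, eta=1.0):
    """Dual averaging: accumulate gradients in a dual variable z and
    map back to a primal iterate with a decaying eta / sqrt(t) scale."""
    x0 = np.asarray(x0, dtype=float)
    x = x0.copy()
    z = np.zeros_like(x)                 # accumulated (dual) gradient sum
    for t in range(1, steps + 1):
        z += grad_fn(x)
        x = x0 - eta / np.sqrt(t) * z    # primal iterate from the average
    return x

# Minimize f(x) = 0.5 * ||x||^2, whose gradient is x itself.
x_star = dual_averaging(lambda x: x, [4.0, -2.0], steps=2000, eta=0.5)
```

On this problem the iterates drift toward the minimizer at roughly an O(1/√t) rate, which is the behavior the general convex theory predicts.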

Momentum via Primal Averaging: Theoretical Insights and Learning Rate Schedules for Non-Convex Optimization

1 code implementation 1 Oct 2020 Aaron Defazio

Momentum methods are now used pervasively within the machine learning community for training non-convex models such as deep neural networks.
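
The momentum update the abstract refers to is the standard heavy-ball rule found in deep learning libraries; a minimal sketch on a toy quadratic, with the learning rate and β values as illustrative choices:

```python
import numpy as np

def momentum_step(x, v, grad, lr=0.1, beta=0.9):
    """Heavy-ball / SGD-with-momentum update: accumulate a velocity
    from past gradients, then step along it."""
    v = beta * v + grad
    x = x - lr * v
    return x, v

# Minimize f(x) = 0.5 * ||x||^2; the gradient at x is x itself.
x = np.array([10.0, -6.0])
v = np.zeros_like(x)
for _ in range(200):
    x, v = momentum_step(x, v, grad=x)
```

The velocity term lets the iterate keep moving in a consistent descent direction, which is why the update converges faster than plain gradient descent on ill-conditioned problems.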

Almost sure convergence rates for Stochastic Gradient Descent and Stochastic Heavy Ball

no code implementations 14 Jun 2020 Othmane Sebbouh, Robert M. Gower, Aaron Defazio

We show that these results still hold when using stochastic line search and stochastic Polyak stepsizes, thereby giving the first proof of convergence of these methods in the non-overparametrized regime.

The Power of Factorial Powers: New Parameter settings for (Stochastic) Optimization

no code implementations 1 Jun 2020 Aaron Defazio, Robert M. Gower

The convergence rates for convex and non-convex optimization methods depend on the choice of a host of constants, including step sizes, Lyapunov function constants and momentum constants.

Stochastic Optimization

End-to-End Variational Networks for Accelerated MRI Reconstruction

3 code implementations 14 Apr 2020 Anuroop Sriram, Jure Zbontar, Tullie Murrell, Aaron Defazio, C. Lawrence Zitnick, Nafissa Yakubova, Florian Knoll, Patricia Johnson

The slow acquisition speed of magnetic resonance imaging (MRI) has led to the development of two complementary methods: acquiring multiple views of the anatomy simultaneously (parallel imaging) and acquiring fewer samples than necessary for traditional signal processing methods (compressed sensing).

MRI Reconstruction

MRI Banding Removal via Adversarial Training

1 code implementation NeurIPS 2020 Aaron Defazio, Tullie Murrell, Michael P. Recht

MRI images reconstructed from sub-sampled Cartesian data using deep learning techniques often show a characteristic banding (sometimes described as streaking), which is particularly strong in low signal-to-noise regions of the reconstructed image.

Advancing machine learning for MR image reconstruction with an open competition: Overview of the 2019 fastMRI challenge

1 code implementation 6 Jan 2020 Florian Knoll, Tullie Murrell, Anuroop Sriram, Nafissa Yakubova, Jure Zbontar, Michael Rabbat, Aaron Defazio, Matthew J. Muckley, Daniel K. Sodickson, C. Lawrence Zitnick, Michael P. Recht

Conclusion: The challenge led to new developments in machine learning for image reconstruction, provided insight into the current state of the art in the field, and highlighted remaining hurdles for clinical adoption.

Image Reconstruction

Scaling Laws for the Principled Design, Initialization, and Preconditioning of ReLU Networks

no code implementations ICLR 2020 Aaron Defazio, Leon Bottou

In this work, we describe a set of rules for the design and initialization of well-conditioned neural networks, guided by the goal of naturally balancing the diagonal blocks of the Hessian at the start of training.

Offset Sampling Improves Deep Learning based Accelerated MRI Reconstructions by Exploiting Symmetry

1 code implementation 2 Dec 2019 Aaron Defazio

Deep learning approaches to accelerated MRI take a matrix of sampled Fourier-space lines as input and produce a spatial image as output.

GrappaNet: Combining Parallel Imaging with Deep Learning for Multi-Coil MRI Reconstruction

1 code implementation CVPR 2020 Anuroop Sriram, Jure Zbontar, Tullie Murrell, C. Lawrence Zitnick, Aaron Defazio, Daniel K. Sodickson

In this paper, we present a novel method to integrate traditional parallel imaging methods into deep neural networks that is able to generate high quality reconstructions even for high acceleration factors.

MRI Reconstruction

Beyond Folklore: A Scaling Calculus for the Design and Initialization of ReLU Networks

no code implementations 10 Jun 2019 Aaron Defazio, Léon Bottou

We propose a system for calculating a "scaling constant" for layers and weights of neural networks.


On the Ineffectiveness of Variance Reduced Optimization for Deep Learning

1 code implementation ICLR 2019 Aaron Defazio, Léon Bottou

The applicability of these techniques to the hard non-convex optimization problems encountered during training of modern deep neural networks is an open problem.

Controlling Covariate Shift using Balanced Normalization of Weights

no code implementations ICLR 2019 Aaron Defazio, Léon Bottou

We introduce a new normalization technique that exhibits the fast convergence properties of batch normalization using a transformation of layer weights instead of layer outputs.

On the Curved Geometry of Accelerated Optimization

no code implementations NeurIPS 2019 Aaron Defazio

In this work we propose a differential geometric motivation for Nesterov's accelerated gradient method (AGM) for strongly-convex problems.

A Simple Practical Accelerated Method for Finite Sums

1 code implementation NeurIPS 2016 Aaron Defazio

We describe a novel optimization method for finite sums (such as empirical risk minimization problems) building on the recently introduced SAGA method.

New Optimisation Methods for Machine Learning

no code implementations 9 Oct 2015 Aaron Defazio

For problems where the structure is known but the parameters unknown, we introduce an approximate maximum likelihood learning algorithm that is capable of learning a useful subclass of Gaussian graphical models.

Online Learning

SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives

5 code implementations NeurIPS 2014 Aaron Defazio, Francis Bach, Simon Lacoste-Julien

In this work we introduce a new optimisation method called SAGA in the spirit of SAG, SDCA, MISO and SVRG, a set of recently proposed incremental gradient algorithms with fast linear convergence rates.
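
For reference, the SAGA update itself is short: take the freshly sampled gradient, subtract its stored past value, and add the average of all stored gradients. The sketch below follows that standard description; the toy least-squares problem and hyperparameters are illustrative choices, not from the paper:

```python
import numpy as np

def saga(grads, x0, lr=0.1, epochs=100, rng=None):
    """Sketch of SAGA for a finite sum (1/n) * sum_i f_i(x).

    grads: list of per-example gradient functions f_i'(x).
    """
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    n = len(grads)
    table = [g(x) for g in grads]       # stored gradient for each example
    avg = np.mean(table, axis=0)        # running average of the table
    for _ in range(epochs * n):
        j = rng.integers(n)
        g_new = grads[j](x)
        x = x - lr * (g_new - table[j] + avg)   # SAGA direction
        avg += (g_new - table[j]) / n           # keep the average in sync
        table[j] = g_new
    return x

# Least squares on two 1-D examples: f_i(x) = 0.5 * (x - a_i)^2.
targets = [1.0, 3.0]
x_hat = saga([lambda x, a=a: x - a for a in targets], x0=0.0)
```

Because the correction term has zero mean, the update is an unbiased gradient estimate whose variance shrinks as the table converges, which is what yields the fast linear rates.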

A Convex Formulation for Learning Scale-Free Networks via Submodular Relaxation

no code implementations NeurIPS 2012 Aaron Defazio, Tibério S. Caetano

We consider the case where the structure of the graph to be reconstructed is known to be scale-free.
