no code implementations • 25 Sep 2019 • Kin Gutierrez, Cristian Challu, Jin Li, Artur Dubrawski
Adaptive moment methods have been remarkably successful for optimization in the presence of high-dimensional or sparse gradients. In parallel, adaptive sampling probabilities for SGD have allowed optimizers to improve convergence rates by prioritizing the examples from which to learn efficiently.
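The two ideas named in the abstract can be combined in a single update loop: Adam-style moment estimates applied to gradients drawn with per-example sampling probabilities, reweighted for unbiasedness. The sketch below illustrates that combination on least-squares regression; it is an assumption-laden illustration, not the paper's exact DASGrad algorithm, and the mixing with a uniform distribution is a common variance-control choice added here for stability.

```python
import numpy as np

def adaptive_sampling_adam(X, y, steps=5000, lr=0.02,
                           beta1=0.9, beta2=0.999, eps=1e-8):
    """Illustrative sketch: Adam-style moments combined with adaptive
    per-example sampling probabilities (importance sampling).
    NOT the paper's exact DASGrad method."""
    n, d = X.shape
    w = np.zeros(d)
    m = np.zeros(d)        # first-moment estimate (Adam-style)
    v = np.zeros(d)        # second-moment estimate (Adam-style)
    g_norms = np.ones(n)   # running per-example gradient-norm estimates
    for t in range(1, steps + 1):
        # Mix the adaptive distribution with uniform so importance
        # weights stay bounded (weight <= 2).
        p = 0.5 * g_norms / g_norms.sum() + 0.5 / n
        i = np.random.choice(n, p=p)
        g = (X[i] @ w - y[i]) * X[i]          # per-example squared-loss gradient
        g_norms[i] = np.linalg.norm(g) + eps  # refresh this example's norm estimate
        g = g / (n * p[i])                    # importance weight keeps estimate unbiased
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g**2
        m_hat = m / (1 - beta1**t)            # bias-corrected moments
        v_hat = v / (1 - beta2**t)
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return w
```

On a small noiseless regression problem, the recovered weights approach the generating weights, with harder (larger-residual) examples sampled more often as training proceeds.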
no code implementations • 6 Nov 2018 • Kin Gutierrez, Jin Li, Cristian Challu, Artur Dubrawski
We observe that the benefits of DASGrad increase with the model complexity and the variability of the gradients, and we explore the resulting utility in extensions of distribution-matching multitask learning.