This work proposes MF-Global, a fast solver for finding global optima of nonconvex low-rank matrix factorization (MF) problems.
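The exact formulation solved by MF-Global is not reproduced here; as a point of reference only, a common factorized (Burer-Monteiro-type) form of nonconvex low-rank MF problems, written with generic placeholder symbols, is

  \min_{W \in \mathbb{R}^{m \times k},\ H \in \mathbb{R}^{n \times k}} \; f(W H^{\top}) + \frac{\lambda}{2}\left(\|W\|_F^2 + \|H\|_F^2\right),

where f is a smooth loss such as the squared error over observed entries; the explicit rank-k factorization is what makes the problem nonconvex even when f itself is convex.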
This paper proposes an algorithm, RMDA, for training neural networks (NNs) with a regularization term that promotes desired structures.
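RMDA's own update is not spelled out here; as a generic, hypothetical illustration of what training with a structure-promoting regularizer can look like, the sketch below applies proximal stochastic gradient steps with an L1 penalty, whose proximal operator (soft-thresholding) drives many weights exactly to zero, i.e., a sparse structure. This is plain proximal SGD, not RMDA, and all names and hyperparameters are illustrative.

    import torch

    def prox_l1(w, threshold):
        # Soft-thresholding: the proximal operator of threshold * ||w||_1.
        return torch.sign(w) * torch.clamp(w.abs() - threshold, min=0.0)

    def proximal_sgd_step(model, loss, lr=0.1, lam=1e-4):
        # One step: stochastic gradient step on the loss, followed by a
        # proximal step on the regularizer, which zeroes out small weights.
        loss.backward()
        with torch.no_grad():
            for p in model.parameters():
                p -= lr * p.grad
                p.copy_(prox_l1(p, lr * lam))
                p.grad.zero_()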
Stochastic gradient descent with momentum (SGD+M) is widely used to improve the empirical convergence behavior and generalization performance of plain stochastic gradient descent (SGD) in training deep learning models, but our theoretical understanding of SGD+M remains very limited.
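For reference, one standard form of the SGD+M update (the convention used by common deep learning frameworks; the notation here is generic and may differ from the paper's) is

  m_{t+1} = \beta m_t + g_t, \qquad x_{t+1} = x_t - \eta m_{t+1},

where g_t is a stochastic gradient at the iterate x_t, \beta \in [0, 1) is the momentum parameter, and \eta is the step size; setting \beta = 0 recovers plain SGD.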
We show that for a wide class of degenerate solutions, ISQA+ possesses superlinear convergence not only in iterations but also in running time, because the cost per iteration is bounded.
When applied to the distributed dual ERM problem, unlike the state of the art, which uses only the block-diagonal part of the Hessian, our approach is able to utilize global curvature information and is thus orders of magnitude faster.
Initial computational results on convex problems demonstrate that our method significantly improves upon current state-of-the-art methods in both communication cost and running time.
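To make the block-diagonal remark concrete, consider one standard form of the L2-regularized ERM dual (scaling conventions vary across papers, and this may not be the exact formulation used here): with a data matrix X whose columns are the training points and twice-differentiable conjugates \phi_i^*,

  D(\alpha) = \frac{1}{n}\sum_{i=1}^{n} \phi_i^*(-\alpha_i) + \frac{1}{2\lambda n^2}\|X\alpha\|^2,
  \qquad
  \nabla^2 D(\alpha) = \frac{1}{n}\,\mathrm{diag}\big((\phi_i^*)''(-\alpha_i)\big) + \frac{1}{\lambda n^2} X^{\top} X.

When the columns of X are partitioned across machines, keeping only the within-machine blocks of X^{\top} X yields a block-diagonal Hessian approximation that requires no communication, whereas the full X^{\top} X couples coordinates across machines and is what is meant here by global curvature information.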
In this document, we show that the algorithm CoCoA+ (Ma et al., ICML 2015), under the setting used in their experiments, which is also the best setting suggested by its authors, is equivalent to the practical variant of DisDCA (Yang, NIPS 2013).