no code implementations • 7 Jun 2024 • Si Yi Meng, Antonio Orvieto, Daniel Yiming Cao, Christopher De Sa
In one dimension, we show that a step size less than $1/\lambda$ suffices for global convergence.
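As a rough illustration of the setting only (not the paper's analysis), the sketch below runs plain gradient descent on a one-dimensional logistic regression problem with a fixed step size. The names `logistic_loss_grad` and `gradient_descent`, the synthetic data, and the concrete step value are illustrative placeholders; the threshold $1/\lambda$ in the paper is problem-dependent and is not computed here.

```python
import numpy as np

def logistic_loss_grad(w, X, y):
    """Gradient of the averaged logistic loss for 1-D features X and labels y in {-1, +1}."""
    z = -y * (X * w)                    # negative margins
    sigma = 1.0 / (1.0 + np.exp(-z))    # sigmoid of the negative margin
    return np.mean(-y * X * sigma)

def gradient_descent(X, y, step, iters=1000, w0=0.0):
    """Fixed-step gradient descent on the 1-D logistic loss (illustrative step size)."""
    w = w0
    for _ in range(iters):
        w -= step * logistic_loss_grad(w, X, y)
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=200)
y = np.sign(X + 0.5 * rng.normal(size=200))   # noisy labels -> non-separable 1-D data
w_hat = gradient_descent(X, y, step=0.5)       # placeholder for a step size below 1/lambda
```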
no code implementations • 27 May 2023 • Si Yi Meng, Robert M. Gower
We develop a variant of the stochastic prox-linear method for minimizing the Conditional Value-at-Risk (CVaR) objective.
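The paper's prox-linear variant is not reproduced here. As a simpler point of reference, the sketch below applies a plain stochastic subgradient method to the Rockafellar-Uryasev formulation of the CVaR objective, $\min_{x,\rho}\ \rho + \tfrac{1}{1-\alpha}\,\mathbb{E}[(\ell(x;\xi)-\rho)_+]$, using squared losses; `cvar_sgd` and its hyperparameter values are hypothetical.

```python
import numpy as np

def cvar_sgd(A, b, alpha=0.9, step=0.01, epochs=20, seed=0):
    """Stochastic subgradient method on the Rockafellar-Uryasev CVaR objective
    with squared losses loss_i(x) = (a_i^T x - b_i)^2 (baseline, not prox-linear)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x, rho = np.zeros(d), 0.0
    for _ in range(epochs):
        for i in rng.permutation(n):
            resid = A[i] @ x - b[i]
            loss = resid ** 2
            active = float(loss > rho)              # indicator of the (.)_+ part
            gx = (active / (1 - alpha)) * 2 * resid * A[i]   # subgradient in x
            grho = 1.0 - active / (1 - alpha)                # subgradient in rho
            x -= step * gx
            rho -= step * grho
    return x, rho
```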
no code implementations • ICLR 2022 • Yucheng Lu, Si Yi Meng, Christopher De Sa
In this paper, we develop a broad condition on the sequence of examples used by SGD that is sufficient to prove tight convergence rates in both strongly convex and non-convex settings.
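The exact condition from the paper is not restated here. The sketch below simply shows SGD driven by a pluggable example ordering, with without-replacement shuffling (random reshuffling) as one common ordering that such analyses cover; `sgd_with_ordering`, `grad_i`, and `order_fn` are hypothetical names.

```python
import numpy as np

def sgd_with_ordering(grad_i, x0, n, step, epochs, order_fn, seed=0):
    """Run SGD where each epoch visits the n examples in the order produced by
    order_fn(rng, n); grad_i(x, i) returns the gradient of example i at x."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(epochs):
        for i in order_fn(rng, n):
            x -= step * grad_i(x, i)
    return x

# One example ordering: without-replacement shuffling (random reshuffling).
shuffle_order = lambda rng, n: rng.permutation(n)
```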
no code implementations • 28 Sep 2020 • Sharan Vaswani, Issam H. Laradji, Frederik Kunstner, Si Yi Meng, Mark Schmidt, Simon Lacoste-Julien
Under an interpolation assumption, we prove that AMSGrad with a constant step-size and momentum can converge to the minimizer at the faster $O(1/T)$ rate for smooth, convex functions.
1 code implementation • 11 Jun 2020 • Sharan Vaswani, Issam Laradji, Frederik Kunstner, Si Yi Meng, Mark Schmidt, Simon Lacoste-Julien
In this setting, we prove that AMSGrad with constant step-size and momentum converges to the minimizer at a faster $O(1/T)$ rate.
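For context on the method analyzed in these two entries, the sketch below is a generic AMSGrad loop with a constant step size and momentum, following the standard update (first-moment estimate, second-moment estimate, running maximum). The function name `amsgrad` and the hyperparameter values are illustrative and not taken from the papers.

```python
import numpy as np

def amsgrad(grad, x0, step=1e-2, beta1=0.9, beta2=0.99, eps=1e-8, iters=1000):
    """Generic AMSGrad with constant step size; grad(x, t) returns a (stochastic) gradient."""
    x = np.array(x0, dtype=float)
    m = np.zeros_like(x)        # first-moment (momentum) estimate
    v = np.zeros_like(x)        # second-moment estimate
    v_hat = np.zeros_like(x)    # running max of v (the AMSGrad correction)
    for t in range(iters):
        g = grad(x, t)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        v_hat = np.maximum(v_hat, v)
        x -= step * m / (np.sqrt(v_hat) + eps)
    return x
```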
no code implementations • 29 Nov 2019 • Zicong Fan, Si Yi Meng, Leonid Sigal, James J. Little
The problem of language grounding has attracted much attention in recent years due to its pivotal role in more general image-lingual high-level reasoning tasks (e.g., image captioning, VQA).
1 code implementation • 11 Oct 2019 • Si Yi Meng, Sharan Vaswani, Issam Laradji, Mark Schmidt, Simon Lacoste-Julien
Under this condition, we show that the regularized subsampled Newton method (R-SSN) achieves global linear convergence with an adaptive step-size and a constant batch-size.
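A minimal sketch of a single regularized subsampled Newton (R-SSN) step is given below, assuming a subsampled Hessian regularized by $\tau I$. The paper's adaptive step-size rule is replaced by a fixed `step`, and `r_ssn_step`, `grad_fn`, and `hess_fn` are hypothetical names.

```python
import numpy as np

def r_ssn_step(x, grad_fn, hess_fn, batch, tau=1e-3, step=1.0):
    """One regularized subsampled Newton step:
    x_next = x - step * (H_S + tau * I)^{-1} g,
    where H_S = hess_fn(x, batch) is the Hessian on the subsample and g = grad_fn(x, batch)."""
    g = grad_fn(x, batch)
    H = hess_fn(x, batch)
    d = np.linalg.solve(H + tau * np.eye(len(x)), g)
    return x - step * d
```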