Search Results for author: Si Yi Meng

Found 7 papers, 2 papers with code

Gradient Descent on Logistic Regression with Non-Separable Data and Large Step Sizes

no code implementations • 7 Jun 2024 • Si Yi Meng, Antonio Orvieto, Daniel Yiming Cao, Christopher De Sa

In one dimension, we show that a step size less than $1/\lambda$ suffices for global convergence.

A Model-Based Method for Minimizing CVaR and Beyond

no code implementations • 27 May 2023 • Si Yi Meng, Robert M. Gower

We develop a variant of the stochastic prox-linear method for minimizing the Conditional Value-at-Risk (CVaR) objective.

A General Analysis of Example-Selection for Stochastic Gradient Descent

no code implementations • ICLR 2022 • Yucheng Lu, Si Yi Meng, Christopher De Sa

In this paper, we develop a broad condition on the sequence of examples used by SGD that is sufficient to prove tight convergence rates in both strongly convex and non-convex settings.

Data Augmentation

Adaptive Gradient Methods Converge Faster with Over-Parameterization (and you can do a line-search)

no code implementations • 28 Sep 2020 • Sharan Vaswani, Issam H. Laradji, Frederik Kunstner, Si Yi Meng, Mark Schmidt, Simon Lacoste-Julien

Under an interpolation assumption, we prove that AMSGrad with a constant step-size and momentum can converge to the minimizer at the faster $O(1/T)$ rate for smooth, convex functions.

Binary Classification

OptiBox: Breaking the Limits of Proposals for Visual Grounding

no code implementations • 29 Nov 2019 • Zicong Fan, Si Yi Meng, Leonid Sigal, James J. Little

The problem of language grounding has attracted much attention in recent years due to its pivotal role in more general image-lingual high-level reasoning tasks (e.g., image captioning, VQA).

Image Captioning, Visual Grounding +1

Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation

1 code implementation • 11 Oct 2019 • Si Yi Meng, Sharan Vaswani, Issam Laradji, Mark Schmidt, Simon Lacoste-Julien

Under the interpolation condition, we show that the regularized subsampled Newton method (R-SSN) achieves global linear convergence with an adaptive step-size and a constant batch-size.

Binary Classification, Second-order methods
