Search Results for author: Jun-Kun Wang

Found 17 papers, 0 papers with code

On Frank-Wolfe and Equilibrium Computation

no code implementations NeurIPS 2017 Jacob D. Abernethy, Jun-Kun Wang

We consider the Frank-Wolfe (FW) method for constrained convex optimization, and we show that this classical technique can be interpreted from a different perspective: FW emerges as the computation of an equilibrium (saddle point) of a special convex-concave zero-sum game.
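
For readers unfamiliar with the method, here is a minimal Frank-Wolfe sketch on a toy instance; the simplex constraint, the quadratic objective, the open-loop step size $\frac{2}{t+2}$, and the function names are illustrative choices, not the paper's setting.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, T=100):
    """Minimal Frank-Wolfe over the probability simplex.

    grad: callable returning the gradient of a smooth convex objective.
    x0:   feasible starting point on the simplex.
    """
    x = x0.copy()
    for t in range(T):
        g = grad(x)
        # Linear minimization oracle over the simplex: the best vertex.
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0
        # Classical open-loop step size 2/(t+2).
        gamma = 2.0 / (t + 2.0)
        x = (1 - gamma) * x + gamma * s
    return x

# Toy usage: minimize ||x - b||^2 over the simplex.
b = np.array([0.1, 0.7, 0.2])
x_hat = frank_wolfe_simplex(lambda x: 2 * (x - b), np.ones(3) / 3)
```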

Faster Rates for Convex-Concave Games

no code implementations 17 May 2018 Jacob Abernethy, Kevin A. Lai, Kfir Y. Levy, Jun-Kun Wang

We consider the use of no-regret algorithms to compute equilibria for particular classes of convex-concave games.

Acceleration through Optimistic No-Regret Dynamics

no code implementations NeurIPS 2018 Jun-Kun Wang, Jacob Abernethy

In this paper we show that the technique can be enhanced to a rate of $O(1/T^2)$ by extending recent work [RS13, SALS15] that leverages optimistic learning to speed up equilibrium computation.
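
The "optimistic" ingredient replaces each player's current gradient with a prediction built from the previous one. Below is a minimal sketch of optimistic gradient descent-ascent on an unconstrained bilinear game; it illustrates the generic optimism trick, not the paper's exact construction, and the step size is an arbitrary illustrative choice.

```python
import numpy as np

def optimistic_gda(A, x0, y0, eta=0.1, T=500):
    """Optimistic gradient descent-ascent for min_x max_y x^T A y.

    Each step uses the optimistic gradient 2*g_t - g_{t-1} in place of g_t.
    Averaged iterates are returned as an approximate equilibrium.
    """
    x, y = x0.copy(), y0.copy()
    gx_prev, gy_prev = A @ y, A.T @ x
    x_avg, y_avg = np.zeros_like(x), np.zeros_like(y)
    for t in range(1, T + 1):
        gx, gy = A @ y, A.T @ x            # current payoff gradients
        x = x - eta * (2 * gx - gx_prev)   # descent step with prediction
        y = y + eta * (2 * gy - gy_prev)   # ascent step with prediction
        gx_prev, gy_prev = gx, gy
        x_avg += (x - x_avg) / t           # running averages of the iterates
        y_avg += (y - y_avg) / t
    return x_avg, y_avg
```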

Revisiting Projection-Free Optimization for Strongly Convex Constraint Sets

no code implementations 14 Nov 2018 Jarrid Rector-Brooks, Jun-Kun Wang, Barzan Mozafari

We also show that, for the general case of (smooth) non-convex functions, FW with line search converges with high probability to a stationary point at a rate of $O\left(\frac{1}{t}\right)$, as long as the constraint set is strongly convex -- one of the fastest convergence rates in non-convex optimization.
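
A minimal sketch of Frank-Wolfe with line search over a Euclidean ball, which is a strongly convex constraint set; the grid search over $[0, 1]$ stands in for the exact line search analyzed in the paper, and all names and defaults are illustrative.

```python
import numpy as np

def lmo_ball(g, radius=1.0):
    """Linear minimization oracle for the Euclidean ball of given radius
    (a strongly convex set): argmin_{||s|| <= radius} <g, s>."""
    return -radius * g / (np.linalg.norm(g) + 1e-12)

def frank_wolfe_line_search(f, grad, lmo, x0, T=100):
    """Frank-Wolfe where the step size is chosen by a simple grid line search."""
    x = x0.copy()
    gammas = np.linspace(0.0, 1.0, 101)
    for _ in range(T):
        s = lmo(grad(x))                                   # FW point on the boundary
        candidates = [(1 - g) * x + g * s for g in gammas]
        x = min(candidates, key=f)                         # best point on the FW segment
    return x
```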

An Optimistic Acceleration of AMSGrad for Nonconvex Optimization

no code implementations ICLR 2020 Jun-Kun Wang, Xiaoyun Li, Belhal Karimi, Ping Li

We propose a new variant of AMSGrad, a popular adaptive-gradient optimization algorithm widely used for training deep neural networks.
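
For reference, a sketch of the base AMSGrad update (Reddi et al., 2018) that the paper modifies; the optimistic step, which incorporates a prediction of the next gradient, is omitted here, and the hyperparameter defaults are illustrative.

```python
import numpy as np

def amsgrad(grad, x0, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, T=1000):
    """Standard AMSGrad: Adam-style moment estimates with a running max of the
    second moment, which keeps the effective step size non-increasing."""
    x = x0.astype(float).copy()
    m = np.zeros_like(x)       # first-moment estimate
    v = np.zeros_like(x)       # second-moment estimate
    v_hat = np.zeros_like(x)   # running maximum of v (the AMSGrad fix)
    for _ in range(T):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        v_hat = np.maximum(v_hat, v)
        x = x - lr * m / (np.sqrt(v_hat) + eps)
    return x
```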

Zeroth Order Optimization by a Mixture of Evolution Strategies

no code implementations 25 Sep 2019 Jun-Kun Wang, Xiaoyun Li, Ping Li

Perhaps the only methods that enjoy convergence guarantees are the ones that sample the perturbed points uniformly from a unit sphere or from a multivariate Gaussian distribution with an isotropic covariance.
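
A minimal sketch of one such scheme: a two-point zeroth-order gradient estimator with isotropic Gaussian perturbations. The sample count and smoothing radius are illustrative, and the estimate is meant to be plugged into an ordinary gradient-descent loop when only function values are available.

```python
import numpy as np

def zo_gradient_estimate(f, x, mu=1e-2, n_samples=20, rng=None):
    """Two-point gradient estimate: average of (f(x+mu*u) - f(x-mu*u)) / (2*mu) * u
    over isotropic Gaussian directions u."""
    rng = np.random.default_rng() if rng is None else rng
    g = np.zeros_like(x, dtype=float)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)   # isotropic Gaussian perturbation direction
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / n_samples
```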

A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network

no code implementations 4 Oct 2020 Jun-Kun Wang, Chi-Heng Lin, Jacob Abernethy

Our result shows that with the appropriate choice of parameters, Polyak's momentum achieves a rate of $(1-\Theta(\frac{1}{\sqrt{\kappa'}}))^t$.

Quickly Finding a Benign Region via Heavy Ball Momentum in Non-Convex Optimization

no code implementations 4 Oct 2020 Jun-Kun Wang, Jacob Abernethy

The Heavy Ball Method, proposed by Polyak over five decades ago, is a first-order method for optimizing continuous functions.
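
A minimal sketch of the Heavy Ball update: a gradient step plus a momentum term that reuses the previous displacement. The step size and momentum parameter below are illustrative defaults, not tied to any guarantee in the paper.

```python
import numpy as np

def heavy_ball(grad, x0, eta=0.01, beta=0.9, T=1000):
    """Polyak's Heavy Ball: x_{t+1} = x_t - eta * grad(x_t) + beta * (x_t - x_{t-1})."""
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(T):
        x_next = x - eta * grad(x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x
```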

Escaping Saddle Points Faster with Stochastic Momentum

no code implementations ICLR 2020 Jun-Kun Wang, Chi-Heng Lin, Jacob Abernethy

At the same time, a widely observed empirical phenomenon is that in training deep networks stochastic momentum appears to significantly improve convergence time; variants of it have flourished in the development of other popular update methods, e.g., ADAM [KB15] and AMSGrad [RKK18].

Tasks: Open-Ended Question Answering, Stochastic Optimization

Understanding Modern Techniques in Optimization: Frank-Wolfe, Nesterov's Momentum, and Polyak's Momentum

no code implementations 23 Jun 2021 Jun-Kun Wang

In the first part of this dissertation research, we develop a modular framework that can serve as a recipe for constructing and analyzing iterative algorithms for convex optimization.

No-Regret Dynamics in the Fenchel Game: A Unified Framework for Algorithmic Convex Optimization

no code implementations 22 Nov 2021 Jun-Kun Wang, Jacob Abernethy, Kfir Y. Levy

We develop an algorithmic framework for solving convex optimization problems using no-regret game dynamics.
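
For context, the Fenchel game of the title pairs the objective with its convex conjugate: assuming $f$ is closed and convex, the zero-sum payoff $g(x, y) = \langle x, y \rangle - f^*(y)$ satisfies $\max_y g(x, y) = f(x)$, so approximating an equilibrium of the game approximately minimizes $f$, and different no-regret strategies for the two players recover different classical algorithms. (This is the standard Fenchel-game construction; see the paper for the precise protocol and weighting schemes.)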

Provable Acceleration of Heavy Ball beyond Quadratics for a Class of Polyak-Łojasiewicz Functions when the Non-Convexity is Averaged-Out

no code implementations 22 Jun 2022 Jun-Kun Wang, Chi-Heng Lin, Andre Wibisono, Bin Hu

The acceleration result for HB beyond quadratics in this work requires an additional condition, which naturally holds when the dimension is one or, more broadly, when the Hessian is diagonal.

Accelerating Hamiltonian Monte Carlo via Chebyshev Integration Time

no code implementations 5 Jul 2022 Jun-Kun Wang, Andre Wibisono

When the potential $f$ is $L$-smooth and $m$-strongly convex, i.e., for sampling from a log-smooth and strongly log-concave target distribution $\pi$, it is known that under a constant integration time, the number of iterations ideal HMC takes to reach $\epsilon$ Wasserstein-2 distance to the target $\pi$ is $O( \kappa \log \frac{1}{\epsilon} )$, where $\kappa := \frac{L}{m}$ is the condition number.
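
For concreteness, a sketch of HMC with a constant integration time. Ideal HMC integrates the Hamiltonian dynamics exactly; the leapfrog discretization below, with no Metropolis correction, is only a numerical stand-in, and the Chebyshev integration-time schedule proposed in the paper is not shown.

```python
import numpy as np

def hmc_constant_time(grad_f, x0, step=0.05, integration_time=1.0, n_iters=1000, rng=None):
    """Approximate HMC for a target density proportional to exp(-f(x)):
    resample Gaussian momentum, then run leapfrog for a fixed integration time."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    n_leapfrog = max(1, int(integration_time / step))
    samples = []
    for _ in range(n_iters):
        p = rng.standard_normal(x.shape)       # fresh momentum each iteration
        p = p - 0.5 * step * grad_f(x)         # leapfrog: initial half step for momentum
        for _ in range(n_leapfrog - 1):
            x = x + step * p
            p = p - step * grad_f(x)
        x = x + step * p
        p = p - 0.5 * step * grad_f(x)         # final half step for momentum
        samples.append(x.copy())
    return np.array(samples)
```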

Towards Understanding GD with Hard and Conjugate Pseudo-labels for Test-Time Adaptation

no code implementations 18 Oct 2022 Jun-Kun Wang, Andre Wibisono

We consider a setting in which a model needs to adapt to a new domain under distribution shifts, given that only unlabeled test samples from the new domain are accessible at test time.
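
A minimal sketch of the hard-pseudo-label side of this setting for a linear binary classifier: the model labels each unlabeled test point with its own sign prediction and then takes ordinary gradient steps on the logistic loss. The model class, loss, and step size are illustrative, and the conjugate pseudo-label case is not shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adapt_with_hard_pseudo_labels(w, X_test, lr=0.1, n_steps=10):
    """Gradient descent on the logistic loss using hard pseudo-labels
    y = sign(w^T x) generated by the current model itself."""
    w = w.copy()
    for _ in range(n_steps):
        scores = X_test @ w
        y_pseudo = np.sign(scores)             # hard pseudo-labels in {-1, 0, +1}
        # Gradient of mean logistic loss log(1 + exp(-y * w^T x)) over the test batch.
        grad = -(X_test.T @ (y_pseudo * sigmoid(-y_pseudo * scores))) / len(X_test)
        w = w - lr * grad
    return w
```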

Tasks: Binary Classification, Test-time Adaptation

Continuized Acceleration for Quasar Convex Functions in Non-Convex Optimization

no code implementations 15 Feb 2023 Jun-Kun Wang, Andre Wibisono

Quasar convexity is a condition that allows some first-order methods to efficiently minimize a function even when the optimization landscape is non-convex.
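
For reference, one common form of the definition (the paper's exact variant may differ, e.g. by adding a strong-convexity parameter): a differentiable $f$ with minimizer $x^*$ is $\gamma$-quasar convex for some $\gamma \in (0, 1]$ if $f(x^*) \ge f(x) + \frac{1}{\gamma} \langle \nabla f(x), x^* - x \rangle$ for all $x$. Convex functions satisfy this with $\gamma = 1$, while the condition also admits certain non-convex landscapes.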
