Search Results for author: Jun-Kun Wang

Found 17 papers, 0 papers with code

On Frank-Wolfe and Equilibrium Computation

no code implementations NeurIPS 2017 Jacob D. Abernethy, Jun-Kun Wang

We consider the Frank-Wolfe (FW) method for constrained convex optimization, and we show that this classical technique can be interpreted from a different perspective: FW emerges as the computation of an equilibrium (saddle point) of a special convex-concave zero-sum game.
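
For readers unfamiliar with the method, here is a minimal Frank-Wolfe sketch on a toy instance; the simplex constraint, the quadratic objective, the open-loop step size $\frac{2}{t+2}$, and the function names are illustrative choices, not the paper's setting.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, T=100):
    """Minimal Frank-Wolfe over the probability simplex.

    grad: callable returning the gradient of a smooth convex objective.
    x0:   feasible starting point on the simplex.
    """
    x = x0.copy()
    for t in range(T):
        g = grad(x)
        # Linear minimization oracle over the simplex: the best vertex.
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0
        # Classical open-loop step size 2/(t+2).
        gamma = 2.0 / (t + 2.0)
        x = (1 - gamma) * x + gamma * s
    return x

# Toy usage: minimize ||x - b||^2 over the simplex.
b = np.array([0.1, 0.7, 0.2])
x_hat = frank_wolfe_simplex(lambda x: 2 * (x - b), np.ones(3) / 3)
```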

Faster Rates for Convex-Concave Games

no code implementations 17 May 2018 Jacob Abernethy, Kevin A. Lai, Kfir Y. Levy, Jun-Kun Wang

We consider the use of no-regret algorithms to compute equilibria for particular classes of convex-concave games.

Acceleration through Optimistic No-Regret Dynamics

no code implementations NeurIPS 2018 Jun-Kun Wang, Jacob Abernethy

In this paper we show that the technique can be enhanced to a rate of $O(1/T^2)$ by extending recent work [RS13, SALS15] that leverages optimistic learning to speed up equilibrium computation.
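
The "optimistic" ingredient replaces each player's current gradient with a prediction built from the previous one. Below is a minimal sketch of optimistic gradient descent-ascent on an unconstrained bilinear game; it illustrates the generic optimism trick, not the paper's exact construction, and the step size is an arbitrary illustrative choice.

```python
import numpy as np

def optimistic_gda(A, x0, y0, eta=0.1, T=500):
    """Optimistic gradient descent-ascent for min_x max_y x^T A y.

    Each step uses the optimistic gradient 2*g_t - g_{t-1} in place of g_t.
    Averaged iterates are returned as an approximate equilibrium.
    """
    x, y = x0.copy(), y0.copy()
    gx_prev, gy_prev = A @ y, A.T @ x
    x_avg, y_avg = np.zeros_like(x), np.zeros_like(y)
    for t in range(1, T + 1):
        gx, gy = A @ y, A.T @ x            # current payoff gradients
        x = x - eta * (2 * gx - gx_prev)   # descent step with prediction
        y = y + eta * (2 * gy - gy_prev)   # ascent step with prediction
        gx_prev, gy_prev = gx, gy
        x_avg += (x - x_avg) / t           # running averages of the iterates
        y_avg += (y - y_avg) / t
    return x_avg, y_avg
```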

Revisiting Projection-Free Optimization for Strongly Convex Constraint Sets

no code implementations 14 Nov 2018 Jarrid Rector-Brooks, Jun-Kun Wang, Barzan Mozafari

We also show that, for the general case of (smooth) non-convex functions, FW with line search converges with high probability to a stationary point at a rate of $O\left(\frac{1}{t}\right)$, as long as the constraint set is strongly convex -- one of the fastest convergence rates in non-convex optimization.
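
A minimal sketch of Frank-Wolfe with line search over a Euclidean ball, which is a strongly convex constraint set; the grid search over $[0, 1]$ stands in for the exact line search analyzed in the paper, and all names and defaults are illustrative.

```python
import numpy as np

def lmo_ball(g, radius=1.0):
    """Linear minimization oracle for the Euclidean ball of given radius
    (a strongly convex set): argmin_{||s|| <= radius} <g, s>."""
    return -radius * g / (np.linalg.norm(g) + 1e-12)

def frank_wolfe_line_search(f, grad, lmo, x0, T=100):
    """Frank-Wolfe where the step size is chosen by a simple grid line search."""
    x = x0.copy()
    gammas = np.linspace(0.0, 1.0, 101)
    for _ in range(T):
        s = lmo(grad(x))                                   # FW point on the boundary
        candidates = [(1 - g) * x + g * s for g in gammas]
        x = min(candidates, key=f)                         # best point on the FW segment
    return x
```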

An Optimistic Acceleration of AMSGrad for Nonconvex Optimization

no code implementations ICLR 2020 Jun-Kun Wang, Xiaoyun Li, Belhal Karimi, Ping Li

We propose a new variant of AMSGrad, a popular adaptive-gradient optimization algorithm widely used for training deep neural networks.
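
For reference, a sketch of the base AMSGrad update (Reddi et al., 2018) that the paper modifies; the optimistic step, which incorporates a prediction of the next gradient, is omitted here, and the hyperparameter defaults are illustrative.

```python
import numpy as np

def amsgrad(grad, x0, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, T=1000):
    """Standard AMSGrad: Adam-style moment estimates with a running max of the
    second moment, which keeps the effective step size non-increasing."""
    x = x0.astype(float).copy()
    m = np.zeros_like(x)       # first-moment estimate
    v = np.zeros_like(x)       # second-moment estimate
    v_hat = np.zeros_like(x)   # running maximum of v (the AMSGrad fix)
    for _ in range(T):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        v_hat = np.maximum(v_hat, v)
        x = x - lr * m / (np.sqrt(v_hat) + eps)
    return x
```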

Zeroth Order Optimization by a Mixture of Evolution Strategies

no code implementations 25 Sep 2019 Jun-Kun Wang, Xiaoyun Li, Ping Li

Perhaps the only methods that enjoy convergence guarantees are the ones that sample the perturbed points uniformly from a unit sphere or from a multivariate Gaussian distribution with an isotropic covariance.
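
A minimal sketch of one such scheme: a two-point zeroth-order gradient estimator with isotropic Gaussian perturbations. The sample count and smoothing radius are illustrative, and the estimate is meant to be plugged into an ordinary gradient-descent loop when only function values are available.

```python
import numpy as np

def zo_gradient_estimate(f, x, mu=1e-2, n_samples=20, rng=None):
    """Two-point gradient estimate: average of (f(x+mu*u) - f(x-mu*u)) / (2*mu) * u
    over isotropic Gaussian directions u."""
    rng = np.random.default_rng() if rng is None else rng
    g = np.zeros_like(x, dtype=float)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)   # isotropic Gaussian perturbation direction
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / n_samples
```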

A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network

no code implementations 4 Oct 2020 Jun-Kun Wang, Chi-Heng Lin, Jacob Abernethy

Our result shows that with the appropriate choice of parameters, Polyak's momentum achieves a rate of $(1-\Theta(\frac{1}{\sqrt{\kappa'}}))^t$.

Quickly Finding a Benign Region via Heavy Ball Momentum in Non-Convex Optimization

no code implementations 4 Oct 2020 Jun-Kun Wang, Jacob Abernethy

The Heavy Ball Method, proposed by Polyak over five decades ago, is a first-order method for optimizing continuous functions.
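
A minimal sketch of the Heavy Ball update: a gradient step plus a momentum term that reuses the previous displacement. The step size and momentum parameter below are illustrative defaults, not tied to any guarantee in the paper.

```python
import numpy as np

def heavy_ball(grad, x0, eta=0.01, beta=0.9, T=1000):
    """Polyak's Heavy Ball: x_{t+1} = x_t - eta * grad(x_t) + beta * (x_t - x_{t-1})."""
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(T):
        x_next = x - eta * grad(x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x
```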

Escaping Saddle Points Faster with Stochastic Momentum

no code implementations ICLR 2020 Jun-Kun Wang, Chi-Heng Lin, Jacob Abernethy

At the same time, a widely observed empirical phenomenon is that in training deep networks stochastic momentum appears to significantly improve convergence time; variants of it have flourished in the development of other popular update methods, e.g., ADAM [KB15] and AMSGrad [RKK18].

Tasks: Open-Ended Question Answering, Stochastic Optimization

Understanding Modern Techniques in Optimization: Frank-Wolfe, Nesterov's Momentum, and Polyak's Momentum

no code implementations 23 Jun 2021 Jun-Kun Wang

In the first part of this dissertation research, we develop a modular framework that can serve as a recipe for constructing and analyzing iterative algorithms for convex optimization.

No-Regret Dynamics in the Fenchel Game: A Unified Framework for Algorithmic Convex Optimization

no code implementations 22 Nov 2021 Jun-Kun Wang, Jacob Abernethy, Kfir Y. Levy

We develop an algorithmic framework for solving convex optimization problems using no-regret game dynamics.
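
For context, the Fenchel game of the title pairs the objective with its convex conjugate: assuming $f$ is closed and convex, the zero-sum payoff $g(x, y) = \langle x, y \rangle - f^*(y)$ satisfies $\max_y g(x, y) = f(x)$, so approximating an equilibrium of the game approximately minimizes $f$, and different no-regret strategies for the two players recover different classical algorithms. (This is the standard Fenchel-game construction; see the paper for the precise protocol and weighting schemes.)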

Provable Acceleration of Heavy Ball beyond Quadratics for a Class of Polyak-Łojasiewicz Functions when the Non-Convexity is Averaged-Out

no code implementations 22 Jun 2022 Jun-Kun Wang, Chi-Heng Lin, Andre Wibisono, Bin Hu

The acceleration result for HB beyond quadratics in this work requires an additional condition, which naturally holds when the dimension is one or, more broadly, when the Hessian is diagonal.

Accelerating Hamiltonian Monte Carlo via Chebyshev Integration Time

no code implementations 5 Jul 2022 Jun-Kun Wang, Andre Wibisono

When the potential $f$ is $L$-smooth and $m$-strongly convex, i.e., for sampling from a log-smooth and strongly log-concave target distribution $\pi$, it is known that under a constant integration time, the number of iterations ideal HMC takes to reach $\epsilon$ Wasserstein-2 distance to the target $\pi$ is $O( \kappa \log \frac{1}{\epsilon} )$, where $\kappa := \frac{L}{m}$ is the condition number.
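
For concreteness, a sketch of HMC with a constant integration time. Ideal HMC integrates the Hamiltonian dynamics exactly; the leapfrog discretization below, with no Metropolis correction, is only a numerical stand-in, and the Chebyshev integration-time schedule proposed in the paper is not shown.

```python
import numpy as np

def hmc_constant_time(grad_f, x0, step=0.05, integration_time=1.0, n_iters=1000, rng=None):
    """Approximate HMC for a target density proportional to exp(-f(x)):
    resample Gaussian momentum, then run leapfrog for a fixed integration time."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    n_leapfrog = max(1, int(integration_time / step))
    samples = []
    for _ in range(n_iters):
        p = rng.standard_normal(x.shape)       # fresh momentum each iteration
        p = p - 0.5 * step * grad_f(x)         # leapfrog: initial half step for momentum
        for _ in range(n_leapfrog - 1):
            x = x + step * p
            p = p - step * grad_f(x)
        x = x + step * p
        p = p - 0.5 * step * grad_f(x)         # final half step for momentum
        samples.append(x.copy())
    return np.array(samples)
```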

Towards Understanding GD with Hard and Conjugate Pseudo-labels for Test-Time Adaptation

no code implementations 18 Oct 2022 Jun-Kun Wang, Andre Wibisono

We consider a setting in which a model needs to adapt to a new domain under distribution shifts, given that only unlabeled test samples from the new domain are accessible at test time.
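
A minimal sketch of the hard-pseudo-label side of this setting for a linear binary classifier: the model labels each unlabeled test point with its own sign prediction and then takes ordinary gradient steps on the logistic loss. The model class, loss, and step size are illustrative, and the conjugate pseudo-label case is not shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adapt_with_hard_pseudo_labels(w, X_test, lr=0.1, n_steps=10):
    """Gradient descent on the logistic loss using hard pseudo-labels
    y = sign(w^T x) generated by the current model itself."""
    w = w.copy()
    for _ in range(n_steps):
        scores = X_test @ w
        y_pseudo = np.sign(scores)             # hard pseudo-labels in {-1, 0, +1}
        # Gradient of mean logistic loss log(1 + exp(-y * w^T x)) over the test batch.
        grad = -(X_test.T @ (y_pseudo * sigmoid(-y_pseudo * scores))) / len(X_test)
        w = w - lr * grad
    return w
```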

Tasks: Binary Classification, Test-time Adaptation

Continuized Acceleration for Quasar Convex Functions in Non-Convex Optimization

no code implementations 15 Feb 2023 Jun-Kun Wang, Andre Wibisono

Quasar convexity is a condition that allows some first-order methods to efficiently minimize a function even when the optimization landscape is non-convex.
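
For reference, one common form of the definition (the paper's exact variant may differ, e.g. by adding a strong-convexity parameter): a differentiable $f$ with minimizer $x^*$ is $\gamma$-quasar convex for some $\gamma \in (0, 1]$ if $f(x^*) \ge f(x) + \frac{1}{\gamma} \langle \nabla f(x), x^* - x \rangle$ for all $x$. Convex functions satisfy this with $\gamma = 1$, while the condition also admits certain non-convex landscapes.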
