no code implementations • 26 May 2023 • Ketaki Joshi, Raghavendra Pradyumna Pothukuchi, Andre Wibisono, Abhishek Bhattacharjee
Compared to state-of-the-art weight regularization methods for mitigating catastrophic forgetting, our approach is simple, effective, and enables faster learning.
no code implementations • 15 Feb 2023 • Jun-Kun Wang, Andre Wibisono
Quasar convexity is a condition that allows some first-order methods to efficiently minimize a function even when the optimization landscape is non-convex.
no code implementations • 2 Nov 2022 • Kaylee Yingxi Yang, Andre Wibisono
We study the Inexact Langevin Dynamics (ILD), Inexact Langevin Algorithm (ILA), and Score-based Generative Modeling (SGM) when utilizing estimated score functions for sampling.
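For intuition, here is a minimal sketch of a Langevin-type sampler driven by an estimated score function $s_\theta(x) \approx \nabla \log \pi(x)$ in place of the exact score; the placeholder score model, the Gaussian target, and the step size are illustrative assumptions, not the setup of the paper.

import numpy as np

rng = np.random.default_rng(0)

def estimated_score(x, theta=0.9):
    # Placeholder score estimate s_theta(x) ~ grad log pi(x). For a standard
    # Gaussian target the exact score is -x; the factor theta mimics score
    # estimation error.
    return -theta * x

def inexact_langevin(x0, step_size=0.01, n_iters=5000):
    # Langevin iteration with the estimated score plugged in:
    # x_{k+1} = x_k + eta * s_theta(x_k) + sqrt(2*eta) * z_k,  z_k ~ N(0, I).
    x = np.array(x0, dtype=float)
    samples = []
    for _ in range(n_iters):
        z = rng.standard_normal(x.shape)
        x = x + step_size * estimated_score(x) + np.sqrt(2.0 * step_size) * z
        samples.append(x.copy())
    return np.array(samples)

samples = inexact_langevin(np.zeros(2))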
no code implementations • 28 Oct 2022 • Ryan Yang, Haizhou Du, Andre Wibisono, Patrick Baker
Distributed machine learning (DML) can be an important capability for a modern military to take advantage of data and devices distributed across multiple vantage points to adapt and learn.
no code implementations • 18 Oct 2022 • Jun-Kun Wang, Andre Wibisono
We consider a setting in which a model needs to adapt to a new domain under distribution shift, given that only unlabeled test samples from the new domain are accessible at test time.
no code implementations • 5 Jul 2022 • Jun-Kun Wang, Andre Wibisono
When the potential $f$ is $L$-smooth and $m$-strongly convex, i.e., when sampling from a log-smooth and strongly log-concave target distribution $\pi$, it is known that, under a constant integration time, the number of iterations that ideal HMC takes to reach $\epsilon$ Wasserstein-2 distance to the target $\pi$ is $O( \kappa \log \frac{1}{\epsilon} )$, where $\kappa := \frac{L}{m}$ is the condition number.
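As a rough numerical sketch, the following runs HMC with a constant integration time, using leapfrog steps to approximate the Hamiltonian flow (ideal HMC in the paper uses the exact flow); the quadratic potential, step size, and integration time are illustrative assumptions.

import numpy as np

def grad_f(x, L=4.0, m=1.0):
    # Gradient of the illustrative potential f(x) = 0.5*(L*x1^2 + m*x2^2),
    # which is L-smooth and m-strongly convex (condition number kappa = L/m).
    return np.array([L * x[0], m * x[1]])

def hmc_step(x, rng, step_size=0.05, integration_time=1.0):
    # One HMC iteration: resample momentum, then approximate the Hamiltonian
    # flow over a fixed integration time T with leapfrog steps.
    n_steps = max(1, int(round(integration_time / step_size)))
    p = rng.standard_normal(x.shape)
    x_new = x.copy()
    p = p - 0.5 * step_size * grad_f(x_new)       # half momentum step
    for i in range(n_steps):
        x_new = x_new + step_size * p             # full position step
        if i < n_steps - 1:
            p = p - step_size * grad_f(x_new)     # full momentum step
    p = p - 0.5 * step_size * grad_f(x_new)       # final half momentum step
    return x_new                                  # no Metropolis correction in this sketch

rng = np.random.default_rng(0)
x = np.zeros(2)
samples = [x]
for _ in range(1000):
    x = hmc_step(x, rng)
    samples.append(x)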
no code implementations • 22 Jun 2022 • Jun-Kun Wang, Chi-Heng Lin, Andre Wibisono, Bin Hu
In this work, the acceleration result for the Heavy Ball method (HB) beyond quadratics requires an additional condition to be satisfied, which holds naturally when the dimension is one or, more broadly, when the Hessian is diagonal.
no code implementations • 8 Jun 2022 • Andre Wibisono, Molei Tao, Georgios Piliouras
In this paper we study two-player bilinear zero-sum games with constrained strategy spaces.
no code implementations • 13 Feb 2022 • Yongxin Chen, Sinho Chewi, Adil Salim, Andre Wibisono
We study the proximal sampler of Lee, Shen, and Tian (2021) and obtain new convergence guarantees under weaker assumptions than strong log-concavity: namely, our results hold for (1) weakly log-concave targets, and (2) targets satisfying isoperimetric assumptions which allow for non-log-concavity.
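As a rough illustration of the alternating structure of the proximal sampler, here is a sketch specialized to a standard Gaussian target, for which the second step (sampling from the conditional, i.e., the restricted Gaussian oracle) has a closed form; the target and step size are illustrative assumptions, and the general algorithm requires an implementation of that oracle.

import numpy as np

rng = np.random.default_rng(0)
eta = 0.5        # step size of the proximal sampler (illustrative)
d = 2            # dimension

# Illustrative target: pi(x) proportional to exp(-f(x)) with f(x) = 0.5*||x||^2
# (standard Gaussian), chosen so the restricted Gaussian oracle below is closed-form.
x = np.zeros(d)
samples = []
for _ in range(2000):
    # Step 1: forward step, y ~ N(x, eta * I).
    y = x + np.sqrt(eta) * rng.standard_normal(d)
    # Step 2: restricted Gaussian oracle, x ~ pi(x | y) proportional to
    # exp(-f(x) - ||x - y||^2 / (2*eta)). For the Gaussian f above this is
    # N(y / (1 + eta), eta / (1 + eta) * I).
    mean = y / (1.0 + eta)
    std = np.sqrt(eta / (1.0 + eta))
    x = mean + std * rng.standard_normal(d)
    samples.append(x)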
no code implementations • 29 Jan 2022 • Haizhou Du, Ryan Yang, Yijian Chen, Qiao Xiang, Andre Wibisono, Wei Huang
In this paper, we analyze properties of the WPM and rigorously prove convergence properties of our aggregation mechanism.
no code implementations • 24 Sep 2021 • Ruilin Li, Molei Tao, Santosh S. Vempala, Andre Wibisono
The Mirror Langevin Diffusion (MLD) is a sampling analogue of mirror flow in continuous time, and it enjoys favorable convergence properties under log-Sobolev or Poincaré inequalities relative to the Hessian metric, as shown by Chewi et al. (2020).
no code implementations • 4 Nov 2019 • Andre Wibisono
We study the Proximal Langevin Algorithm (PLA) for sampling from a probability distribution $\nu = e^{-f}$ on $\mathbb{R}^n$ under isoperimetry.
no code implementations • ICLR 2020 • Jacob Abernethy, Kevin A. Lai, Andre Wibisono
While classic work in convex-concave min-max optimization relies on average-iterate convergence results, the emergence of nonconvex applications such as training Generative Adversarial Networks has led to renewed interest in last-iterate convergence guarantees.
no code implementations • NeurIPS 2019 • Santosh S. Vempala, Andre Wibisono
We also prove convergence guarantees in Rényi divergence of order $q > 1$ assuming the limit of ULA satisfies either the log-Sobolev or Poincaré inequality.
1 code implementation • NeurIPS 2019 • Ashia Wilson, Lester Mackey, Andre Wibisono
We also introduce a new first-order algorithm, called rescaled gradient descent (RGD), and show that RGD achieves a faster convergence rate than gradient descent provided the function is strongly smooth -- a natural generalization of the standard smoothness assumption on the objective function.
no code implementations • 22 Feb 2018 • Andre Wibisono
We show that SLA is in fact consistent for a Gaussian target measure, whereas ULA is not.
no code implementations • 14 Mar 2016 • Andre Wibisono, Ashia C. Wilson, Michael I. Jordan
We show that there is a Lagrangian functional that we call the \emph{Bregman Lagrangian} which generates a large class of accelerated methods in continuous time, including (but not limited to) accelerated gradient descent, its non-Euclidean extension, and accelerated higher-order gradient methods.
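For readers unfamiliar with the construction, the Bregman Lagrangian is commonly written in the following form, stated here for reference (the paper's precise "ideal scaling" conditions on the parameters are omitted):

$\mathcal{L}(X, V, t) = e^{\alpha_t + \gamma_t} \left( D_h\left(X + e^{-\alpha_t} V, X\right) - e^{\beta_t} f(X) \right),$

where $D_h(y, x) = h(y) - h(x) - \langle \nabla h(x), y - x \rangle$ is the Bregman divergence of a convex distance-generating function $h$, $f$ is the objective, and $\alpha_t, \beta_t, \gamma_t$ are time-dependent scaling functions.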
no code implementations • NeurIPS 2014 • Po-Ling Loh, Andre Wibisono
We establish sufficient conditions for the concavity of our reweighted objective function in terms of weight assignments in the Kikuchi expansion, and show that a reweighted version of the sum-product algorithm applied to the Kikuchi region graph will produce global optima of the Kikuchi approximation whenever the algorithm converges.
no code implementations • 7 Dec 2013 • John C. Duchi, Michael I. Jordan, Martin J. Wainwright, Andre Wibisono
We consider derivative-free algorithms for stochastic and non-stochastic convex optimization problems that use only function values rather than gradients.
no code implementations • NeurIPS 2013 • Jacob Abernethy, Peter L. Bartlett, Rafael Frongillo, Andre Wibisono
We consider a popular problem in finance, option pricing, through the lens of an online learning game between Nature and an Investor.
2 code implementations • NeurIPS 2013 • Tamara Broderick, Nicholas Boyd, Andre Wibisono, Ashia C. Wilson, Michael I. Jordan
We present SDA-Bayes, a framework for (S)treaming, (D)istributed, (A)synchronous computation of a Bayesian posterior.
no code implementations • NeurIPS 2012 • Andre Wibisono, Martin J. Wainwright, Michael I. Jordan, John C. Duchi
We consider derivative-free algorithms for stochastic optimization problems that use only noisy function values rather than gradients, analyzing their finite-sample convergence rates.
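As a generic illustration of optimization from function values only, here is a sketch of gradient descent driven by a two-point finite-difference gradient estimate; the smoothing parameter, direction distribution, and objective are illustrative assumptions and not necessarily the estimator analyzed in the paper.

import numpy as np

rng = np.random.default_rng(0)

def two_point_gradient_estimate(f, x, delta=1e-3):
    # Two-point gradient estimate from function values only:
    # g = ((f(x + delta*u) - f(x - delta*u)) / (2*delta)) * u,
    # with u drawn uniformly from the unit sphere.
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u)
    return (f(x + delta * u) - f(x - delta * u)) / (2.0 * delta) * u

def zeroth_order_gd(f, x0, step_size=0.1, n_iters=500):
    # Plain gradient descent driven by the two-point estimate above.
    x = np.array(x0, dtype=float)
    for _ in range(n_iters):
        x = x - step_size * two_point_gradient_estimate(f, x)
    return x

# Illustrative convex objective: f(x) = ||x - 1||^2, minimized at the all-ones vector.
x_hat = zeroth_order_gd(lambda x: np.sum((x - 1.0) ** 2), x0=np.zeros(5))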