Search Results for author: Wotao Yin

Found 72 papers, 25 papers with code

Lower Bounds and Nearly Optimal Algorithms in Distributed Learning with Communication Compression

no code implementations8 Jun 2022 Xinmeng Huang, Yiming Chen, Wotao Yin, Kun Yuan

We establish a convergence lower bound for algorithms using either unbiased or contractive compressors, in both unidirectional and bidirectional communication settings.

Distributed Optimization

FiLM: Frequency improved Legendre Memory Model for Long-term Time Series Forecasting

no code implementations18 May 2022 Tian Zhou, Ziqing Ma, Xue Wang, Qingsong Wen, Liang Sun, Tao Yao, Wotao Yin, Rong Jin

Recent studies have shown that deep learning models such as RNNs and Transformers have brought significant performance gains for long-term forecasting of time series because they effectively utilize historical information.

Dimensionality Reduction Time Series Forecasting

A Novel Convergence Analysis for Algorithms of the Adam Family

no code implementations7 Dec 2021 Zhishuai Guo, Yi Xu, Wotao Yin, Rong Jin, Tianbao Yang

Although rigorous convergence analyses exist for Adam, they impose specific requirements on the update of the adaptive step size that are not generic enough to cover many other variants of Adam.

Bilevel Optimization

An Improved Analysis and Rates for Variance Reduction under Without-replacement Sampling Orders

no code implementations NeurIPS 2021 Xinmeng Huang, Kun Yuan, Xianghui Mao, Wotao Yin

In this paper, we will improve the convergence analysis and rates of variance reduction under without-replacement sampling orders for composite finite-sum minimization. Our results are twofold.

Closing the Gap: Tighter Analysis of Alternating Stochastic Gradient Methods for Bilevel Problems

no code implementations NeurIPS 2021 Tianyi Chen, Yuejiao Sun, Wotao Yin

By leveraging the hidden smoothness of the problem, this paper presents a tighter analysis of ALSET for stochastic nested problems.

Bilevel Optimization

BlueFog: Make Decentralized Algorithms Practical for Optimization and Deep Learning

2 code implementations8 Nov 2021 Bicheng Ying, Kun Yuan, Hanbin Hu, Yiming Chen, Wotao Yin

On mainstream DNN training tasks, BlueFog reaches a much higher throughput and achieves an overall $1.2\times \sim 1.8\times$ speedup over Horovod, a state-of-the-art distributed deep learning package based on Ring-Allreduce.

Hyperparameter Tuning is All You Need for LISTA

1 code implementation NeurIPS 2021 Xiaohan Chen, Jialin Liu, Zhangyang Wang, Wotao Yin

Learned Iterative Shrinkage-Thresholding Algorithm (LISTA) introduces the concept of unrolling an iterative algorithm and training it like a neural network.

Exponential Graph is Provably Efficient for Decentralized Deep Training

2 code implementations NeurIPS 2021 Bicheng Ying, Kun Yuan, Yiming Chen, Hanbin Hu, Pan Pan, Wotao Yin

Experimental results on a variety of tasks and models demonstrate that decentralized (momentum) SGD over exponential graphs promises both fast and high-quality training.

Learned Robust PCA: A Scalable Deep Unfolding Approach for High-Dimensional Outlier Detection

1 code implementation NeurIPS 2021 HanQin Cai, Jialin Liu, Wotao Yin

Robust principal component analysis (RPCA) is a critical tool in modern machine learning, which detects outliers in the task of low-rank matrix reconstruction.

Outlier Detection

Communicate Then Adapt: An Effective Decentralized Adaptive Method for Deep Training

no code implementations29 Sep 2021 Bicheng Ying, Kun Yuan, Yiming Chen, Hanbin Hu, Yingya Zhang, Pan Pan, Wotao Yin

Decentralized adaptive gradient methods, in which each node averages only with its neighbors, are critical to save communication and wall-clock training time in deep learning tasks.

Tighter Analysis of Alternating Stochastic Gradient Method for Stochastic Nested Problems

no code implementations25 Jun 2021 Tianyi Chen, Yuejiao Sun, Wotao Yin

By leveraging the hidden smoothness of the problem, this paper presents a tighter analysis of ALSET for stochastic nested problems.

Bilevel Optimization

Learn to Predict Equilibria via Fixed Point Networks

no code implementations2 Jun 2021 Howard Heaton, Daniel Mckenzie, Qiuwei Li, Samy Wu Fung, Stanley Osher, Wotao Yin

This work introduces Nash Fixed Point Networks (N-FPNs), a class of implicit neural networks that learn to predict the equilibria given only the context.

A Novel Convergence Analysis for Algorithms of the Adam Family and Beyond

no code implementations30 Apr 2021 Zhishuai Guo, Yi Xu, Wotao Yin, Rong Jin, Tianbao Yang

Our analysis exhibits that an increasing or large enough "momentum" parameter for the first-order moment used in practice is sufficient to ensure Adam and its many variants converge under a mild boundedness condition on the adaptive scaling factor of the step size.

Bilevel Optimization

Feasibility-based Fixed Point Networks

1 code implementation29 Apr 2021 Howard Heaton, Samy Wu Fung, Aviv Gibali, Wotao Yin

This is accomplished using feasibility-based fixed point networks (F-FPNs).

Improved Analysis and Rates for Variance Reduction under Without-replacement Sampling Orders

no code implementations25 Apr 2021 Xinmeng Huang, Kun Yuan, Xianghui Mao, Wotao Yin

In the highly data-heterogeneous scenario, Prox-DFinito with optimal cyclic sampling can attain a sample-size-independent convergence rate, which, to our knowledge, is the first result that matches the rate of uniform i.i.d. sampling with variance reduction.

DecentLaM: Decentralized Momentum SGD for Large-batch Deep Training

1 code implementation ICCV 2021 Kun Yuan, Yiming Chen, Xinmeng Huang, Yingya Zhang, Pan Pan, Yinghui Xu, Wotao Yin

Experimental results on a variety of computer vision tasks and models demonstrate that DecentLaM promises both efficient and high-quality training.

Learning to Optimize: A Primer and A Benchmark

1 code implementation23 Mar 2021 Tianlong Chen, Xiaohan Chen, Wuyang Chen, Howard Heaton, Jialin Liu, Zhangyang Wang, Wotao Yin

It automates the design of an optimization method based on its performance on a set of training problems.

JFB: Jacobian-Free Backpropagation for Implicit Networks

2 code implementations23 Mar 2021 Samy Wu Fung, Howard Heaton, Qiuwei Li, Daniel Mckenzie, Stanley Osher, Wotao Yin

Unlike traditional networks, implicit networks solve a fixed point equation to compute inferences.
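As a rough illustration of the forward pass such networks require, here is a toy implicit layer whose inference is a fixed-point solve (the tanh map and the contraction constant are assumptions for the sketch, not the paper's architecture):

```python
import numpy as np

def implicit_forward(W, U, x, n_iter=200, tol=1e-8):
    """Inference in a toy implicit layer: find z with z = tanh(W z + U x).
    With ||W|| < 1 the map is a contraction, so plain fixed-point iteration
    converges; JFB then backpropagates through a single application of the
    map rather than differentiating the whole solve."""
    z = np.zeros(W.shape[0])
    for _ in range(n_iter):
        z_next = np.tanh(W @ z + U @ x)
        if np.linalg.norm(z_next - z) < tol:
            break
        z = z_next
    return z

rng = np.random.default_rng(1)
W = rng.standard_normal((8, 8))
W *= 0.5 / np.linalg.norm(W, 2)        # enforce a contraction
U = rng.standard_normal((8, 3))
x = rng.standard_normal(3)
z_star = implicit_forward(W, U, x)     # satisfies the fixed-point equation up to tol
```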

Provably Correct Optimization and Exploration with Non-linear Policies

1 code implementation22 Mar 2021 Fei Feng, Wotao Yin, Alekh Agarwal, Lin F. Yang

Policy optimization methods remain a powerful workhorse in empirical Reinforcement Learning (RL), with a focus on neural policies that can easily reason over complex and continuous state and/or action spaces.

A Zeroth-Order Block Coordinate Descent Algorithm for Huge-Scale Black-Box Optimization

1 code implementation21 Feb 2021 HanQin Cai, Yuchen Lou, Daniel Mckenzie, Wotao Yin

We consider the zeroth-order optimization problem in the huge-scale setting, where the dimension of the problem is so large that performing even basic vector operations on the decision variables is infeasible.

A Single-Timescale Method for Stochastic Bilevel Optimization

no code implementations9 Feb 2021 Tianyi Chen, Yuejiao Sun, Quan Xiao, Wotao Yin

This paper develops a new optimization method for a class of stochastic bilevel problems that we term Single-Timescale stochAstic BiLevEl optimization (STABLE) method.

Bilevel Optimization Meta-Learning +1

Moreau Envelope Augmented Lagrangian Method for Nonconvex Optimization with Linear Constraints

no code implementations21 Jan 2021 Jinshan Zeng, Wotao Yin, Ding-Xuan Zhou

We modify ALM to use a Moreau envelope of the augmented Lagrangian and establish its convergence under conditions that are weaker than those in the literature.

Optimization and Control

Learning A Minimax Optimizer: A Pilot Study

no code implementations ICLR 2021 Jiayi Shen, Xiaohan Chen, Howard Heaton, Tianlong Chen, Jialin Liu, Wotao Yin, Zhangyang Wang

We first present Twin L2O, the first dedicated minimax L2O framework consisting of two LSTMs for updating min and max variables, respectively.

CADA: Communication-Adaptive Distributed Adam

1 code implementation31 Dec 2020 Tianyi Chen, Ziye Guo, Yuejiao Sun, Wotao Yin

This paper proposes an adaptive stochastic gradient descent method for distributed machine learning, which can be viewed as the communication-adaptive counterpart of the celebrated Adam method - justifying its name CADA.

Machine Learning

Hybrid Federated Learning: Algorithms and Implementation

no code implementations22 Dec 2020 Xinwei Zhang, Wotao Yin, Mingyi Hong, Tianyi Chen

To the best of our knowledge, this is the first formulation and algorithm developed for the hybrid FL.

Federated Learning

Attentional Biased Stochastic Gradient for Imbalanced Classification

no code implementations13 Dec 2020 Qi Qi, Yi Xu, Rong Jin, Wotao Yin, Tianbao Yang

In this paper, we present a simple yet effective method (ABSGD) for addressing the data imbalance issue in deep learning.

Classification General Classification +2

An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods

no code implementations NeurIPS 2020 Yanli Liu, Kaiqing Zhang, Tamer Basar, Wotao Yin

In this paper, we revisit and improve the convergence of policy gradient (PG), natural PG (NPG) methods, and their variance-reduced variants, under general smooth policy parametrizations.

Policy Gradient Methods

A One-bit, Comparison-Based Gradient Estimator

no code implementations6 Oct 2020 HanQin Cai, Daniel Mckenzie, Wotao Yin, Zhenliang Zhang

By treating the gradient as an unknown signal to be recovered, we show how one can use tools from one-bit compressed sensing to construct a robust and reliable estimator of the normalized gradient.
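The one-bit setting can be illustrated with a naive sign-average sketch: each query reveals only whether the function increased along a random direction, and averaging those signed directions points roughly along the gradient (the paper's estimator is sharper, recovering the normalized gradient via one-bit compressed sensing; the quadratic test function below is an assumption for illustration):

```python
import numpy as np

def sign_gradient_estimate(f, x, n_samples=2000, r=1e-3, rng=None):
    """Estimate the direction of grad f(x) from one-bit comparisons only:
    each oracle call returns just sign(f(x + r*u) - f(x))."""
    rng = rng or np.random.default_rng(0)
    d = len(x)
    g = np.zeros(d)
    for _ in range(n_samples):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)
        bit = np.sign(f(x + r * u) - f(x))   # the only information used
        g += bit * u
    return g / np.linalg.norm(g)

f = lambda v: np.sum(v ** 2)                 # toy smooth objective
x = np.array([1.0, -2.0, 0.5])
g_hat = sign_gradient_estimate(f, x)
g_true = 2 * x / np.linalg.norm(2 * x)       # normalized true gradient
```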

Solving Stochastic Compositional Optimization is Nearly as Easy as Solving Stochastic Optimization

no code implementations25 Aug 2020 Tianyi Chen, Yuejiao Sun, Wotao Yin

In particular, we apply Adam to SCSC, and the exhibited rate of convergence matches that of the original Adam on non-compositional stochastic optimization.

Management Meta-Learning +1

Wasserstein-based Projections with Applications to Inverse Problems

2 code implementations5 Aug 2020 Howard Heaton, Samy Wu Fung, Alex Tong Lin, Stanley Osher, Wotao Yin

To bridge this gap, we present a new algorithm that takes samples from the manifold of true data as input and outputs an approximation of the projection operator onto this manifold.

An Improved Analysis of Stochastic Gradient Descent with Momentum

1 code implementation NeurIPS 2020 Yanli Liu, Yuan Gao, Wotao Yin

Furthermore, the role of dynamic parameters has not been addressed.

Optimization and Control

VAFL: a Method of Vertical Asynchronous Federated Learning

no code implementations12 Jul 2020 Tianyi Chen, Xiao Jin, Yuejiao Sun, Wotao Yin

Horizontal federated learning (FL) handles multi-client data that share the same set of features, while vertical FL trains a better predictor that combines all the features from different clients.

Federated Learning

FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data

1 code implementation22 May 2020 Xinwei Zhang, Mingyi Hong, Sairaj Dhople, Wotao Yin, Yang Liu

Aiming at designing FL algorithms that are provably fast and require as few assumptions as possible, we propose a new algorithm design strategy from the primal-dual optimization perspective.

Federated Learning

Zeroth-Order Regularized Optimization (ZORO): Approximately Sparse Gradients and Adaptive Sampling

1 code implementation29 Mar 2020 HanQin Cai, Daniel Mckenzie, Wotao Yin, Zhenliang Zhang

We consider the problem of minimizing a high-dimensional objective function, which may include a regularization term, using (possibly noisy) evaluations of the function.

Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning

1 code implementation NeurIPS 2020 Fei Feng, Ruosong Wang, Wotao Yin, Simon S. Du, Lin F. Yang

Motivated by the prevailing paradigm of using unsupervised learning for efficient exploration in reinforcement learning (RL) problems [tang2017exploration, bellemare2016unifying], we investigate when this paradigm is provably efficient.

Efficient Exploration reinforcement-learning

Safeguarded Learned Convex Optimization

no code implementations4 Mar 2020 Howard Heaton, Xiaohan Chen, Zhangyang Wang, Wotao Yin

Many applications require repeatedly solving a certain type of optimization problem, each time with new (but similar) data.

LASG: Lazily Aggregated Stochastic Gradients for Communication-Efficient Distributed Learning

1 code implementation26 Feb 2020 Tianyi Chen, Yuejiao Sun, Wotao Yin

The new algorithms adaptively choose between fresh and stale stochastic gradients and have convergence rates comparable to the original SGD.

Federated Learning

How Does an Approximate Model Help in Reinforcement Learning?

no code implementations6 Dec 2019 Fei Feng, Wotao Yin, Lin F. Yang

In particular, we provide an algorithm that uses $\widetilde{O}(N/(1-\gamma)^3/\varepsilon^2)$ samples in a generative model to learn an $\varepsilon$-optimal policy, where $\gamma$ is the discount factor and $N$ is the number of near-optimal actions in the approximate model.

reinforcement-learning Transfer Reinforcement Learning

XPipe: Efficient Pipeline Model Parallelism for Multi-GPU DNN Training

no code implementations24 Oct 2019 Lei Guan, Wotao Yin, Dongsheng Li, Xicheng Lu

It allows the overlapping of the pipelines of multiple micro-batches, including those belonging to different mini-batches.

Universal Safeguarded Learned Convex Optimization with Guaranteed Convergence

no code implementations25 Sep 2019 Howard Heaton, Xiaohan Chen, Zhangyang Wang, Wotao Yin

Inferences by each network form solution estimates, and networks are trained to optimize these estimates for a particular distribution of data.

ODE Analysis of Stochastic Gradient Methods with Optimism and Anchoring for Minimax Problems and GANs

no code implementations25 Sep 2019 Ernest K. Ryu, Kun Yuan, Wotao Yin

Despite remarkable empirical success, the training dynamics of generative adversarial networks (GAN), which involves solving a minimax game using stochastic gradients, is still poorly understood.

ODE Analysis of Stochastic Gradient Methods with Optimism and Anchoring for Minimax Problems

no code implementations26 May 2019 Ernest K. Ryu, Kun Yuan, Wotao Yin

Despite remarkable empirical success, the training dynamics of generative adversarial networks (GAN), which involves solving a minimax game using stochastic gradients, is still poorly understood.

Plug-and-Play Methods Provably Converge with Properly Trained Denoisers

1 code implementation14 May 2019 Ernest K. Ryu, Jialin Liu, Sicheng Wang, Xiaohan Chen, Zhangyang Wang, Wotao Yin

Plug-and-play (PnP) is a non-convex framework that integrates modern denoising priors, such as BM3D or deep learning-based denoisers, into ADMM or other proximal algorithms.

Denoising
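A minimal sketch of the PnP idea using forward-backward splitting, with soft-thresholding standing in for the denoiser (the paper's setting plugs in BM3D or a trained CNN; the problem sizes and step size below are illustrative assumptions):

```python
import numpy as np

def pnp_fbs(A, y, denoiser, alpha=0.2, n_iter=300):
    """Plug-and-play forward-backward splitting: a proximal gradient step
    whose proximal operator is replaced by an off-the-shelf denoiser."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = denoiser(x - alpha * A.T @ (A @ x - y))   # gradient step, then "denoise"
    return x

# stand-in denoiser: soft-thresholding (in practice a BM3D or CNN denoiser)
denoise = lambda v: np.sign(v) * np.maximum(np.abs(v) - 0.01, 0.0)

rng = np.random.default_rng(2)
A = rng.standard_normal((60, 80)) / np.sqrt(60)
x_true = np.zeros(80); x_true[::16] = 1.0
y = A @ x_true
x_hat = pnp_fbs(A, y, denoise)
```

The paper's contribution is the convergence guarantee when the plugged-in denoiser is properly trained (close to nonexpansive), which this toy denoiser satisfies trivially.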

A2BCD: Asynchronous Acceleration with Optimal Complexity

no code implementations ICLR 2019 Robert Hannah, Fei Feng, Wotao Yin

In this paper, we propose the Asynchronous Accelerated Nonuniform Randomized Block Coordinate Descent algorithm (A2BCD).

ALISTA: Analytic Weights Are As Good As Learned Weights in LISTA

no code implementations ICLR 2019 Jialin Liu, Xiaohan Chen, Zhangyang Wang, Wotao Yin

In this work, we propose Analytic LISTA (ALISTA), where the weight matrix in LISTA is computed as the solution to a data-free optimization problem, leaving only the stepsize and threshold parameters to data-driven learning.

AsyncQVI: Asynchronous-Parallel Q-Value Iteration for Discounted Markov Decision Processes with Near-Optimal Sample Complexity

1 code implementation3 Dec 2018 Yibo Zeng, Fei Feng, Wotao Yin

In this paper, we propose AsyncQVI, an asynchronous-parallel Q-value iteration for discounted Markov decision processes whose transition and reward can only be sampled through a generative model.

Markov Chain Block Coordinate Descent

no code implementations22 Nov 2018 Tao Sun, Yuejiao Sun, Yangyang Xu, Wotao Yin

Random and cyclic selections are either infeasible or very expensive.

Distributed Optimization

Acceleration of Primal-Dual Methods by Preconditioning and Simple Subproblem Procedures

1 code implementation21 Nov 2018 Yanli Liu, Yunbei Xu, Wotao Yin

They reduce a difficult problem to simple subproblems, so they are easy to implement and have many applications.

Optimization and Control

On Markov Chain Gradient Descent

no code implementations NeurIPS 2018 Tao Sun, Yuejiao Sun, Wotao Yin

This paper studies Markov chain gradient descent, a variant of stochastic gradient descent where the random samples are taken on the trajectory of a Markov chain.
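A small NumPy sketch of the setting: least-squares SGD where the sample index follows a lazy random walk on a ring rather than being drawn i.i.d. (the toy problem and step size are assumptions for illustration):

```python
import numpy as np

def markov_chain_sgd(A, b, n_steps=20000, lr=0.05, rng=None):
    """SGD for min_x (1/n) sum_i (a_i^T x - b_i)^2 where the sampled index
    follows a lazy random walk on a ring, i.e. a Markov chain, instead of
    i.i.d. draws."""
    rng = rng or np.random.default_rng(3)
    n, d = A.shape
    x = np.zeros(d)
    i = 0
    for _ in range(n_steps):
        i = (i + rng.choice([-1, 0, 1])) % n   # one step of the (lazy) chain
        grad = 2 * (A[i] @ x - b[i]) * A[i]    # stochastic gradient at sample i
        x -= lr * grad
    return x

rng = np.random.default_rng(3)
A = rng.standard_normal((20, 4))
x_true = np.array([1.0, -1.0, 0.5, 2.0])
b = A @ x_true                                 # consistent system, no noise
x_hat = markov_chain_sgd(A, b)
```

Because the chain is ergodic, every sample is visited infinitely often, which is the structural property the paper's analysis exploits.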

LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning

1 code implementation NeurIPS 2018 Tianyi Chen, Georgios B. Giannakis, Tao Sun, Wotao Yin

This paper presents a new class of gradient methods for distributed machine learning that adaptively skip the gradient calculations to learn with reduced communication and computation.
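The skipping rule can be sketched as follows: a worker re-uploads its gradient only when it has drifted enough from the last copy the server holds (the fixed threshold and the toy quadratic problem are simplifying assumptions; LAG's actual rule is adaptive):

```python
import numpy as np

def lag_gd(A_list, b_list, lr=0.2, thresh=1e-4, n_iter=200):
    """Distributed gradient descent with lazy aggregation (sketch of LAG):
    worker m refreshes the server's copy of its gradient only when the new
    gradient differs enough from the stale one; otherwise the stale copy
    is reused and no communication happens."""
    M, d = len(A_list), A_list[0].shape[1]
    x = np.zeros(d)
    stale = [A.T @ (A @ x - b) for A, b in zip(A_list, b_list)]  # initial uploads
    comms = M
    for _ in range(n_iter):
        for m, (A, b) in enumerate(zip(A_list, b_list)):
            g = A.T @ (A @ x - b)                 # worker m's fresh local gradient
            if np.linalg.norm(g - stale[m]) ** 2 > thresh:
                stale[m] = g                      # communicate: refresh server copy
                comms += 1
        x -= (lr / M) * sum(stale)                # server step with (possibly stale) sum
    return x, comms

rng = np.random.default_rng(7)
x_true = np.array([1.0, -1.0, 0.5])
A_list = [rng.standard_normal((10, 3)) / np.sqrt(10) for _ in range(4)]
b_list = [A @ x_true for A in A_list]
x_hat, comms = lag_gd(A_list, b_list)   # comms counts the uploads actually performed
```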

Redundancy Techniques for Straggler Mitigation in Distributed Optimization and Learning

no code implementations14 Mar 2018 Can Karakus, Yifan Sun, Suhas Diggavi, Wotao Yin

Performance of distributed optimization and learning systems is bottlenecked by "straggler" nodes and slow communication links, which significantly delay computation.

Distributed Optimization

Denoising Prior Driven Deep Neural Network for Image Restoration

no code implementations21 Jan 2018 Weisheng Dong, Peiyao Wang, Wotao Yin, Guangming Shi, Fangfang Wu, Xiaotong Lu

Then, the iterative process is unfolded into a deep neural network, which is composed of multiple denoiser modules interleaved with back-projection (BP) modules that ensure the observation consistencies.

Deblurring Image Denoising +2

Run-and-Inspect Method for Nonconvex Optimization and Global Optimality Bounds for R-Local Minimizers

no code implementations22 Nov 2017 Yifan Chen, Yuejiao Sun, Wotao Yin

If no sufficient decrease is found, the current point is called an approximate $R$-local minimizer.

First and Second Order Methods for Online Convolutional Dictionary Learning

no code implementations31 Aug 2017 Jialin Liu, Cristina Garcia-Cardona, Brendt Wohlberg, Wotao Yin

Convolutional sparse representations are a form of sparse representation with a structured, translation invariant dictionary.

Dictionary Learning Second-order methods +1

Online Convolutional Dictionary Learning

no code implementations29 Jun 2017 Jialin Liu, Cristina Garcia-Cardona, Brendt Wohlberg, Wotao Yin

While a number of different algorithms have recently been proposed for convolutional dictionary learning, this remains an expensive problem.

Dictionary Learning

On the Convergence of Asynchronous Parallel Iteration with Unbounded Delays

no code implementations13 Dec 2016 Zhimin Peng, Yangyang Xu, Ming Yan, Wotao Yin

Recent years have witnessed the surge of asynchronous parallel (async-parallel) iterative algorithms due to problems involving very large-scale data and a large number of decision variables.

Cyclic Coordinate Update Algorithms for Fixed-Point Problems: Analysis and Applications

no code implementations8 Nov 2016 Yat Tin Chow, Tianyu Wu, Wotao Yin

To this problem, we apply the coordinate-update algorithms, which update only one or a few components of $x$ at each step.

Optimization and Control Computation 90C06, 90C25, 65K05

A Primer on Coordinate Descent Algorithms

no code implementations30 Sep 2016 Hao-Jun Michael Shi, Shenyinying Tu, Yangyang Xu, Wotao Yin

This monograph presents a class of algorithms called coordinate descent algorithms for mathematicians, statisticians, and engineers outside the field of optimization.

Distributed Computing Machine Learning
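The basic scheme the monograph covers can be sketched in a few lines: cyclic coordinate descent on a quadratic, where each step exactly minimizes over one coordinate while holding the rest fixed (the quadratic test problem is an illustrative assumption):

```python
import numpy as np

def coordinate_descent(Q, c, n_epochs=100):
    """Cyclic coordinate descent for min_x 0.5*x^T Q x - c^T x, Q pos. def.
    Setting the partial derivative over x_i to zero gives an exact 1-D
    minimization: Q_ii x_i = c_i - sum_{j != i} Q_ij x_j."""
    d = len(c)
    x = np.zeros(d)
    for _ in range(n_epochs):
        for i in range(d):
            x[i] = (c[i] - Q[i] @ x + Q[i, i] * x[i]) / Q[i, i]
    return x

rng = np.random.default_rng(4)
M = rng.standard_normal((5, 5))
Q = M @ M.T + 5 * np.eye(5)     # well-conditioned positive definite
c = rng.standard_normal(5)
x_hat = coordinate_descent(Q, c)
x_star = np.linalg.solve(Q, c)  # closed-form minimizer for comparison
```

On quadratics this is exactly Gauss-Seidel; the monograph's variants differ in how the coordinate is chosen (cyclic, random, greedy) and in how the 1-D subproblem is solved.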

On Unbounded Delays in Asynchronous Parallel Fixed-Point Algorithms

no code implementations15 Sep 2016 Robert Hannah, Wotao Yin

Existing analysis of ARock assumes the delays to be bounded, and uses this bound to set a step size that is important to both convergence and efficiency.

Coordinate Friendly Structures, Algorithms and Applications

no code implementations5 Jan 2016 Zhimin Peng, Tianyu Wu, Yangyang Xu, Ming Yan, Wotao Yin

To derive simple subproblems for several new classes of applications, this paper systematically studies coordinate-friendly operators that perform low-cost coordinate updates.

Optimal Sparse Kernel Learning for Hyperspectral Anomaly Detection

no code implementations8 Jun 2015 Zhimin Peng, Prudhvi Gurram, Heesung Kwon, Wotao Yin

In this paper, a novel framework of sparse kernel learning for Support Vector Data Description (SVDD) based anomaly detection is presented.

Anomaly Detection feature selection

A fast patch-dictionary method for whole image recovery

no code implementations16 Aug 2014 Yangyang Xu, Wotao Yin

With very few exceptions, this issue has limited the applications of image-patch methods to local tasks such as denoising, inpainting, cartoon-texture decomposition, super-resolution, and image deblurring, for which one can process a few patches at a time.

Compressive Sensing Deblurring +5

Block stochastic gradient iteration for convex and nonconvex optimization

no code implementations12 Aug 2014 Yangyang Xu, Wotao Yin

Its convergence for both convex and nonconvex cases is established in different senses.

Stochastic Optimization

Sparse Recovery via Differential Inclusions

1 code implementation30 Jun 2014 Stanley Osher, Feng Ruan, Jiechao Xiong, Yuan YAO, Wotao Yin

In this paper, we recover sparse signals from their noisy linear measurements by solving nonlinear differential inclusions, which is based on the notion of inverse scale space (ISS) developed in applied mathematics.

EXTRA: An Exact First-Order Algorithm for Decentralized Consensus Optimization

no code implementations24 Apr 2014 Wei Shi, Qing Ling, Gang Wu, Wotao Yin

In this paper, we develop a decentralized algorithm for the consensus optimization problem $$\min\limits_{x\in\mathbb{R}^p}~\bar{f}(x)=\frac{1}{n}\sum\limits_{i=1}^n f_i(x),$$ which is defined over a connected network of $n$ agents, where each function $f_i$ is held privately by agent $i$ and encodes the agent's data and objective.

Optimization and Control
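A NumPy sketch of the EXTRA update on a toy consensus problem, using the standard choice $\tilde{W} = (I+W)/2$ (the quadratic local losses and the lazy ring network are illustrative assumptions):

```python
import numpy as np

def extra(grad, X0, W, alpha=0.2, n_iter=1000):
    """EXTRA for decentralized consensus optimization of (1/n) sum_i f_i(x).
    Row i of X is agent i's copy of x; W is a symmetric doubly stochastic
    mixing matrix of the network. Update:
      x^{k+2} = (I+W) x^{k+1} - W_tilde x^k - alpha*(grad(x^{k+1}) - grad(x^k))."""
    n = X0.shape[0]
    Wt = (np.eye(n) + W) / 2
    X_prev, G_prev = X0, grad(X0)
    X = W @ X0 - alpha * G_prev              # first iteration
    for _ in range(n_iter):
        G = grad(X)
        X_next = (np.eye(n) + W) @ X - Wt @ X_prev - alpha * (G - G_prev)
        X_prev, X, G_prev = X, X_next, G
    return X

# toy problem: f_i(x) = 0.5*||x - d_i||^2, so the consensus minimizer is mean(d_i)
rng = np.random.default_rng(5)
n, p = 5, 3
D = rng.standard_normal((n, p))
grad = lambda X: X - D                       # row i is grad f_i at agent i's copy

# lazy ring network: symmetric, doubly stochastic mixing weights
W = np.eye(n) * 0.5
for i in range(n):
    W[i, (i + 1) % n] += 0.25
    W[i, (i - 1) % n] += 0.25

X = extra(grad, np.zeros((n, p)), W)         # every row converges to mean(D)
```

Unlike plain decentralized gradient descent, EXTRA reaches the exact consensus minimizer with a constant step size, which is the "exact" in its name.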

Video Compressive Sensing for Dynamic MRI

no code implementations30 Jan 2014 Jianing V. Shi, Wotao Yin, Aswin C. Sankaranarayanan, Richard G. Baraniuk

We apply this framework to accelerate the acquisition process of dynamic MRI and show it achieves the best reconstruction accuracy with the least computational time compared with existing algorithms in the literature.

Compressive Sensing Video Compressive Sensing

Parallel matrix factorization for low-rank tensor completion

1 code implementation4 Dec 2013 Yangyang Xu, Ruru Hao, Wotao Yin, Zhixun Su

Phase transition plots reveal that our algorithm can recover a variety of synthetic low-rank tensors from significantly fewer samples than the compared methods, which include a matrix completion method applied to tensor recovery and two state-of-the-art tensor completion methods.

Numerical Analysis Computation

An Alternating Direction Algorithm for Matrix Completion with Nonnegative Factors

no code implementations6 Mar 2011 Yangyang Xu, Wotao Yin, Zaiwen Wen, Yin Zhang

By taking the advantages of both nonnegativity and low-rankness, one can generally obtain superior results than those of just using one of the two properties.

Information Theory Numerical Analysis
