Search Results for author: Tianbao Yang

Found 119 papers, 12 papers with code

Quadratically Regularized Subgradient Methods for Weakly Convex Optimization with Weakly Convex Constraints

no code implementations ICML 2020 Runchao Ma, Qihang Lin, Tianbao Yang

Optimization models with non-convex constraints arise in many tasks in machine learning, e.g., learning with fairness constraints or Neyman-Pearson classification with non-convex loss.

Fairness

Accelerating Deep Learning with Millions of Classes

no code implementations ECCV 2020 Zhuoning Yuan, Zhishuai Guo, Xiaotian Yu, Xiaoyu Wang, Tianbao Yang

In our experiments, we demonstrate that the proposed framework can train deep learning models with millions of classes and achieve more than a 10× speedup compared to existing approaches.

Classification Frame +2

Smoothed Online Convex Optimization Based on Discounted-Normal-Predictor

no code implementations 2 May 2022 Lijun Zhang, Wei Jiang, JinFeng Yi, Tianbao Yang

In this paper, we investigate an online prediction strategy named Discounted-Normal-Predictor (Kapralov and Panigrahy, 2010) for smoothed online convex optimization (SOCO), in which the learner must minimize not only the hitting cost but also the switching cost.
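The SOCO objective described in this abstract (hitting cost plus a penalty on movement between consecutive decisions) can be sketched in a few lines. This is an illustrative toy over a scalar decision space; the function name and the absolute-value switching cost are assumptions, not the paper's exact formulation:

```python
def soco_cost(decisions, hitting, switch_weight=1.0, x0=0.0):
    """Total cost in smoothed OCO (sketch): each round pays the hitting
    cost f_t(x_t) plus a switching cost |x_t - x_{t-1}| for moving."""
    total, prev = 0.0, x0
    for x, f in zip(decisions, hitting):
        total += f(x) + switch_weight * abs(x - prev)
        prev = x
    return total
```

Playing decisions 1 then 2 against the hitting cost x² starting from x0 = 0 pays (1 + 1) + (4 + 1) = 7 in total, showing how movement itself is charged.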

AUC Maximization in the Era of Big Data and AI: A Survey

no code implementations 28 Mar 2022 Tianbao Yang, Yiming Ying

We also identify and discuss remaining and emerging issues for deep AUC maximization, and provide suggestions on topics for future work.

Benchmarking Deep AUROC Optimization: Loss Functions and Algorithmic Choices

no code implementations 27 Mar 2022 Dixian Zhu, Xiaodong Wu, Tianbao Yang

(i) We benchmark a variety of loss functions with different algorithmic choices for the deep AUROC optimization problem.

imbalanced classification

Large-scale Optimization of Partial AUC in a Range of False Positive Rates

no code implementations 3 Mar 2022 Yao Yao, Qihang Lin, Tianbao Yang

The partial AUC, as a generalization of the AUC, summarizes only the TPRs over a specific range of the FPRs and is thus a more suitable performance measure in many real-world situations.
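To make the quantity concrete, a minimal empirical partial-AUC computation over a band of FPRs might look as follows. The function name and the rank-based banding are illustrative simplifications for intuition, not the optimization algorithm proposed in the paper:

```python
import numpy as np

def partial_auc(pos_scores, neg_scores, fpr_lo=0.0, fpr_hi=0.3):
    """Empirical partial AUC over FPRs in [fpr_lo, fpr_hi] (sketch):
    average the TPR at the thresholds given by the negatives whose
    rank places the FPR inside the chosen range."""
    neg = np.sort(np.asarray(neg_scores))[::-1]     # negatives, descending
    n = len(neg)
    lo, hi = int(np.floor(fpr_lo * n)), int(np.ceil(fpr_hi * n))
    band = neg[lo:hi]                               # thresholds in the FPR band
    pos = np.asarray(pos_scores)
    tprs = [(pos > t).mean() for t in band]         # TPR = positives above threshold
    return float(np.mean(tprs)) if tprs else 0.0
```

With fpr_lo = 0 and fpr_hi = 1 this reduces to the ordinary empirical AUC; narrowing the band restricts the measure to the operating region of interest, as the abstract describes.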

When AUC meets DRO: Optimizing Partial AUC for Deep Learning with Non-Convex Convergence Guarantee

no code implementations 1 Mar 2022 Dixian Zhu, Gang Li, Bokun Wang, Xiaodong Wu, Tianbao Yang

In this paper, we propose systematic and efficient gradient-based methods for both one-way and two-way partial AUC (pAUC) maximization that are applicable to deep learning.

Large-scale Stochastic Optimization of NDCG Surrogates for Deep Learning with Provable Convergence

no code implementations 24 Feb 2022 Zi-Hao Qiu, Quanqi Hu, Yongjian Zhong, Lijun Zhang, Tianbao Yang

To the best of our knowledge, this is the first time that stochastic algorithms are proposed to optimize NDCG with a provable convergence guarantee.

Information Retrieval Stochastic Optimization

Finite-Sum Coupled Compositional Stochastic Optimization: Theory and Applications

no code implementations 24 Feb 2022 Bokun Wang, Tianbao Yang

This paper studies stochastic optimization for a sum of compositional functions, where the inner-level function of each summand is coupled with the corresponding summation index.
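The objective structure described here, $F(w) = \frac{1}{n}\sum_i f_i(g_i(w))$ with the inner function sharing the summation index of the outer one, can be written down directly. This is a plain evaluation of the objective for intuition (names illustrative); the paper's contribution is the stochastic algorithms and theory for optimizing it, not this computation:

```python
def fcco_objective(w, inner_fns, outer_fns):
    """Finite-sum coupled compositional objective (sketch):
    F(w) = (1/n) * sum_i f_i(g_i(w)), where each inner g_i is coupled
    with the outer f_i through the shared index i."""
    vals = [f(g(w)) for f, g in zip(outer_fns, inner_fns)]
    return sum(vals) / len(vals)
```

For example, with g = (w + 1, 2w) and f = (u², u), evaluating at w = 1 gives (4 + 2) / 2 = 3.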

Meta-Learning Stochastic Optimization +1

Provable Stochastic Optimization for Global Contrastive Learning: Small Batch Does Not Harm Performance

1 code implementation 24 Feb 2022 Zhuoning Yuan, Yuexin Wu, Zihao Qiu, Xianzhi Du, Lijun Zhang, Denny Zhou, Tianbao Yang

From the optimization perspective, we explain why existing methods such as SimCLR require a large batch size in order to achieve a satisfactory result.

Contrastive Learning Stochastic Optimization

Optimal Algorithms for Stochastic Multi-Level Compositional Optimization

no code implementations 15 Feb 2022 Wei Jiang, Bokun Wang, Yibo Wang, Lijun Zhang, Tianbao Yang

To address this limitation, we propose a Stochastic Multi-level Variance Reduction method (SMVR), which achieves the optimal sample complexity of $\mathcal{O}\left(1 / \epsilon^{3}\right)$ to find an $\epsilon$-stationary point for non-convex objectives.

A Unified DRO View of Multi-class Loss Functions with top-N Consistency

no code implementations 30 Dec 2021 Dixian Zhu, Tianbao Yang

In this paper, we present a unified view of the CS/CE losses and their smoothed top-$k$ variants by proposing a new family of loss functions, which are arguably better than the CS/CE losses when the given label information is incomplete and noisy.

Multi-class Classification

A Novel Convergence Analysis for Algorithms of the Adam Family

no code implementations 7 Dec 2021 Zhishuai Guo, Yi Xu, Wotao Yin, Rong Jin, Tianbao Yang

Although rigorous convergence analyses exist for Adam, they impose specific requirements on the update of the adaptive step size, which are not generic enough to cover many other variants of Adam.

Bilevel Optimization

Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning

no code implementations NeurIPS 2021 Zhenhuan Yang, Yunwen Lei, Puyu Wang, Tianbao Yang, Yiming Ying

A popular approach to handle streaming data in pairwise learning is an online gradient descent (OGD) algorithm, where one needs to pair the current instance with a buffering set of previous instances with a sufficiently large size and therefore suffers from a scalability issue.
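The buffering scheme described above, and why a large buffer is expensive, can be sketched as follows. The 1-D linear model and the squared-hinge pairwise loss are placeholder choices for illustration, not the paper's exact setup:

```python
def pairwise_ogd(stream, lr=0.1, buffer_size=100):
    """Online gradient descent for pairwise learning (sketch).

    Each incoming point (x, y) with y in {+1, -1} is paired with a buffer
    of earlier points, so the per-step cost grows with the buffer size --
    the scalability issue noted in the abstract."""
    w, buf = 0.0, []
    for x, y in stream:
        for xb, yb in buf:
            if y != yb:                          # only cross-class pairs
                margin = (y - yb) * w * (x - xb)
                if margin < 1:                   # update inside the hinge region
                    w += lr * (y - yb) * (x - xb) * (1 - margin)
        buf.append((x, y))
        if len(buf) > buffer_size:
            buf.pop(0)                           # drop the oldest point
    return w
```

On a stream where positives sit at x = 1 and negatives at x = 0, the learned weight becomes positive, i.e., positives get scored above negatives; the inner loop over the buffer is exactly the cost the simple algorithms in this paper aim to avoid.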

Generalization Bounds Metric Learning

Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning

1 code implementation 23 Nov 2021 Zhenhuan Yang, Yunwen Lei, Puyu Wang, Tianbao Yang, Yiming Ying

A popular approach to handle streaming data in pairwise learning is an online gradient descent (OGD) algorithm, where one needs to pair the current instance with a buffering set of previous instances with a sufficiently large size and therefore suffers from a scalability issue.

Generalization Bounds Metric Learning

Deep AUC Maximization for Medical Image Classification: Challenges and Opportunities

no code implementations 1 Nov 2021 Tianbao Yang

In this extended abstract, we will present and discuss opportunities and challenges brought about by a new deep learning method by AUC maximization (aka Deep AUC Maximization, or DAM) for medical image classification.

Classification Image Classification

Compositional Training for End-to-End Deep AUC Maximization

no code implementations ICLR 2022 Zhuoning Yuan, Zhishuai Guo, Nitesh Chawla, Tianbao Yang

The key idea of compositional training is to minimize a compositional objective function, where the outer function corresponds to an AUC loss and the inner function represents a gradient descent step for minimizing a traditional loss, e. g., the cross-entropy (CE) loss.
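The compositional idea stated above can be sketched directly: evaluate an AUC-style outer loss at the point produced by one inner gradient step on a traditional loss. All names are illustrative, the inner loss here is a squared loss rather than cross-entropy, and the pairwise square surrogate stands in for the paper's AUC loss:

```python
import numpy as np

def compositional_objective(w, X, y, lr_inner=0.1):
    """Sketch of compositional training: outer AUC-style loss evaluated
    at w AFTER one gradient-descent step on a traditional loss."""
    # inner function: one gradient step on the squared loss
    grad = X.T @ (X @ w - y) / len(y)
    w_inner = w - lr_inner * grad
    # outer function: pairwise square AUC surrogate at the updated point
    scores = X @ w_inner
    pos, neg = scores[y == 1], scores[y == 0]
    return float(np.mean((1.0 - (pos[:, None] - neg[None, :])) ** 2))
```

Minimizing this objective over w couples the two losses end-to-end, which is the structure the abstract describes.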

Image Classification Stochastic Optimization

Momentum Accelerates the Convergence of Stochastic AUPRC Maximization

no code implementations 2 Jul 2021 Guanghui Wang, Ming Yang, Lijun Zhang, Tianbao Yang

In this paper, we further improve the stochastic optimization of AUPRC by (i) developing novel stochastic momentum methods with a better iteration complexity of $O(1/\epsilon^4)$ for finding an $\epsilon$-stationary solution; and (ii) designing a novel family of stochastic adaptive methods with the same iteration complexity, which enjoy faster convergence in practice.

imbalanced classification Stochastic Optimization

Memory-based Optimization Methods for Model-Agnostic Meta-Learning

no code implementations 9 Jun 2021 Bokun Wang, Zhuoning Yuan, Yiming Ying, Tianbao Yang

This paper addresses these issues by (i) proposing efficient memory-based stochastic algorithms for MAML with the vanishing convergence error, which only requires sampling a constant number of tasks and a constant number of data samples per-iteration; (ii) proposing communication-efficient distributed memory-based MAML algorithms for personalized federated learning in both the cross-device (with client sampling) and the cross-silo (without client sampling) settings.

Continual Learning Meta-Learning +2

A Simple yet Universal Strategy for Online Convex Optimization

no code implementations 8 May 2021 Lijun Zhang, Guanghui Wang, JinFeng Yi, Tianbao Yang

In this paper, we propose a simple strategy for universal online convex optimization, which avoids these limitations.

Stability and Generalization of Stochastic Gradient Methods for Minimax Problems

1 code implementation 8 May 2021 Yunwen Lei, Zhenhuan Yang, Tianbao Yang, Yiming Ying

In this paper, we provide a comprehensive generalization analysis of stochastic gradient methods for minimax problems under both convex-concave and nonconvex-nonconcave cases through the lens of algorithmic stability.

Generalization Bounds

Randomized Stochastic Variance-Reduced Methods for Multi-Task Stochastic Bilevel Optimization

no code implementations 5 May 2021 Zhishuai Guo, Quanqi Hu, Lijun Zhang, Tianbao Yang

Although numerous studies have proposed stochastic algorithms for solving these problems, they are limited in two perspectives: (i) their sample complexities are high, which do not match the state-of-the-art result for non-convex stochastic optimization; (ii) their algorithms are tailored to problems with only one lower-level problem.

Bilevel Optimization Stochastic Optimization

A Novel Convergence Analysis for Algorithms of the Adam Family and Beyond

no code implementations 30 Apr 2021 Zhishuai Guo, Yi Xu, Wotao Yin, Rong Jin, Tianbao Yang

Our analysis shows that an increasing or sufficiently large "momentum" parameter for the first-order moment used in practice is sufficient to ensure that Adam and its many variants converge under a mild boundedness condition on the adaptive scaling factor of the step size.

Bilevel Optimization

Online Convex Optimization with Continuous Switching Constraint

no code implementations NeurIPS 2021 Guanghui Wang, Yuanyu Wan, Tianbao Yang, Lijun Zhang

To control the switching cost, we introduce the problem of online convex optimization with continuous switching constraint, where the goal is to achieve a small regret given a budget on the \emph{overall} switching cost.

Decision Making

Revisiting Smoothed Online Learning

no code implementations NeurIPS 2021 Lijun Zhang, Wei Jiang, Shiyin Lu, Tianbao Yang

Moreover, when the hitting cost is both convex and $\lambda$-quadratic growth, we reduce the competitive ratio to $1 + \frac{2}{\sqrt{\lambda}}$ by minimizing the weighted sum of the hitting cost and the switching cost.

online learning

Federated Deep AUC Maximization for Heterogeneous Data with a Constant Communication Complexity

1 code implementation 9 Feb 2021 Zhuoning Yuan, Zhishuai Guo, Yi Xu, Yiming Ying, Tianbao Yang

Deep AUC (area under the ROC curve) Maximization (DAM) has attracted much attention recently due to its great potential for imbalanced data classification.

Federated Learning

Attentional Biased Stochastic Gradient for Imbalanced Classification

no code implementations 13 Dec 2020 Qi Qi, Yi Xu, Rong Jin, Wotao Yin, Tianbao Yang

In this paper, we present a simple yet effective method (ABSGD) for addressing the data imbalance issue in deep learning.

Classification General Classification +2

Large-scale Robust Deep AUC Maximization: A New Surrogate Loss and Empirical Studies on Medical Image Classification

3 code implementations ICCV 2021 Zhuoning Yuan, Yan Yan, Milan Sonka, Tianbao Yang

Our studies demonstrate that the proposed DAM method improves the performance of optimizing cross-entropy loss by a large margin, and also achieves better performance than optimizing the existing AUC square loss on these medical image classification tasks.

Classification General Classification +3

Advanced Graph and Sequence Neural Networks for Molecular Property Prediction and Drug Discovery

1 code implementation 2 Dec 2020 Zhengyang Wang, Meng Liu, Youzhi Luo, Zhao Xu, Yaochen Xie, Limei Wang, Lei Cai, Qi Qi, Zhuoning Yuan, Tianbao Yang, Shuiwang Ji

Here we develop a suite of comprehensive machine learning methods and tools spanning different computational models, molecular representations, and loss functions for molecular property prediction and drug discovery.

Drug Discovery Molecular Property Prediction

Variance-Reduced Off-Policy Memory-Efficient Policy Search

no code implementations 14 Sep 2020 Daoming Lyu, Qi Qi, Mohammad Ghavamzadeh, Hengshuai Yao, Tianbao Yang, Bo Liu

To achieve variance-reduced off-policy-stable policy optimization, we propose an algorithm family that is memory-efficient, stochastically variance-reduced, and capable of learning from off-policy samples.

reinforcement-learning Stochastic Optimization

An Online Method for A Class of Distributionally Robust Optimization with Non-Convex Objectives

1 code implementation NeurIPS 2021 Qi Qi, Zhishuai Guo, Yi Xu, Rong Jin, Tianbao Yang

In this paper, we propose a practical online method for solving a class of distributionally robust optimization (DRO) with non-convex objectives, which has important applications in machine learning for improving the robustness of neural networks.

online learning

Nearly Optimal Robust Method for Convex Compositional Problems with Heavy-Tailed Noise

no code implementations 17 Jun 2020 Yan Yan, Xin Man, Tianbao Yang

In this paper, we propose robust stochastic algorithms for solving convex compositional problems of the form $f(\mathbb{E}_\xi g(\cdot; \xi)) + r(\cdot)$ by establishing sub-Gaussian confidence bounds under weak assumptions about the tails of the noise distribution, i.e., heavy-tailed noise with bounded second-order moments.

Fast Objective & Duality Gap Convergence for Nonconvex-Strongly-Concave Min-Max Problems

no code implementations 12 Jun 2020 Zhishuai Guo, Yan Yan, Zhuoning Yuan, Tianbao Yang

Compared with existing studies, (i) our analysis is based on a novel Lyapunov function consisting of the primal objective gap and the duality gap of a regularized function, and (ii) the results are more comprehensive with improved rates that have better dependence on the condition number under different assumptions.

Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks

1 code implementation ICML 2020 Zhishuai Guo, Mingrui Liu, Zhuoning Yuan, Li Shen, Wei Liu, Tianbao Yang

In this paper, we study distributed algorithms for large-scale AUC maximization with a deep neural network as a predictive model.

Distributed Optimization

Revisiting SGD with Increasingly Weighted Averaging: Optimization and Generalization Perspectives

no code implementations 9 Mar 2020 Zhishuai Guo, Yan Yan, Tianbao Yang

It remains unclear how these averaging schemes affect the convergence of both the optimization error and the generalization error (two equally important components of testing error) for non-strongly convex objectives, including non-convex problems.

Optimal Epoch Stochastic Gradient Descent Ascent Methods for Min-Max Optimization

no code implementations NeurIPS 2020 Yan Yan, Yi Xu, Qihang Lin, Wei Liu, Tianbao Yang

In this paper, we bridge this gap by providing a sharp analysis of epoch-wise stochastic gradient descent ascent method (referred to as Epoch-GDA) for solving strongly convex strongly concave (SCSC) min-max problems, without imposing any additional assumption about smoothness or the function's structure.

Minimizing Dynamic Regret and Adaptive Regret Simultaneously

no code implementations 6 Feb 2020 Lijun Zhang, Shiyin Lu, Tianbao Yang

To address this limitation, new performance measures, including dynamic regret and adaptive regret have been proposed to guide the design of online algorithms.

online learning

Attacking Lifelong Learning Models with Gradient Reversion

no code implementations ICLR 2020 Yunhui Guo, Mingrui Liu, Yandong Li, Liqiang Wang, Tianbao Yang, Tajana Rosing

We evaluate the effectiveness of traditional attack methods such as FGSM and PGD. The results show that A-GEM still possesses strong continual learning ability in the presence of adversarial examples in the memory, and simple defense techniques such as label smoothing can further alleviate the adversarial effects.
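FGSM, one of the attacks evaluated here, perturbs the input in the sign direction of the loss gradient with respect to the input. A minimal sketch for a linear logistic model (the model choice and function name are illustrative assumptions):

```python
import numpy as np

def fgsm(x, y, w, eps=0.1):
    """Fast Gradient Sign Method (sketch) for a linear logistic model:
    x_adv = x + eps * sign(d loss / d x)."""
    p = 1.0 / (1.0 + np.exp(-x @ w))   # predicted probability of class 1
    grad_x = (p - y) * w               # input-gradient of the logistic loss
    return x + eps * np.sign(grad_x)
```

The perturbation is bounded by eps in each coordinate, so the adversarial example stays in an L-infinity ball around the original input.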

Continual Learning

Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets

no code implementations ICLR 2020 Mingrui Liu, Youssef Mroueh, Jerret Ross, Wei Zhang, Xiaodong Cui, Payel Das, Tianbao Yang

Then we propose an adaptive variant of OSG named Optimistic Adagrad (OAdagrad) and reveal an \emph{improved} adaptive complexity $O\left(\epsilon^{-\frac{2}{1-\alpha}}\right)$, where $\alpha$ characterizes the growth rate of the cumulative stochastic gradient and $0\leq \alpha\leq 1/2$.

A Simple and Effective Framework for Pairwise Deep Metric Learning

1 code implementation ECCV 2020 Qi Qi, Yan Yan, Xiaoyu Wang, Tianbao Yang

To tackle this issue, we propose a simple and effective framework to sample pairs in a batch of data for updating the model.

Metric Learning

A Decentralized Parallel Algorithm for Training Generative Adversarial Nets

no code implementations NeurIPS 2020 Mingrui Liu, Wei Zhang, Youssef Mroueh, Xiaodong Cui, Jerret Ross, Tianbao Yang, Payel Das

Despite recent progress on decentralized algorithms for training deep neural networks, it remains unclear whether it is possible to train GANs in a decentralized manner.

Improved Schemes for Episodic Memory-based Lifelong Learning

1 code implementation NeurIPS 2020 Yunhui Guo, Mingrui Liu, Tianbao Yang, Tajana Rosing

This view leads to two improved schemes for episodic memory based lifelong learning, called MEGA-I and MEGA-II.

Learning with Long-term Remembering: Following the Lead of Mixed Stochastic Gradient

no code implementations 25 Sep 2019 Yunhui Guo, Mingrui Liu, Tianbao Yang, Tajana Rosing

In this paper, we introduce a novel and effective lifelong learning algorithm, called MixEd stochastic GrAdient (MEGA), which allows deep neural networks to acquire the ability of retaining performance on old tasks while learning new tasks.

Stochastic AUC Maximization with Deep Neural Networks

no code implementations ICLR 2020 Mingrui Liu, Zhuoning Yuan, Yiming Ying, Tianbao Yang

In this paper, we consider stochastic AUC maximization problem with a deep neural network as the predictive model.

Stochastic Optimization for Non-convex Inf-Projection Problems

no code implementations ICML 2020 Yan Yan, Yi Xu, Lijun Zhang, Xiaoyu Wang, Tianbao Yang

In this paper, we study a family of non-convex and possibly non-smooth inf-projection minimization problems, where the target objective function equals the minimum of a joint function over another variable.

Stochastic Optimization

A Data Efficient and Feasible Level Set Method for Stochastic Convex Optimization with Expectation Constraints

no code implementations 7 Aug 2019 Qihang Lin, Selvaprabu Nadarajah, Negar Soheili, Tianbao Yang

We design a stochastic feasible level set method (SFLS) for SOECs that has low data complexity and emphasizes feasibility before convergence.

Stochastic Primal-Dual Algorithms with Faster Convergence than $O(1/\sqrt{T})$ for Problems without Bilinear Structure

no code implementations 23 Apr 2019 Yan Yan, Yi Xu, Qihang Lin, Lijun Zhang, Tianbao Yang

The main contribution of this paper is the design and analysis of new stochastic primal-dual algorithms that use a mixture of stochastic gradient updates and a logarithmic number of deterministic dual updates for solving a family of convex-concave problems with no bilinear structure assumed.

Stagewise Training Accelerates Convergence of Testing Error Over SGD

no code implementations NeurIPS 2019 Zhuoning Yuan, Yan Yan, Rong Jin, Tianbao Yang

For convex loss functions and two classes of "nice-behaviored" non-convex objectives that are close to a convex function, we establish faster convergence of stagewise training than the vanilla SGD under the PL condition on both training error and testing error.

Adaptive Negative Curvature Descent with Applications in Non-convex Optimization

no code implementations NeurIPS 2018 Mingrui Liu, Zhe Li, Xiaoyu Wang, Jin-Feng Yi, Tianbao Yang

Negative curvature descent (NCD) method has been utilized to design deterministic or stochastic algorithms for non-convex optimization aiming at finding second-order stationary points or local minima.

Faster Online Learning of Optimal Threshold for Consistent F-measure Optimization

no code implementations NeurIPS 2018 Xiaoxuan Zhang, Mingrui Liu, Xun Zhou, Tianbao Yang

To advance OFO, we propose an efficient online algorithm based on simultaneously learning a posterior probability of class and learning an optimal threshold by minimizing a stochastic strongly convex function with unknown strong convexity parameter.
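For reference, the F-measure being optimized online here is the harmonic mean of precision and recall. A minimal computation from confusion-matrix counts (function name illustrative):

```python
def f_measure(tp, fp, fn):
    """F1 score from counts (sketch): harmonic mean of precision and
    recall, with zero-division handled explicitly."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0
```

Because F-measure depends non-linearly on the threshold through these counts, an optimal threshold must be learned rather than fixed, which is the quantity the online algorithm above tracks.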

online learning

Stochastic Optimization for DC Functions and Non-smooth Non-convex Regularizers with Non-asymptotic Convergence

no code implementations 28 Nov 2018 Yi Xu, Qi Qi, Qihang Lin, Rong Jin, Tianbao Yang

In this paper, we propose new stochastic optimization algorithms and study their first-order convergence theories for solving a broad family of DC functions.

Stochastic Optimization

First-order Convergence Theory for Weakly-Convex-Weakly-Concave Min-max Problems

no code implementations 24 Oct 2018 Mingrui Liu, Hassan Rafique, Qihang Lin, Tianbao Yang

In this paper, we consider first-order convergence theory and algorithms for solving a class of non-convex non-concave min-max saddle-point problems, whose objective function is weakly convex in the variables of minimization and weakly concave in the variables of maximization.

Weakly-Convex Concave Min-Max Optimization: Provable Algorithms and Applications in Machine Learning

no code implementations 4 Oct 2018 Hassan Rafique, Mingrui Liu, Qihang Lin, Tianbao Yang

Min-max problems have broad applications in machine learning, including learning with non-decomposable loss and learning with robustness to data distribution.

Learning Discriminators as Energy Networks in Adversarial Learning

no code implementations ICLR 2019 Pingbo Pan, Yan Yan, Tianbao Yang, Yi Yang

In this work, we propose to refine the predictions of structured prediction models by effectively integrating discriminative models into the prediction.

Multi-Label Classification Semantic Segmentation +1

A Unified Analysis of Stochastic Momentum Methods for Deep Learning

no code implementations30 Aug 2018 Yan Yan, Tianbao Yang, Zhe Li, Qihang Lin, Yi Yang

However, the theoretical analysis of the convergence of the training objective and of the generalization error for prediction remains under-explored.

Universal Stagewise Learning for Non-Convex Problems with Convergence on Averaged Solutions

no code implementations ICLR 2019 Zaiyi Chen, Zhuoning Yuan, Jin-Feng Yi, Bo-Wen Zhou, Enhong Chen, Tianbao Yang

For example, there is still a lack of convergence theory for SGD and its variants that use a stagewise step size and return an averaged solution in practice.

Improving Sequential Determinantal Point Processes for Supervised Video Summarization

no code implementations ECCV 2018 Aidean Sharghi, Ali Borji, Chengtao Li, Tianbao Yang, Boqing Gong

In terms of modeling, we design a new probabilistic distribution such that, when it is integrated into SeqDPP, the resulting model accepts user input about the expected length of the summary.

Point Processes Supervised Video Summarization

Fast Stochastic AUC Maximization with $O(1/n)$-Convergence Rate

no code implementations ICML 2018 Mingrui Liu, Xiaoxuan Zhang, Zaiyi Chen, Xiaoyu Wang, Tianbao Yang

In this paper, we consider statistical learning with AUC (area under ROC curve) maximization in the classical stochastic setting where one random data drawn from an unknown distribution is revealed at each iteration for updating the model.

Level-Set Methods for Finite-Sum Constrained Convex Optimization

no code implementations ICML 2018 Qihang Lin, Runchao Ma, Tianbao Yang

To update the level parameter towards the optimality, both methods require an oracle that generates upper and lower bounds as well as an affine-minorant of the level function.

SADAGRAD: Strongly Adaptive Stochastic Gradient Methods

no code implementations ICML 2018 Zaiyi Chen, Yi Xu, Enhong Chen, Tianbao Yang

Although the convergence rates of existing variants of ADAGRAD have a better dependence on the number of iterations under the strong convexity condition, their iteration complexities have an explicitly linear dependence on the dimensionality of the problem.

EIGEN: Ecologically-Inspired GENetic Approach for Neural Network Structure Searching from Scratch

no code implementations CVPR 2019 Jian Ren, Zhe Li, Jianchao Yang, Ning Xu, Tianbao Yang, David J. Foran

In this paper, we propose an Ecologically-Inspired GENetic (EIGEN) approach that uses the concepts of succession, extinction, mimicry, and gene duplication to search neural network structures from scratch with a poorly initialized simple network and few constraints imposed during the evolution, as we assume no prior knowledge about the task domain.

An Aggressive Genetic Programming Approach for Searching Neural Network Structure Under Computational Constraints

no code implementations 3 Jun 2018 Zhe Li, Xuehan Xiong, Zhou Ren, Ning Zhang, Xiaoyu Wang, Tianbao Yang

In this paper, we study how to design a genetic programming approach for optimizing the structure of a CNN for a given task under limited computational resources yet without imposing strong restrictions on the search space.

Learning with Non-Convex Truncated Losses by SGD

no code implementations 21 May 2018 Yi Xu, Shenghuo Zhu, Sen Yang, Chi Zhang, Rong Jin, Tianbao Yang

Learning with a convex loss function has been a dominating paradigm for many years.

Fast Rates of ERM and Stochastic Approximation: Adaptive to Error Bound Conditions

no code implementations NeurIPS 2018 Mingrui Liu, Xiaoxuan Zhang, Lijun Zhang, Rong Jin, Tianbao Yang

Error bound conditions (EBC) are properties that characterize the growth of an objective function when a point is moved away from the optimal set.

NEON+: Accelerated Gradient Methods for Extracting Negative Curvature for Non-Convex Optimization

no code implementations 4 Dec 2017 Yi Xu, Rong Jin, Tianbao Yang

Accelerated gradient (AG) methods are breakthroughs in convex optimization, improving the convergence rate of the gradient descent method for optimization with smooth functions.

Adaptive SVRG Methods under Error Bound Conditions with Unknown Growth Parameter

no code implementations NeurIPS 2017 Yi Xu, Qihang Lin, Tianbao Yang

The most studied error bound is the quadratic error bound, which generalizes strong convexity and is satisfied by a large family of machine learning problems.

Stochastic Optimization

Adaptive Accelerated Gradient Converging Method under Hölderian Error Bound Condition

no code implementations NeurIPS 2017 Mingrui Liu, Tianbao Yang

Recent studies have shown that proximal gradient (PG) method and accelerated gradient method (APG) with restarting can enjoy a linear convergence under a weaker condition than strong convexity, namely a quadratic growth condition (QGC).

ADMM without a Fixed Penalty Parameter: Faster Convergence with New Adaptive Penalization

no code implementations NeurIPS 2017 Yi Xu, Mingrui Liu, Qihang Lin, Tianbao Yang

The novelty of the proposed scheme lies in its adaptivity to a local sharpness property of the objective function, which marks the key difference from previous adaptive schemes that adjust the penalty parameter per iteration based on certain conditions on the iterates.

Stochastic Optimization

First-order Stochastic Algorithms for Escaping From Saddle Points in Almost Linear Time

no code implementations NeurIPS 2018 Yi Xu, Rong Jin, Tianbao Yang

Two classes of methods have been proposed for escaping from saddle points with one using the second-order information carried by the Hessian and the other adding the noise into the first-order information.

Stochastic Non-convex Optimization with Strong High Probability Second-order Convergence

no code implementations 25 Oct 2017 Mingrui Liu, Tianbao Yang

In this paper, we study stochastic non-convex optimization with non-convex random functions.

On Noisy Negative Curvature Descent: Competing with Gradient Descent for Faster Non-convex Optimization

no code implementations 25 Sep 2017 Mingrui Liu, Tianbao Yang

To the best of our knowledge, the proposed stochastic algorithm is the first to converge to a second-order stationary point with high probability, with a time complexity independent of the sample size and almost linear in the dimensionality.

A Simple Analysis for Exp-concave Empirical Minimization with Arbitrary Convex Regularizer

no code implementations 9 Sep 2017 Tianbao Yang, Zhe Li, Lijun Zhang

In this paper, we present a simple analysis of fast rates with high probability of empirical minimization for stochastic composite optimization over a finite-dimensional bounded convex set with exponentially concave loss functions and an arbitrary convex regularizer.

Stochastic Convex Optimization: Faster Local Growth Implies Faster Global Convergence

no code implementations ICML 2017 Yi Xu, Qihang Lin, Tianbao Yang

In this paper, a new theory is developed for first-order stochastic convex optimization, showing that the global convergence rate is sufficiently quantified by a local growth rate of the objective function in a neighborhood of the optimal solutions.

Stochastic Optimization

SEP-Nets: Small and Effective Pattern Networks

no code implementations 13 Jun 2017 Zhe Li, Xiaoyu Wang, Xutao Lv, Tianbao Yang

By doing this, we show that previous deep CNNs such as GoogLeNet and Inception-type Nets can be compressed dramatically with marginal drop in performance.

Binarization Quantization

Hybrid safe-strong rules for efficient optimization in lasso-type problems

1 code implementation 27 Apr 2017 Yaohui Zeng, Tianbao Yang, Patrick Breheny

However, with the ultrahigh-dimensional, large-scale data sets now collected in many real-world applications, it is important to develop algorithms to solve the lasso that efficiently scale up to problems of this size.

Model Selection

Empirical Risk Minimization for Stochastic Convex Optimization: $O(1/n)$- and $O(1/n^2)$-type of Risk Bounds

no code implementations 7 Feb 2017 Lijun Zhang, Tianbao Yang, Rong Jin

First, we establish an $\widetilde{O}(d/n + \sqrt{F_*/n})$ risk bound when the random function is nonnegative, convex and smooth, and the expected function is Lipschitz continuous, where $d$ is the dimensionality of the problem, $n$ is the number of samples, and $F_*$ is the minimal risk.

Image Classification

Dynamic Regret of Strongly Adaptive Methods

no code implementations ICML 2018 Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou

To cope with changing environments, recent developments in online learning have introduced the concepts of adaptive regret and dynamic regret independently.

online learning

Efficient Non-oblivious Randomized Reduction for Risk Minimization with Improved Excess Risk Guarantee

no code implementations 6 Dec 2016 Yi Xu, Haiqin Yang, Lijun Zhang, Tianbao Yang

Previously, oblivious random projection based approaches that project high dimensional features onto a random subspace have been used in practice for tackling high-dimensionality challenge in machine learning.

Homotopy Smoothing for Non-Smooth Problems with Lower Complexity than $O(1/\epsilon)$

no code implementations NeurIPS 2016 Yi Xu, Yan Yan, Qihang Lin, Tianbao Yang

To the best of our knowledge, this is the lowest iteration complexity achieved so far for the considered non-smooth optimization problems without strong convexity assumption.

Adaptive Accelerated Gradient Converging Methods under Holderian Error Bound Condition

no code implementations 23 Nov 2016 Mingrui Liu, Tianbao Yang

Recent studies have shown that proximal gradient (PG) method and accelerated gradient method (APG) with restarting can enjoy a linear convergence under a weaker condition than strong convexity, namely a quadratic growth condition (QGC).

Improved Dynamic Regret for Non-degenerate Functions

no code implementations NeurIPS 2017 Lijun Zhang, Tianbao Yang, Jin-Feng Yi, Rong Jin, Zhi-Hua Zhou

When multiple gradients are accessible to the learner, we first demonstrate that the dynamic regret of strongly convex functions can be upper bounded by the minimum of the path-length and the squared path-length.

A Richer Theory of Convex Constrained Optimization with Reduced Projections and Improved Rates

no code implementations ICML 2017 Tianbao Yang, Qihang Lin, Lijun Zhang

In this paper, we develop projection-reduced optimization algorithms for both smooth and non-smooth optimization with improved convergence rates under a certain regularity condition of the constraint function.

Metric Learning

Homotopy Smoothing for Non-Smooth Problems with Lower Complexity than $O(1/ε)$

no code implementations NeurIPS 2016 Yi Xu, Yan Yan, Qihang Lin, Tianbao Yang

In this work, we will show that the proposed HOPS achieved a lower iteration complexity of $\widetilde O(1/\epsilon^{1-\theta})$, where $\widetilde O(\cdot)$ suppresses a logarithmic factor.

Accelerate Stochastic Subgradient Method by Leveraging Local Growth Condition

no code implementations4 Jul 2016 Yi Xu, Qihang Lin, Tianbao Yang

In particular, if the objective function $F(\mathbf w)$ in the $\epsilon$-sublevel set grows as fast as $\|\mathbf w - \mathbf w_*\|_2^{1/\theta}$, where $\mathbf w_*$ represents the closest optimal solution to $\mathbf w$ and $\theta\in(0, 1]$ quantifies the local growth rate, the iteration complexity of first-order stochastic optimization for achieving an $\epsilon$-optimal solution can be $\widetilde O(1/\epsilon^{2(1-\theta)})$, which is optimal up to at most a logarithmic factor.

Stochastic Optimization

Tracking Slowly Moving Clairvoyant: Optimal Dynamic Regret of Online Learning with True and Noisy Gradient

no code implementations16 May 2016 Tianbao Yang, Lijun Zhang, Rong Jin, Jin-Feng Yi

Secondly, we present a lower bound with noisy gradient feedback and then show that we can achieve optimal dynamic regrets under a stochastic gradient feedback and two-point bandit feedback.

online learning

Learning Attributes Equals Multi-Source Domain Generalization

no code implementations CVPR 2016 Chuang Gan, Tianbao Yang, Boqing Gong

Attributes possess appealing properties and benefit many computer vision problems, such as object recognition, learning with humans in the loop, and image retrieval.

Domain Generalization Image Retrieval +1

Unified Convergence Analysis of Stochastic Momentum Methods for Convex and Non-convex Optimization

no code implementations12 Apr 2016 Tianbao Yang, Qihang Lin, Zhe Li

This paper fills the gap between practice and theory by developing a basic convergence analysis of two stochastic momentum methods, namely stochastic heavy-ball method and the stochastic variant of Nesterov's accelerated gradient method.
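The stochastic heavy-ball update analyzed above adds a momentum term to the plain stochastic gradient step. A minimal sketch on a toy quadratic; the step size, momentum value, and noise scale are illustrative choices, not the paper's:

```python
import numpy as np

def stochastic_heavy_ball(grad, w0, lr=0.05, beta=0.9, steps=500, seed=0):
    """Heavy-ball update: w_{t+1} = w_t - lr * g_t + beta * (w_t - w_{t-1}),
    where g_t is a noisy gradient estimate."""
    rng = np.random.default_rng(seed)
    w, w_prev = w0.copy(), w0.copy()
    for _ in range(steps):
        g = grad(w) + 0.01 * rng.standard_normal(w.shape)  # noisy gradient
        w, w_prev = w - lr * g + beta * (w - w_prev), w
    return w

# minimize f(w) = 0.5 * ||w||^2, whose gradient is w; the minimizer is 0
w = stochastic_heavy_ball(lambda w: w, np.ones(3))
```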

Improved Dropout for Shallow and Deep Learning

no code implementations NeurIPS 2016 Zhe Li, Boqing Gong, Tianbao Yang

To exhibit the optimal dropout probabilities, we analyze the shallow learning with multinomial dropout and establish the risk bound for stochastic optimization.

Stochastic Optimization
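The key ingredient above is dropout with non-uniform, feature-wise keep probabilities, rescaled so the output is unbiased. This is a simplified sketch of that idea (the paper's multinomial sampling scheme and the choice of optimal probabilities are not reproduced here); the probabilities below are illustrative:

```python
import numpy as np

def nonuniform_dropout(x, keep_probs, rng):
    """Drop feature i with probability 1 - keep_probs[i]; rescale kept
    features by 1 / keep_probs[i] so that E[output] = x (unbiased)."""
    mask = rng.random(x.shape) < keep_probs
    return np.where(mask, x / keep_probs, 0.0)

rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 3.0])
p = np.array([0.9, 0.5, 0.8])   # illustrative non-uniform keep probabilities
# averaging many dropout samples should recover x (unbiasedness)
mean_out = np.mean([nonuniform_dropout(x, p, rng) for _ in range(20000)], axis=0)
```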

RSG: Beating Subgradient Method without Smoothness and Strong Convexity

no code implementations9 Dec 2015 Tianbao Yang, Qihang Lin

We show that, when applied to a broad class of convex optimization problems, the RSG method can find an $\epsilon$-optimal solution with lower complexity than the SG method.
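The restarting scheme behind RSG can be sketched in a few lines: run plain subgradient descent for a fixed number of steps, restart from the averaged iterate, and geometrically shrink the step size. A toy illustration on a non-smooth objective; stage counts and step sizes are arbitrary choices, not the paper's tuned values:

```python
import numpy as np

def rsg(subgrad, w0, t_per_stage=200, stages=5, eta0=1.0):
    """Restarted SubGradient sketch: each stage runs subgradient descent,
    averages its iterates, halves the step size, and restarts."""
    w, eta = w0.copy(), eta0
    for _ in range(stages):
        u, avg = w.copy(), np.zeros_like(w)
        for _ in range(t_per_stage):
            u = u - eta * subgrad(u)
            avg += u
        w = avg / t_per_stage   # restart from the stage average
        eta /= 2.0              # geometrically decrease the step size
    return w

# f(w) = ||w||_1 is non-smooth and non-strongly convex; a subgradient is sign(w)
w = rsg(np.sign, np.full(4, 5.0))
```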

Sparse Learning for Large-scale and High-dimensional Data: A Randomized Convex-concave Optimization Approach

no code implementations12 Nov 2015 Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou

In this paper, we develop a randomized algorithm and theory for learning a sparse model from large-scale and high-dimensional data, which is usually formulated as an empirical risk minimization problem with a sparsity-inducing regularizer.

Sparse Learning

Stochastic Proximal Gradient Descent for Nuclear Norm Regularization

no code implementations5 Nov 2015 Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou

In this paper, we utilize stochastic optimization to reduce the space complexity of convex composite optimization with a nuclear norm regularizer, where the variable is a matrix of size $m \times n$.

Stochastic Optimization
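Proximal-gradient-type methods for nuclear norm regularization alternate a (stochastic) gradient step on the loss with the proximal operator of the nuclear norm, which soft-thresholds the singular values. A sketch of that prox step only (the stochastic gradient and space-saving machinery of the paper are omitted):

```python
import numpy as np

def prox_nuclear(X, tau):
    """Proximal operator of tau * ||X||_*: shrink each singular value
    of X by tau and truncate at zero (singular value soft-thresholding)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))
Y = prox_nuclear(X, tau=1.0)
```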

Stochastic subGradient Methods with Linear Convergence for Polyhedral Convex Optimization

no code implementations6 Oct 2015 Tianbao Yang, Qihang Lin

In this paper, we show that simple Stochastic subGradient Descent methods with multiple Restarting, named RSGD, can achieve a linear convergence rate for a class of non-smooth and non-strongly convex optimization problems where the epigraph of the objective function is a polyhedron, to which we refer as polyhedral convex optimization.

Online Stochastic Linear Optimization under One-bit Feedback

no code implementations25 Sep 2015 Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou

In this paper, we study a special bandit setting of online stochastic linear optimization, where only one-bit of information is revealed to the learner at each round.

online learning

Doubly Stochastic Primal-Dual Coordinate Method for Bilinear Saddle-Point Problem

no code implementations14 Aug 2015 Adams Wei Yu, Qihang Lin, Tianbao Yang

We propose a doubly stochastic primal-dual coordinate optimization algorithm for empirical risk minimization, which can be formulated as a bilinear saddle-point problem.

Distributed Stochastic Variance Reduced Gradient Methods and A Lower Bound for Communication Complexity

no code implementations27 Jul 2015 Jason D. Lee, Qihang Lin, Tengyu Ma, Tianbao Yang

We also prove a lower bound for the number of rounds of communication for a broad class of distributed first-order methods including the proposed algorithms in this paper.

Distributed Optimization

Fast Sparse Least-Squares Regression with Non-Asymptotic Guarantees

no code implementations18 Jul 2015 Tianbao Yang, Lijun Zhang, Qihang Lin, Rong Jin

In this paper, we study a fast approximation method for the large-scale high-dimensional sparse least-squares regression problem by exploiting the Johnson-Lindenstrauss (JL) transforms, which embed a set of high-dimensional vectors into a low-dimensional space.
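The basic sketch-and-solve use of a JL transform can be illustrated in a few lines: project the tall data matrix down with a random Gaussian map and solve the much smaller least-squares problem. This is only the plain sketching idea, not the paper's method (which adds a recovery step with non-asymptotic guarantees); all dimensions below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 2000, 5, 200            # samples, features, sketch size
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.01 * rng.standard_normal(n)

# Gaussian JL transform: compress the n rows into m sketched rows,
# then solve the m x d least-squares problem instead of the n x d one.
S = rng.standard_normal((m, n)) / np.sqrt(m)
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
```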

Hyper-Class Augmented and Regularized Deep Learning for Fine-Grained Image Classification

no code implementations CVPR 2015 Saining Xie, Tianbao Yang, Xiaoyu Wang, Yuanqing Lin

We demonstrate the success of the proposed framework on two small-scale fine-grained datasets (Stanford Dogs and Stanford Cars) and on a large-scale car dataset that we collected.

Fine-Grained Image Classification General Classification +3

Analysis of Nuclear Norm Regularization for Full-rank Matrix Completion

no code implementations26 Apr 2015 Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou

To the best of our knowledge, this is the first time such a relative bound is proved for the regularized formulation of matrix completion.

Low-Rank Matrix Completion

Theory of Dual-sparse Regularized Randomized Reduction

no code implementations15 Apr 2015 Tianbao Yang, Lijun Zhang, Rong Jin, Shenghuo Zhu

In this paper, we study randomized reduction methods, which reduce high-dimensional features into low-dimensional space by randomized methods (e. g., random projection, random hashing), for large-scale high-dimensional classification.

General Classification

Object-centric Sampling for Fine-grained Image Classification

no code implementations10 Dec 2014 Xiaoyu Wang, Tianbao Yang, Guobin Chen, Yuanqing Lin

In contrast, this paper proposes an object-centric sampling (OCS) scheme that samples image windows based on the object location information.

Classification Fine-Grained Image Classification +2

Extracting Certainty from Uncertainty: Transductive Pairwise Classification from Pairwise Similarities

no code implementations NeurIPS 2014 Tianbao Yang, Rong Jin

In this work, we study the problem of transductive pairwise classification from pairwise similarities (the pairwise similarities are usually derived from some side information instead of the underlying class labels).

General Classification

On Data Preconditioning for Regularized Loss Minimization

no code implementations13 Aug 2014 Tianbao Yang, Rong Jin, Shenghuo Zhu, Qihang Lin

In this work, we study data preconditioning, a well-known and long-existing technique, for boosting the convergence of first-order methods for regularized loss minimization.

Analysis of Distributed Stochastic Dual Coordinate Ascent

no code implementations4 Dec 2013 Tianbao Yang, Shenghuo Zhu, Rong Jin, Yuanqing Lin

Extraordinary performance has been observed and reported for the well-motivated updates, referred to as the practical updates, compared to the naive updates.

Trading Computation for Communication: Distributed Stochastic Dual Coordinate Ascent

no code implementations NeurIPS 2013 Tianbao Yang

We make a progress along the line by presenting a distributed stochastic dual coordinate ascent algorithm in a star network, with an analysis of the tradeoff between computation and communication.

Distributed Optimization

Stochastic Convex Optimization with Multiple Objectives

no code implementations NeurIPS 2013 Mehrdad Mahdavi, Tianbao Yang, Rong Jin

It leverages the theory of the Lagrangian method in constrained optimization and attains the optimal convergence rate of $O(1/\sqrt{T})$ in high probability for general Lipschitz continuous objectives.

Stochastic Optimization

Optimal Stochastic Strongly Convex Optimization with a Logarithmic Number of Projections

no code implementations19 Apr 2013 Jianhui Chen, Tianbao Yang, Qihang Lin, Lijun Zhang, Yi Chang

We consider stochastic strongly convex optimization with a complex inequality constraint.

O(logT) Projections for Stochastic Optimization of Smooth and Strongly Convex Functions

no code implementations2 Apr 2013 Lijun Zhang, Tianbao Yang, Rong Jin, Xiaofei He

Traditional algorithms for stochastic optimization require projecting the solution at each iteration into a given domain to ensure its feasibility.

Stochastic Optimization

Semi-Crowdsourced Clustering: Generalizing Crowd Labeling by Robust Distance Metric Learning

no code implementations NeurIPS 2012 Jinfeng Yi, Rong Jin, Shaili Jain, Tianbao Yang, Anil K. Jain

One difficulty in learning the pairwise similarity measure is that there is a significant amount of noise and inter-worker variations in the manual annotations obtained via crowdsourcing.

Matrix Completion Metric Learning

Stochastic Gradient Descent with Only One Projection

no code implementations NeurIPS 2012 Mehrdad Mahdavi, Tianbao Yang, Rong Jin, Shenghuo Zhu, Jin-Feng Yi

Although many variants of stochastic gradient descent have been proposed for large-scale convex optimization, most of them require projecting the solution at each iteration to ensure that the obtained solution stays within the feasible domain.
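The one-projection idea can be sketched simply: run SGD without projecting, average the iterates, and project only the final average. This is a simplified illustration on a ball constraint (the paper handles general constraints via a penalty formulation, which is omitted here); all parameter values are illustrative:

```python
import numpy as np

def project_ball(w, radius=1.0):
    """Euclidean projection onto the ball {w : ||w|| <= radius}."""
    nrm = np.linalg.norm(w)
    return w if nrm <= radius else w * (radius / nrm)

def sgd_one_projection(grad, w0, lr=0.05, steps=400, seed=0):
    """Run plain SGD without any projection, average the iterates,
    and project only once at the very end."""
    rng = np.random.default_rng(seed)
    w, avg = w0.copy(), np.zeros_like(w0)
    for _ in range(steps):
        g = grad(w) + 0.01 * rng.standard_normal(w.shape)
        w = w - lr * g
        avg += w
    return project_ball(avg / steps)

# minimize f(w) = ||w - c||^2 over the unit ball, with c outside the ball;
# the constrained optimum is c / ||c|| = (1, 0)
c = np.array([2.0, 0.0])
w = sgd_one_projection(lambda w: 2 * (w - c), np.zeros(2))
```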

Nyström Method vs Random Fourier Features: A Theoretical and Empirical Comparison

no code implementations NeurIPS 2012 Tianbao Yang, Yu-Feng Li, Mehrdad Mahdavi, Rong Jin, Zhi-Hua Zhou

Both random Fourier features and the Nyström method have been successfully applied to efficient kernel learning.
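One side of the comparison, random Fourier features, is easy to sketch: draw random frequencies from the Gaussian measure matching the RBF kernel, so that inner products of the features approximate kernel values. A minimal illustration (the feature count and bandwidth are illustrative; the Nyström side is omitted):

```python
import numpy as np

def rff_features(X, n_features, gamma=1.0, seed=0):
    """Random Fourier features approximating the RBF kernel
    k(x, y) = exp(-gamma * ||x - y||^2); frequencies are drawn
    from N(0, 2 * gamma * I), the kernel's spectral measure."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.standard_normal((d, n_features)) * np.sqrt(2 * gamma)
    b = rng.uniform(0, 2 * np.pi, n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 3))
Z = rff_features(X, n_features=4000)
K_approx = Z @ Z.T
# exact RBF kernel matrix for comparison
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_exact = np.exp(-1.0 * sq_dists)
max_err = np.abs(K_approx - K_exact).max()
```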

Online Stochastic Optimization with Multiple Objectives

no code implementations26 Nov 2012 Mehrdad Mahdavi, Tianbao Yang, Rong Jin

We first propose a projection based algorithm which attains an $O(T^{-1/3})$ convergence rate.

Stochastic Optimization

An Efficient Primal-Dual Prox Method for Non-Smooth Optimization

no code implementations24 Jan 2012 Tianbao Yang, Mehrdad Mahdavi, Rong Jin, Shenghuo Zhu

We study the non-smooth optimization problems in machine learning, where both the loss function and the regularizer are non-smooth functions.
