no code implementations • 24 Jan 2012 • Tianbao Yang, Mehrdad Mahdavi, Rong Jin, Shenghuo Zhu
We study the non-smooth optimization problems in machine learning, where both the loss function and the regularizer are non-smooth functions.
no code implementations • 27 Jun 2012 • Ming Ji, Tianbao Yang, Binbin Lin, Rong Jin, Jiawei Han
In this work, we develop a simple algorithm for semi-supervised regression.
no code implementations • 13 Nov 2012 • Lijun Zhang, Mehrdad Mahdavi, Rong Jin, Tianbao Yang, Shenghuo Zhu
Random projection has been widely used in data classification.
no code implementations • 26 Nov 2012 • Mehrdad Mahdavi, Tianbao Yang, Rong Jin
We first propose a projection-based algorithm which attains an $O(T^{-1/3})$ convergence rate.
no code implementations • NeurIPS 2012 • Jinfeng Yi, Rong Jin, Shaili Jain, Tianbao Yang, Anil K. Jain
One difficulty in learning the pairwise similarity measure is that there is a significant amount of noise and inter-worker variations in the manual annotations obtained via crowdsourcing.
no code implementations • NeurIPS 2012 • Tianbao Yang, Yu-Feng Li, Mehrdad Mahdavi, Rong Jin, Zhi-Hua Zhou
Both random Fourier features and the Nyström method have been successfully applied to efficient kernel learning.
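For readers unfamiliar with the two approximations being compared, the following is a minimal NumPy sketch of random Fourier features and the Nyström method for an RBF kernel $k(x,y)=\exp(-\|x-y\|^2/(2\sigma^2))$; it is illustrative only and not tied to the paper's analysis.

```python
import numpy as np

def random_fourier_features(X, D=500, sigma=1.0, rng=np.random.default_rng(0)):
    """Map X (n x d) to D-dimensional random Fourier features."""
    n, d = X.shape
    W = rng.normal(scale=1.0 / sigma, size=(d, D))   # frequencies ~ N(0, sigma^-2 I)
    b = rng.uniform(0, 2 * np.pi, size=D)            # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)      # phi(x)^T phi(y) ~= k(x, y)

def nystrom_features(X, m=200, sigma=1.0, rng=np.random.default_rng(0)):
    """Nystrom approximation using m randomly sampled landmark points."""
    idx = rng.choice(X.shape[0], size=m, replace=False)
    L = X[idx]                                        # landmark points
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    K_mm, K_nm = rbf(L, L), rbf(X, L)
    U, s, _ = np.linalg.svd(K_mm)
    K_mm_inv_sqrt = U @ np.diag(1.0 / np.sqrt(np.maximum(s, 1e-12))) @ U.T
    return K_nm @ K_mm_inv_sqrt                       # phi(x)^T phi(y) ~= k(x, y)
```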
no code implementations • NeurIPS 2012 • Mehrdad Mahdavi, Tianbao Yang, Rong Jin, Shenghuo Zhu, Jin-Feng Yi
Although many variants of stochastic gradient descent have been proposed for large-scale convex optimization, most of them require projecting the solution at {\it each} iteration to ensure that the obtained solution stays within the feasible domain.
no code implementations • 2 Apr 2013 • Lijun Zhang, Tianbao Yang, Rong Jin, Xiaofei He
Traditional algorithms for stochastic optimization require projecting the solution at each iteration into a given domain to ensure its feasibility.
no code implementations • 19 Apr 2013 • Jianhui Chen, Tianbao Yang, Qihang Lin, Lijun Zhang, Yi Chang
We consider stochastic strongly convex optimization with a complex inequality constraint.
no code implementations • NeurIPS 2013 • Mehrdad Mahdavi, Tianbao Yang, Rong Jin
It leverages the theory of the Lagrangian method in constrained optimization and attains the optimal convergence rate of $O(1/\sqrt{T})$ with high probability for general Lipschitz continuous objectives.
no code implementations • NeurIPS 2013 • Tianbao Yang
We make progress along this line by presenting a distributed stochastic dual coordinate ascent algorithm in a star network, with an analysis of the tradeoff between computation and communication.
no code implementations • 4 Dec 2013 • Tianbao Yang, Shenghuo Zhu, Rong Jin, Yuanqing Lin
Extraordinary performance has been observed and reported for the well-motivated updates, referred to as the practical updates, compared to the naive updates.
no code implementations • 13 Aug 2014 • Tianbao Yang, Rong Jin, Shenghuo Zhu, Qihang Lin
In this work, we study data preconditioning, a well-known and long-existing technique, for boosting the convergence of first-order methods for regularized loss minimization.
no code implementations • NeurIPS 2014 • Tianbao Yang, Rong Jin
In this work, we study the problem of transductive pairwise classification from pairwise similarities~\footnote{The pairwise similarities are usually derived from some side information instead of the underlying class labels.}.
no code implementations • 10 Dec 2014 • Xiaoyu Wang, Tianbao Yang, Guobin Chen, Yuanqing Lin
In contrast, this paper proposes an \emph{object-centric sampling} (OCS) scheme that samples image windows based on the object location information.
no code implementations • 15 Apr 2015 • Tianbao Yang, Lijun Zhang, Rong Jin, Shenghuo Zhu
In this paper, we study randomized reduction methods, which reduce high-dimensional features into low-dimensional space by randomized methods (e.g., random projection, random hashing), for large-scale high-dimensional classification.
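As a toy illustration of the randomized-reduction idea (not the paper's recovery procedure), one can project features with a Gaussian random matrix and fit a linear model in the reduced space; the classifier and data names below are placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def random_project(X, k=256, seed=0):
    """Project n x d features onto a k-dimensional random subspace."""
    rng = np.random.default_rng(seed)
    R = rng.normal(size=(X.shape[1], k)) / np.sqrt(k)   # Gaussian random projection
    return X @ R, R

# Hypothetical usage:
#   X_low, R = random_project(X_train)
#   clf = LogisticRegression(max_iter=1000).fit(X_low, y_train)
#   preds = clf.predict(X_test @ R)
```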
no code implementations • 26 Apr 2015 • Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou
To the best of our knowledge, this is the first time such a relative bound has been proved for the regularized formulation of matrix completion.
no code implementations • 4 May 2015 • Tianbao Yang, Lijun Zhang, Rong Jin, Shenghuo Zhu
In this paper, we consider the problem of column subset selection.
no code implementations • CVPR 2015 • Saining Xie, Tianbao Yang, Xiaoyu Wang, Yuanqing Lin
We demonstrate the success of the proposed framework on two small-scale fine-grained datasets (Stanford Dogs and Stanford Cars) and on a large-scale car dataset that we collected.
no code implementations • 18 Jul 2015 • Tianbao Yang, Lijun Zhang, Qihang Lin, Rong Jin
In this paper, we study a fast approximation method for {\it large-scale high-dimensional} sparse least-squares regression problem by exploiting the Johnson-Lindenstrauss (JL) transforms, which embed a set of high-dimensional vectors into a low-dimensional space.
no code implementations • 27 Jul 2015 • Jason D. Lee, Qihang Lin, Tengyu Ma, Tianbao Yang
We also prove a lower bound for the number of rounds of communication for a broad class of distributed first-order methods including the proposed algorithms in this paper.
no code implementations • 14 Aug 2015 • Adams Wei Yu, Qihang Lin, Tianbao Yang
We propose a doubly stochastic primal-dual coordinate optimization algorithm for empirical risk minimization, which can be formulated as a bilinear saddle-point problem.
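For context, the bilinear saddle-point form referred to here is the standard conjugate reformulation of regularized ERM, with $\phi_i^*$ the convex conjugate of the $i$-th loss $\phi_i$ and $\mathbf a_i$ the $i$-th data vector:

$$\min_{\mathbf w}\ \frac{1}{n}\sum_{i=1}^n \phi_i(\mathbf a_i^\top \mathbf w) + r(\mathbf w)\;=\;\min_{\mathbf w}\max_{\boldsymbol\alpha}\ \frac{1}{n}\sum_{i=1}^n \big(\alpha_i\,\mathbf a_i^\top \mathbf w - \phi_i^*(\alpha_i)\big) + r(\mathbf w).$$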
no code implementations • 25 Sep 2015 • Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou
In this paper, we study a special bandit setting of online stochastic linear optimization, where only one bit of information is revealed to the learner at each round.
no code implementations • 6 Oct 2015 • Tianbao Yang, Qihang Lin
In this paper, we show that simple {Stochastic} subGradient Descent methods with multiple Restarting, named {\bf RSGD}, can achieve a \textit{linear convergence rate} for a class of non-smooth and non-strongly convex optimization problems where the epigraph of the objective function is a polyhedron, which we refer to as {\bf polyhedral convex optimization}.
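A minimal sketch of the restarting scheme (assuming a step size halved across stages and a restart from the averaged iterate; the paper's exact constants and stage lengths are not reproduced here):

```python
import numpy as np

def rsgd(subgrad, w0, eta0=1.0, t_per_stage=1000, n_stages=10):
    """subgrad(w) returns a stochastic subgradient of the objective at w."""
    w_bar, eta = np.array(w0, dtype=float), eta0
    for _ in range(n_stages):
        w, w_sum = w_bar.copy(), np.zeros_like(w_bar)
        for _ in range(t_per_stage):
            w -= eta * subgrad(w)        # plain stochastic subgradient step
            w_sum += w
        w_bar = w_sum / t_per_stage      # restart from the averaged solution
        eta *= 0.5                       # geometrically decrease the step size
    return w_bar
```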
no code implementations • 5 Nov 2015 • Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou
In this paper, we utilize stochastic optimization to reduce the space complexity of convex composite optimization with a nuclear norm regularizer, where the variable is a matrix of size $m \times n$.
no code implementations • 12 Nov 2015 • Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou
In this paper, we develop a randomized algorithm and theory for learning a sparse model from large-scale and high-dimensional data, which is usually formulated as an empirical risk minimization problem with a sparsity-inducing regularizer.
no code implementations • 9 Dec 2015 • Tianbao Yang, Qihang Lin
We show that, when applied to a broad class of convex optimization problems, the RSG method can find an $\epsilon$-optimal solution with a lower complexity than the SG method.
no code implementations • NeurIPS 2016 • Zhe Li, Boqing Gong, Tianbao Yang
To exhibit the optimal dropout probabilities, we analyze shallow learning with multinomial dropout and establish a risk bound for stochastic optimization.
no code implementations • 12 Apr 2016 • Tianbao Yang, Qihang Lin, Zhe Li
This paper fills the gap between practice and theory by developing a basic convergence analysis of two stochastic momentum methods, namely stochastic heavy-ball method and the stochastic variant of Nesterov's accelerated gradient method.
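For reference, minimal sketches of the two update rules being analyzed, with `g(w)` standing in for a stochastic gradient oracle (step size and momentum parameter are illustrative):

```python
import numpy as np

def stochastic_heavy_ball(g, w0, eta=0.01, beta=0.9, T=1000):
    w, w_prev = np.array(w0, dtype=float), np.array(w0, dtype=float)
    for _ in range(T):
        w_next = w - eta * g(w) + beta * (w - w_prev)   # heavy-ball momentum term
        w_prev, w = w, w_next
    return w

def stochastic_nesterov(g, w0, eta=0.01, beta=0.9, T=1000):
    w, w_prev = np.array(w0, dtype=float), np.array(w0, dtype=float)
    for _ in range(T):
        y = w + beta * (w - w_prev)                     # extrapolation (look-ahead) point
        w_prev, w = w, y - eta * g(y)                   # gradient evaluated at y
    return w
```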
no code implementations • CVPR 2016 • Chuang Gan, Tianbao Yang, Boqing Gong
Attributes possess appealing properties and benefit many computer vision problems, such as object recognition, learning with humans in the loop, and image retrieval.
no code implementations • 16 May 2016 • Tianbao Yang, Lijun Zhang, Rong Jin, Jin-Feng Yi
Secondly, we present a lower bound with noisy gradient feedback and then show that we can achieve optimal dynamic regrets under a stochastic gradient feedback and two-point bandit feedback.
no code implementations • 4 Jul 2016 • Yi Xu, Qihang Lin, Tianbao Yang
In particular, if the objective function $F(\mathbf w)$ in the $\epsilon$-sublevel set grows as fast as $\|\mathbf w - \mathbf w_*\|_2^{1/\theta}$, where $\mathbf w_*$ represents the closest optimal solution to $\mathbf w$ and $\theta\in(0, 1]$ quantifies the local growth rate, the iteration complexity of first-order stochastic optimization for achieving an $\epsilon$-optimal solution can be $\widetilde O(1/\epsilon^{2(1-\theta)})$, which is optimal up to at most a logarithmic factor.
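Plugging concrete growth rates into the stated bound makes the interpolation explicit, e.g., quadratic growth ($\theta=1/2$) and linear (sharp) growth ($\theta=1$):

$$\widetilde O\!\big(1/\epsilon^{2(1-\theta)}\big):\qquad \theta=\tfrac{1}{2}\ \Rightarrow\ \widetilde O(1/\epsilon),\qquad \theta=1\ \Rightarrow\ \widetilde O(1),\qquad \theta\to 0\ \Rightarrow\ \text{approaches the standard } \widetilde O(1/\epsilon^{2}).$$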
no code implementations • NeurIPS 2016 • Yi Xu, Yan Yan, Qihang Lin, Tianbao Yang
In this work, we will show that the proposed HOPS achieves a lower iteration complexity of $\widetilde O(1/\epsilon^{1-\theta})$\footnote{$\widetilde O()$ suppresses a logarithmic factor.}
no code implementations • ICML 2017 • Tianbao Yang, Qihang Lin, Lijun Zhang
In this paper, we develop projection-reduced optimization algorithms for both smooth and non-smooth optimization with improved convergence rates under a certain regularity condition of the constraint function.
no code implementations • NeurIPS 2017 • Lijun Zhang, Tianbao Yang, Jin-Feng Yi, Rong Jin, Zhi-Hua Zhou
When multiple gradients are accessible to the learner, we first demonstrate that the dynamic regret of strongly convex functions can be upper bounded by the minimum of the path-length and the squared path-length.
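For a comparator sequence $\mathbf u_1, \ldots, \mathbf u_T$, the two quantities referred to above are the standard path-length and squared path-length,

$$P_T=\sum_{t=2}^{T}\|\mathbf u_t-\mathbf u_{t-1}\|_2,\qquad S_T=\sum_{t=2}^{T}\|\mathbf u_t-\mathbf u_{t-1}\|_2^{2},$$

so the claimed upper bound on the dynamic regret is of order $O(\min\{P_T, S_T\})$.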
no code implementations • 23 Nov 2016 • Mingrui Liu, Tianbao Yang
Recent studies have shown that the proximal gradient (PG) method and the accelerated proximal gradient (APG) method with restarting can enjoy linear convergence under a condition weaker than strong convexity, namely a quadratic growth condition (QGC).
no code implementations • NeurIPS 2016 • Yi Xu, Yan Yan, Qihang Lin, Tianbao Yang
To the best of our knowledge, this is the lowest iteration complexity achieved so far for the considered non-smooth optimization problems without strong convexity assumption.
no code implementations • 6 Dec 2016 • Yi Xu, Haiqin Yang, Lijun Zhang, Tianbao Yang
Previously, oblivious random projection based approaches that project high-dimensional features onto a random subspace have been used in practice for tackling the high-dimensionality challenge in machine learning.
no code implementations • ICML 2018 • Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou
To cope with changing environments, recent developments in online learning have introduced the concepts of adaptive regret and dynamic regret independently.
no code implementations • 7 Feb 2017 • Lijun Zhang, Tianbao Yang, Rong Jin
First, we establish an $\widetilde{O}(d/n + \sqrt{F_*/n})$ risk bound when the random function is nonnegative, convex and smooth, and the expected function is Lipschitz continuous, where $d$ is the dimensionality of the problem, $n$ is the number of samples, and $F_*$ is the minimal risk.
1 code implementation • 27 Apr 2017 • Yaohui Zeng, Tianbao Yang, Patrick Breheny
However, with the ultrahigh-dimensional, large-scale data sets now collected in many real-world applications, it is important to develop algorithms to solve the lasso that efficiently scale up to problems of this size.
no code implementations • 13 Jun 2017 • Zhe Li, Xiaoyu Wang, Xutao Lv, Tianbao Yang
By doing this, we show that previous deep CNNs such as GoogLeNet and Inception-type Nets can be compressed dramatically with a marginal drop in performance.
no code implementations • ICML 2017 • Yi Xu, Qihang Lin, Tianbao Yang
In this paper, a new theory is developed for first-order stochastic convex optimization, showing that the global convergence rate is sufficiently quantified by a local growth rate of the objective function in a neighborhood of the optimal solutions.
no code implementations • 9 Sep 2017 • Tianbao Yang, Zhe Li, Lijun Zhang
In this paper, we present a simple analysis of {\bf fast rates} with {\it high probability} of {\bf empirical minimization} for {\it stochastic composite optimization} over a finite-dimensional bounded convex set with exponential concave loss functions and an arbitrary convex regularization.
no code implementations • 25 Sep 2017 • Mingrui Liu, Tianbao Yang
To the best of our knowledge, the proposed stochastic algorithm is the first one that converges to a second-order stationary point in {\it high probability} with a time complexity independent of the sample size and almost linear in dimensionality.
no code implementations • 25 Oct 2017 • Mingrui Liu, Tianbao Yang
In this paper, we study stochastic non-convex optimization with non-convex random functions.
no code implementations • NeurIPS 2018 • Yi Xu, Rong Jin, Tianbao Yang
Two classes of methods have been proposed for escaping from saddle points: one uses the second-order information carried by the Hessian, and the other adds noise to the first-order information.
no code implementations • NeurIPS 2017 • Yi Xu, Qihang Lin, Tianbao Yang
The most studied error bound is the quadratic error bound, which generalizes strong convexity and is satisfied by a large family of machine learning problems.
no code implementations • NeurIPS 2017 • Mingrui Liu, Tianbao Yang
Recent studies have shown that the proximal gradient (PG) method and the accelerated proximal gradient (APG) method with restarting can enjoy linear convergence under a condition weaker than strong convexity, namely a quadratic growth condition (QGC).
no code implementations • NeurIPS 2017 • Yi Xu, Mingrui Liu, Qihang Lin, Tianbao Yang
The novelty of the proposed scheme lies in its adaptivity to a local sharpness property of the objective function, which marks the key difference from previous adaptive schemes that adjust the penalty parameter per iteration based on certain conditions on the iterates.
no code implementations • 4 Dec 2017 • Yi Xu, Rong Jin, Tianbao Yang
Accelerated gradient (AG) methods are breakthroughs in convex optimization, improving the convergence rate of the gradient descent method for optimization with smooth functions.
no code implementations • NeurIPS 2018 • Mingrui Liu, Xiaoxuan Zhang, Lijun Zhang, Rong Jin, Tianbao Yang
Error bound conditions (EBC) are properties that characterize the growth of an objective function when a point is moved away from the optimal set.
no code implementations • 21 May 2018 • Yi Xu, Shenghuo Zhu, Sen Yang, Chi Zhang, Rong Jin, Tianbao Yang
Learning with a {\it convex loss} function has been a dominating paradigm for many years.
no code implementations • 3 Jun 2018 • Zhe Li, Xuehan Xiong, Zhou Ren, Ning Zhang, Xiaoyu Wang, Tianbao Yang
In this paper, we study how to design a genetic programming approach for optimizing the structure of a CNN for a given task under limited computational resources yet without imposing strong restrictions on the search space.
no code implementations • CVPR 2019 • Jian Ren, Zhe Li, Jianchao Yang, Ning Xu, Tianbao Yang, David J. Foran
In this paper, we propose an Ecologically-Inspired GENetic (EIGEN) approach that uses the concepts of succession, extinction, mimicry, and gene duplication to search for a neural network structure from scratch, starting with a poorly initialized simple network and imposing few constraints during the evolution, as we assume no prior knowledge about the task domain.
no code implementations • ICML 2018 • Qihang Lin, Runchao Ma, Tianbao Yang
To update the level parameter towards the optimality, both methods require an oracle that generates upper and lower bounds as well as an affine-minorant of the level function.
no code implementations • ICML 2018 • Mingrui Liu, Xiaoxuan Zhang, Zaiyi Chen, Xiaoyu Wang, Tianbao Yang
In this paper, we consider statistical learning with AUC (area under ROC curve) maximization in the classical stochastic setting, where one random data point drawn from an unknown distribution is revealed at each iteration for updating the model.
no code implementations • ICML 2018 • Zaiyi Chen, Yi Xu, Enhong Chen, Tianbao Yang
Although the convergence rates of existing variants of ADAGRAD have a better dependence on the number of iterations under the strong convexity condition, their iteration complexities have an explicit linear dependence on the dimensionality of the problem.
no code implementations • ECCV 2018 • Yandong Li, Liqiang Wang, Tianbao Yang, Boqing Gong
The large volume of video content and high viewing frequency demand automatic video summarization algorithms, of which a key property is the capability of modeling diversity.
no code implementations • ECCV 2018 • Aidean Sharghi, Ali Borji, Chengtao Li, Tianbao Yang, Boqing Gong
In terms of modeling, we design a new probabilistic distribution such that, when it is integrated into SeqDPP, the resulting model accepts user input about the expected length of the summary.
no code implementations • ICLR 2019 • Zaiyi Chen, Zhuoning Yuan, Jin-Feng Yi, Bo-Wen Zhou, Enhong Chen, Tianbao Yang
For example, there is still a lack of convergence theory for SGD and its variants that use a stagewise step size and return an averaged solution in practice.
no code implementations • 30 Aug 2018 • Yan Yan, Tianbao Yang, Zhe Li, Qihang Lin, Yi Yang
However, their theoretical analysis of convergence of the training objective and the generalization error for prediction is still under-explored.
no code implementations • ICLR 2019 • Pingbo Pan, Yan Yan, Tianbao Yang, Yi Yang
In this work, we propose to refine the predictions of structured prediction models by effectively integrating discriminative models into the prediction.
no code implementations • 4 Oct 2018 • Hassan Rafique, Mingrui Liu, Qihang Lin, Tianbao Yang
Min-max problems have broad applications in machine learning, including learning with non-decomposable loss and learning with robustness to data distribution.
no code implementations • 24 Oct 2018 • Mingrui Liu, Hassan Rafique, Qihang Lin, Tianbao Yang
In this paper, we consider first-order convergence theory and algorithms for solving a class of non-convex non-concave min-max saddle-point problems, whose objective function is weakly convex in the variables of minimization and weakly concave in the variables of maximization.
no code implementations • 28 Nov 2018 • Yi Xu, Qi Qi, Qihang Lin, Rong Jin, Tianbao Yang
In this paper, we propose new stochastic optimization algorithms and study their first-order convergence theories for solving a broad family of DC functions.
no code implementations • NeurIPS 2018 • Xiaoxuan Zhang, Mingrui Liu, Xun Zhou, Tianbao Yang
To advance OFO, we propose an efficient online algorithm based on simultaneously learning a posterior probability of class and learning an optimal threshold by minimizing a stochastic strongly convex function with unknown strong convexity parameter.
no code implementations • NeurIPS 2018 • Mingrui Liu, Zhe Li, Xiaoyu Wang, Jin-Feng Yi, Tianbao Yang
Negative curvature descent (NCD) method has been utilized to design deterministic or stochastic algorithms for non-convex optimization aiming at finding second-order stationary points or local minima.
no code implementations • NeurIPS 2019 • Zhuoning Yuan, Yan Yan, Rong Jin, Tianbao Yang
For convex loss functions and two classes of "nicely-behaved" non-convex objectives that are close to a convex function, we establish faster convergence of stagewise training than the vanilla SGD under the PL condition on both training error and testing error.
no code implementations • 23 Apr 2019 • Yan Yan, Yi Xu, Qihang Lin, Lijun Zhang, Tianbao Yang
The main contribution of this paper is the design and analysis of new stochastic primal-dual algorithms that use a mixture of stochastic gradient updates and a logarithmic number of deterministic dual updates for solving a family of convex-concave problems with no bilinear structure assumed.
no code implementations • 7 Aug 2019 • Qihang Lin, Selvaprabu Nadarajah, Negar Soheili, Tianbao Yang
We design a stochastic feasible level set method (SFLS) for SOECs that has low data complexity and emphasizes feasibility before convergence.
no code implementations • ICML 2020 • Yan Yan, Yi Xu, Lijun Zhang, Xiaoyu Wang, Tianbao Yang
In this paper, we study a family of non-convex and possibly non-smooth inf-projection minimization problems, where the target objective function is equal to the minimum of a joint function over another variable.
no code implementations • ICLR 2020 • Mingrui Liu, Zhuoning Yuan, Yiming Ying, Tianbao Yang
In this paper, we consider stochastic AUC maximization problem with a deep neural network as the predictive model.
1 code implementation • NeurIPS 2020 • Yunhui Guo, Mingrui Liu, Tianbao Yang, Tajana Rosing
This view leads to two improved schemes for episodic memory based lifelong learning, called MEGA-I and MEGA-II.
no code implementations • 25 Sep 2019 • Yunhui Guo, Mingrui Liu, Tianbao Yang, Tajana Rosing
In this paper, we introduce a novel and effective lifelong learning algorithm, called MixEd stochastic GrAdient (MEGA), which allows deep neural networks to acquire the ability of retaining performance on old tasks while learning new tasks.
no code implementations • NeurIPS 2020 • Mingrui Liu, Wei Zhang, Youssef Mroueh, Xiaodong Cui, Jerret Ross, Tianbao Yang, Payel Das
Despite recent progress on decentralized algorithms for training deep neural networks, it remains unclear whether it is possible to train GANs in a decentralized manner.
1 code implementation • ECCV 2020 • Qi Qi, Yan Yan, Xiaoyu Wang, Tianbao Yang
To tackle this issue, we propose a simple and effective framework to sample pairs in a batch of data for updating the model.
no code implementations • ICLR 2020 • Mingrui Liu, Youssef Mroueh, Jerret Ross, Wei zhang, Xiaodong Cui, Payel Das, Tianbao Yang
Then we propose an adaptive variant of OSG named Optimistic Adagrad (OAdagrad) and reveal an \emph{improved} adaptive complexity $O\left(\epsilon^{-\frac{2}{1-\alpha}}\right)$, where $\alpha$ characterizes the growth rate of the cumulative stochastic gradient and $0\leq \alpha\leq 1/2$.
no code implementations • ICLR 2020 • Yunhui Guo, Mingrui Liu, Yandong Li, Liqiang Wang, Tianbao Yang, Tajana Rosing
We evaluate the effectiveness of traditional attack methods such as FGSM and PGD. The results show that A-GEM still possesses strong continual learning ability in the presence of adversarial examples in the memory and simple defense techniques such as label smoothing can further alleviate the adversarial effects.
no code implementations • 6 Feb 2020 • Lijun Zhang, Shiyin Lu, Tianbao Yang
To address this limitation, new performance measures, including dynamic regret and adaptive regret have been proposed to guide the design of online algorithms.
no code implementations • NeurIPS 2020 • Yan Yan, Yi Xu, Qihang Lin, Wei Liu, Tianbao Yang
In this paper, we bridge this gap by providing a sharp analysis of epoch-wise stochastic gradient descent ascent method (referred to as Epoch-GDA) for solving strongly convex strongly concave (SCSC) min-max problems, without imposing any additional assumption about smoothness or the function's structure.
no code implementations • 9 Mar 2020 • Zhishuai Guo, Yan Yan, Tianbao Yang
It remains unclear how these averaging schemes affect the convergence of {\it both optimization error and generalization error} (two equally important components of testing error) for {\bf non-strongly convex objectives, including non-convex problems}.
1 code implementation • ICML 2020 • Zhishuai Guo, Mingrui Liu, Zhuoning Yuan, Li Shen, Wei Liu, Tianbao Yang
In this paper, we study distributed algorithms for large-scale AUC maximization with a deep neural network as a predictive model.
no code implementations • 12 Jun 2020 • Zhishuai Guo, Yan Yan, Zhuoning Yuan, Tianbao Yang
However, most of the existing algorithms are slow in practice, and their analysis revolves around the convergence to a nearly stationary point. We consider leveraging the Polyak-Lojasiewicz (PL) condition to design faster stochastic algorithms with stronger convergence guarantee.
no code implementations • 17 Jun 2020 • Yan Yan, Xin Man, Tianbao Yang
In this paper, we propose robust stochastic algorithms for solving convex compositional problems of the form $f(\mathbb{E}_{\xi} g(\cdot; \xi)) + r(\cdot)$ by establishing {\bf sub-Gaussian confidence bounds} under weak assumptions about the tails of the noise distribution, i.e., {\bf heavy-tailed noise} with bounded second-order moments.
1 code implementation • NeurIPS 2021 • Qi Qi, Zhishuai Guo, Yi Xu, Rong Jin, Tianbao Yang
In this paper, we propose a practical online method for solving a class of distributionally robust optimization (DRO) with non-convex objectives, which has important applications in machine learning for improving the robustness of neural networks.
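As a rough, self-contained illustration of the kind of DRO objective such online methods target (not the paper's compositional algorithm itself), assume the uncertainty set over sample weights is regularized by a KL divergence toward the uniform distribution; the worst-case weighted loss then has a log-sum-exp closed form:

```python
import numpy as np

def kl_dro_loss(per_sample_losses, lam=1.0):
    """lambda * log(mean(exp(loss / lambda))), computed with log-sum-exp stabilization."""
    l = np.asarray(per_sample_losses, dtype=float)
    m = l.max()
    return lam * np.log(np.mean(np.exp((l - m) / lam))) + m

def dro_sample_weights(per_sample_losses, lam=1.0):
    """Worst-case distributional weights: harder examples get larger weights."""
    l = np.asarray(per_sample_losses, dtype=float)
    w = np.exp((l - l.max()) / lam)
    return w / w.sum()
```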
no code implementations • 14 Sep 2020 • Daoming Lyu, Qi Qi, Mohammad Ghavamzadeh, Hengshuai Yao, Tianbao Yang, Bo Liu
To achieve variance-reduced off-policy-stable policy optimization, we propose an algorithm family that is memory-efficient, stochastically variance-reduced, and capable of learning from off-policy samples.
no code implementations • 24 Nov 2020 • Mingrui Liu, Wei Zhang, Francesco Orabona, Tianbao Yang
As a result, Adam$^+$ requires little parameter tuning, like Adam, but enjoys a provable convergence guarantee.
1 code implementation • 2 Dec 2020 • Zhengyang Wang, Meng Liu, Youzhi Luo, Zhao Xu, Yaochen Xie, Limei Wang, Lei Cai, Qi Qi, Zhuoning Yuan, Tianbao Yang, Shuiwang Ji
Here we develop a suite of comprehensive machine learning methods and tools spanning different computational models, molecular representations, and loss functions for molecular property prediction and drug discovery.
4 code implementations • ICCV 2021 • Zhuoning Yuan, Yan Yan, Milan Sonka, Tianbao Yang
Our studies demonstrate that the proposed DAM method improves the performance of optimizing cross-entropy loss by a large margin, and also achieves better performance than optimizing the existing AUC square loss on these medical image classification tasks.
Ranked #2 on Multi-Label Classification on CheXpert
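The following is a minimal NumPy sketch of the pairwise square-loss surrogate for AUC mentioned above: for scores and labels in {0, 1}, it penalizes positive-negative pairs whose score gap falls short of a margin. (The papers above optimize end-to-end min-max/margin reformulations of such surrogates; this only illustrates the surrogate itself.)

```python
import numpy as np

def auc_square_surrogate(scores, labels, margin=1.0):
    """Average square loss over all positive-negative score pairs."""
    s_pos = scores[labels == 1]
    s_neg = scores[labels == 0]
    diffs = s_pos[:, None] - s_neg[None, :]       # all positive-negative score gaps
    return np.mean((margin - diffs) ** 2)

# Example: auc_square_surrogate(np.array([0.9, 0.2, 0.7]), np.array([1, 0, 1]))
```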
1 code implementation • 13 Dec 2020 • Qi Qi, Yi Xu, Rong Jin, Wotao Yin, Tianbao Yang
In this paper, we present a simple yet effective provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning.
1 code implementation • 9 Feb 2021 • Zhuoning Yuan, Zhishuai Guo, Yi Xu, Yiming Ying, Tianbao Yang
Deep AUC (area under the ROC curve) Maximization (DAM) has attracted much attention recently due to its great potential for imbalanced data classification.
no code implementations • NeurIPS 2021 • Lijun Zhang, Wei Jiang, Shiyin Lu, Tianbao Yang
Moreover, when the hitting cost is both convex and $\lambda$-quadratic growth, we reduce the competitive ratio to $1 + \frac{2}{\sqrt{\lambda}}$ by minimizing the weighted sum of the hitting cost and the switching cost.
no code implementations • NeurIPS 2021 • Guanghui Wang, Yuanyu Wan, Tianbao Yang, Lijun Zhang
To control the switching cost, we introduce the problem of online convex optimization with continuous switching constraint, where the goal is to achieve a small regret given a budget on the \emph{overall} switching cost.
1 code implementation • NeurIPS 2021 • Qi Qi, Youzhi Luo, Zhao Xu, Shuiwang Ji, Tianbao Yang
Compared with AUROC, AUPRC is a more appropriate metric for highly imbalanced datasets.
no code implementations • 30 Apr 2021 • Zhishuai Guo, Yi Xu, Wotao Yin, Rong Jin, Tianbao Yang
Our analysis exhibits that an increasing or large enough "momentum" parameter for the first-order moment used in practice is sufficient to ensure Adam and its many variants converge under a mild boundedness condition on the adaptive scaling factor of the step size.
no code implementations • 5 May 2021 • Zhishuai Guo, Quanqi Hu, Lijun Zhang, Tianbao Yang
Although numerous studies have proposed stochastic algorithms for solving these problems, they are limited in two perspectives: (i) their sample complexities are high, which do not match the state-of-the-art result for non-convex stochastic optimization; (ii) their algorithms are tailored to problems with only one lower-level problem.
1 code implementation • 8 May 2021 • Yunwen Lei, Zhenhuan Yang, Tianbao Yang, Yiming Ying
In this paper, we provide a comprehensive generalization analysis of stochastic gradient methods for minimax problems under both convex-concave and nonconvex-nonconcave cases through the lens of algorithmic stability.
no code implementations • 8 May 2021 • Lijun Zhang, Guanghui Wang, JinFeng Yi, Tianbao Yang
In this paper, we propose a simple strategy for universal online convex optimization, which avoids these limitations.
1 code implementation • 9 Jun 2021 • Bokun Wang, Zhuoning Yuan, Yiming Ying, Tianbao Yang
The proposed algorithms require sampling a constant number of tasks and data samples per iteration, making them suitable for the continual learning scenario.
no code implementations • 2 Jul 2021 • Guanghui Wang, Ming Yang, Lijun Zhang, Tianbao Yang
In this paper, we further improve the stochastic optimization of AUPRC by (i) developing novel stochastic momentum methods with a better iteration complexity of $O(1/\epsilon^4)$ for finding an $\epsilon$-stationary solution; and (ii) designing a novel family of stochastic adaptive methods with the same iteration complexity, which enjoy faster convergence in practice.
no code implementations • ICLR 2022 • Zhuoning Yuan, Zhishuai Guo, Nitesh Chawla, Tianbao Yang
The key idea of compositional training is to minimize a compositional objective function, where the outer function corresponds to an AUC loss and the inner function represents a gradient descent step for minimizing a traditional loss, e.g., the cross-entropy (CE) loss.
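In symbols, with $\alpha$ a hypothetical inner step size, the compositional objective described above takes the form

$$\min_{\mathbf w}\; L_{\mathrm{AUC}}\big(\mathbf w - \alpha \nabla L_{\mathrm{CE}}(\mathbf w)\big),$$

i.e., the AUC loss is evaluated at the point reached by one gradient descent step on the CE loss.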
no code implementations • 1 Nov 2021 • Tianbao Yang
In this extended abstract, we will present and discuss opportunities and challenges brought about by a new deep learning method by AUC maximization (aka \underline{\bf D}eep \underline{\bf A}UC \underline{\bf M}aximization or {\bf DAM}) for medical image classification.
1 code implementation • 23 Nov 2021 • Zhenhuan Yang, Yunwen Lei, Puyu Wang, Tianbao Yang, Yiming Ying
A popular approach to handle streaming data in pairwise learning is an online gradient descent (OGD) algorithm, where one needs to pair the current instance with a buffering set of previous instances with a sufficiently large size and therefore suffers from a scalability issue.
no code implementations • NeurIPS 2021 • Zhenhuan Yang, Yunwen Lei, Puyu Wang, Tianbao Yang, Yiming Ying
A popular approach to handle streaming data in pairwise learning is an online gradient descent (OGD) algorithm, where one needs to pair the current instance with a buffering set of previous instances with a sufficiently large size and therefore suffers from a scalability issue.
no code implementations • 7 Dec 2021 • Zhishuai Guo, Yi Xu, Wotao Yin, Rong Jin, Tianbao Yang
Although rigorous convergence analysis exists for Adam, they impose specific requirements on the update of the adaptive step size, which are not generic enough to cover many other variants of Adam.
1 code implementation • 30 Dec 2021 • Dixian Zhu, Yiming Ying, Tianbao Yang
We study a family of loss functions named label-distributionally robust (LDR) losses for multi-class classification that are formulated from a distributionally robust optimization (DRO) perspective, where the uncertainty in the given label information is modeled and captured by taking the worst case over distributional weights.
no code implementations • 15 Feb 2022 • Wei Jiang, Bokun Wang, Yibo Wang, Lijun Zhang, Tianbao Yang
To address these limitations, we propose a Stochastic Multi-level Variance Reduction method (SMVR), which achieves the optimal sample complexity of $\mathcal{O}\left(1 / \epsilon^{3}\right)$ to find an $\epsilon$-stationary point for non-convex objectives.
no code implementations • 24 Feb 2022 • Bokun Wang, Tianbao Yang
This paper studies stochastic optimization for a sum of compositional functions, where the inner-level function of each summand is coupled with the corresponding summation index.
1 code implementation • 24 Feb 2022 • Zi-Hao Qiu, Quanqi Hu, Yongjian Zhong, Lijun Zhang, Tianbao Yang
To the best of our knowledge, this is the first time that stochastic algorithms are proposed to optimize NDCG with a provable convergence guarantee.
1 code implementation • 24 Feb 2022 • Zhuoning Yuan, Yuexin Wu, Zi-Hao Qiu, Xianzhi Du, Lijun Zhang, Denny Zhou, Tianbao Yang
In this paper, we study contrastive learning from an optimization perspective, aiming to analyze and address a fundamental issue of existing contrastive learning methods that either rely on a large batch size or a large dictionary of feature vectors.
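As a rough sketch of how a method might avoid depending on a large batch or dictionary (an illustration of the general idea, not this paper's exact estimator), the per-anchor normalization term of the contrastive loss, which in principle sums exp-similarities over all negatives, can be tracked by a moving average updated from whatever negatives appear in the current mini-batch:

```python
import numpy as np

def update_denominator_estimates(u, anchor_ids, sims_to_batch_negs, gamma=0.9):
    """u: dict anchor_id -> running estimate of E[exp(sim(anchor, negative) / tau)].

    anchor_ids: ids of anchors in the current mini-batch.
    sims_to_batch_negs: list of arrays of (already temperature-scaled) similarities
    from each anchor to the negatives present in this mini-batch.
    """
    for i, sims in zip(anchor_ids, sims_to_batch_negs):
        batch_est = np.mean(np.exp(sims))                 # mini-batch estimate of the mean
        u[i] = gamma * u.get(i, batch_est) + (1 - gamma) * batch_est
    return u
```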
no code implementations • 1 Mar 2022 • Dixian Zhu, Gang Li, Bokun Wang, Xiaodong Wu, Tianbao Yang
In this paper, we propose systematic and efficient gradient-based methods for both one-way and two-way partial AUC (pAUC) maximization that are applicable to deep learning.
no code implementations • 3 Mar 2022 • Yao Yao, Qihang Lin, Tianbao Yang
The partial AUC, as a generalization of the AUC, summarizes only the TPRs over a specific range of the FPRs and is thus a more suitable performance measure in many real-world situations.
no code implementations • 27 Mar 2022 • Dixian Zhu, Xiaodong Wu, Tianbao Yang
(i) We benchmark a variety of loss functions with different algorithmic choices for the deep AUROC optimization problem.
no code implementations • 28 Mar 2022 • Tianbao Yang, Yiming Ying
We also identify and discuss remaining and emerging issues for deep AUC maximization, and provide suggestions on topics for future work.
no code implementations • 2 May 2022 • Lijun Zhang, Wei Jiang, JinFeng Yi, Tianbao Yang
In this paper, we investigate an online prediction strategy named Discounted-Normal-Predictor (Kapralov and Panigrahy, 2010) for smoothed online convex optimization (SOCO), in which the learner needs to minimize not only the hitting cost but also the switching cost.
no code implementations • 1 Jun 2022 • Tianbao Yang
This manuscript introduces a new optimization framework for machine learning and AI, named {\bf empirical X-risk minimization (EXM)}.
no code implementations • 1 Jun 2022 • Quanqi Hu, Yongjian Zhong, Tianbao Yang
To tackle this challenge, we present a single-loop randomized stochastic algorithm, which requires updates for only a constant number of blocks at each iteration.
1 code implementation • 14 Jun 2022 • Haiyang Yu, Limei Wang, Bokun Wang, Meng Liu, Tianbao Yang, Shuiwang Ji
GraphFM-IB applies FM to in-batch sampled data, while GraphFM-OB applies FM to out-of-batch data that are in the 1-hop neighborhood of in-batch data.
no code implementations • 18 Jul 2022 • Wei Jiang, Gang Li, Yibo Wang, Lijun Zhang, Tianbao Yang
The key issue is to track and estimate a sequence of $\mathbf g(\mathbf{w})=(g_1(\mathbf{w}), \ldots, g_m(\mathbf{w}))$ across iterations, where $\mathbf g(\mathbf{w})$ has $m$ blocks and it is only allowed to probe $\mathcal{O}(1)$ blocks to attain their stochastic values and Jacobians.
no code implementations • 11 Oct 2022 • Qi Qi, Jiameng Lyu, Kung sik Chan, Er Wei Bai, Tianbao Yang
Distributionally Robust Optimization (DRO), as a popular method to train robust models against distribution shift between training and test sets, has received tremendous attention in recent years.
no code implementations • 12 Oct 2022 • Qi Qi, Shervin Ardeshir, Yi Xu, Tianbao Yang
Improving fairness between privileged and less-privileged sensitive-attribute groups (e.g., race, gender) has attracted a lot of attention.
1 code implementation • 26 Oct 2022 • Zhishuai Guo, Rong Jin, Jiebo Luo, Tianbao Yang
To this end, we propose an active-passive decomposition framework that decouples the gradient's components into two types, namely active parts and passive parts, where the active parts depend on local data and are computed with the local model, and the passive parts depend on other machines and are communicated/computed based on historical models and samples.
no code implementations • 23 Dec 2022 • Yao Yao, Qihang Lin, Tianbao Yang
In this work, we formulate the training problem of a fairness-aware machine learning model as an AUC optimization problem subject to a class of AUC-based fairness constraints.
no code implementations • NeurIPS 2023 • Lijun Zhang, Peng Zhao, Zhen-Hua Zhuang, Tianbao Yang, Zhi-Hua Zhou
First, we formulate GDRO as a stochastic convex-concave saddle-point problem, and demonstrate that stochastic mirror descent (SMD), using $m$ samples in each iteration, achieves an $O(m (\log m)/\epsilon^2)$ sample complexity for finding an $\epsilon$-optimal solution, which matches the $\Omega(m/\epsilon^2)$ lower bound up to a logarithmic factor.
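A minimal sketch of the saddle-point approach described above: stochastic mirror descent draws one sample per group each iteration, takes a gradient step on $\mathbf w$ and an exponentiated-gradient (entropic mirror) step on the group weights $q$ over the simplex. Step sizes and sampling details are illustrative, not the paper's exact choices.

```python
import numpy as np

def gdro_smd(sample_loss_and_grad, m, w0, T=1000, eta_w=0.01, eta_q=0.01):
    """sample_loss_and_grad(i, w) -> (stochastic loss, stochastic gradient) for group i."""
    w = np.array(w0, dtype=float)
    q = np.ones(m) / m                                        # weights over the m groups
    for _ in range(T):
        losses, grads = zip(*[sample_loss_and_grad(i, w) for i in range(m)])
        w -= eta_w * sum(q[i] * grads[i] for i in range(m))   # descent step on w
        q *= np.exp(eta_q * np.array(losses))                 # exponentiated ascent on q
        q /= q.sum()                                          # project back to the simplex
    return w, q
```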
no code implementations • 24 Feb 2023 • Yunwen Lei, Tianbao Yang, Yiming Ying, Ding-Xuan Zhou
For self-bounding Lipschitz loss functions, we further improve our results by developing optimistic bounds which imply fast rates in a low noise condition.
1 code implementation • 14 May 2023 • Dixian Zhu, Bokun Wang, Zhi Chen, Yaxing Wang, Milan Sonka, Xiaodong Wu, Tianbao Yang
This paper considers a novel application of deep AUC maximization (DAM) for multi-instance learning (MIL), in which a single class label is assigned to a bag of instances (e.g., multiple 2D slices of a CT scan for a patient).
1 code implementation • 19 May 2023 • Zi-Hao Qiu, Quanqi Hu, Zhuoning Yuan, Denny Zhou, Lijun Zhang, Tianbao Yang
In this paper, we aim to optimize a contrastive loss with individualized temperatures in a principled and systematic manner for self-supervised learning.
1 code implementation • 30 May 2023 • Quanqi Hu, Zi-Hao Qiu, Zhishuai Guo, Lijun Zhang, Tianbao Yang
In this paper, we consider non-convex multi-block bilevel optimization (MBBO) problems, which involve $m\gg 1$ lower level problems and have important applications in machine learning.
1 code implementation • 5 Jun 2023 • Zhuoning Yuan, Dixian Zhu, Zi-Hao Qiu, Gang Li, Xuanhui Wang, Tianbao Yang
This paper introduces the award-winning deep learning (DL) library called LibAUC for implementing state-of-the-art algorithms towards optimizing a family of risk functions named X-risks.
no code implementations • 13 Jun 2023 • Wei Jiang, Jiayu Qin, Lingyu Wu, Changyou Chen, Tianbao Yang, Lijun Zhang
Learning unnormalized statistical models (e.g., energy-based models) is computationally challenging due to the complexity of handling the partition function.
no code implementations • 7 Jul 2023 • Ming Yang, Xiyuan Wei, Tianbao Yang, Yiming Ying
Then, we establish the compositional uniform stability results for two popular stochastic compositional gradient descent algorithms, namely SCGD and SCSC.
no code implementations • 29 Aug 2023 • Haoran Liu, Bokun Wang, Jianling Wang, Xiangjue Dong, Tianbao Yang, James Caverlee
As powerful tools for representation learning on graphs, graph neural networks (GNNs) have played an important role in applications including social networks, recommendation systems, and online web services.
no code implementations • NeurIPS 2023 • Quanqi Hu, Dixian Zhu, Tianbao Yang
This paper investigates new families of compositional optimization problems, called $\underline{\bf n}$on-$\underline{\bf s}$mooth $\underline{\bf w}$eakly-$\underline{\bf c}$onvex $\underline{\bf f}$inite-sum $\underline{\bf c}$oupled $\underline{\bf c}$ompositional $\underline{\bf o}$ptimization (NSWC FCCO).
no code implementations • 18 Oct 2023 • Jianzhi Xv, Gang Li, Tianbao Yang
While deep AUC maximization (DAM) has shown remarkable success on imbalanced medical tasks, e.g., chest X-ray classification and skin lesion classification, it could suffer from severe overfitting when applied to small datasets due to its aggressive nature of pushing the prediction scores of positive data away from those of negative data.
no code implementations • 4 Dec 2023 • Bokun Wang, Tianbao Yang
This paper revisits a class of convex Finite-Sum Coupled Compositional Stochastic Optimization (cFCCO) problems with many applications, including group distributionally robust optimization (GDRO), learning with imbalanced data, reinforcement learning, and learning to rank.
1 code implementation • 11 Dec 2023 • Ryan King, Tianbao Yang, Bobak Mortazavi
In downstream tasks, including in-hospital mortality prediction and phenotyping, our pretrained model outperforms baselines in settings where only a fraction of the data is labeled, emphasizing its ability to enhance ICU data analysis.
1 code implementation • 6 Apr 2024 • Zi-Hao Qiu, Siqi Guo, Mao Xu, Tuo Zhao, Lijun Zhang, Tianbao Yang
In this paper, we present a principled framework for learning a small yet generalizable temperature prediction network (TempNet) to improve LFMs.
no code implementations • ECCV 2020 • Zhuoning Yuan, Zhishuai Guo, Xiaotian Yu, Xiaoyu Wang, Tianbao Yang
In our experiment, we demonstrate that the proposed framework is able to train deep learning models with millions of classes and achieves more than a 10× speedup compared to existing approaches.