Search Results for author: Quanquan Gu

Found 181 papers, 33 papers with code

Generalized Fisher Score for Feature Selection

1 code implementation • 14 Feb 2012 • Quanquan Gu, Zhenhui Li, Jiawei Han

Fisher score is one of the most widely used supervised feature selection methods.

Selective Labeling via Error Bound Minimization

no code implementations • NeurIPS 2012 • Quanquan Gu, Tong Zhang, Jiawei Han, Chris H. Ding

In particular, we derive a deterministic generalization error bound for LapRLS trained on subsampled data, and propose to select a subset of data points to label by minimizing this upper bound.

Robust Tensor Decomposition with Gross Corruption

no code implementations • NeurIPS 2014 • Quanquan Gu, Huan Gui, Jiawei Han

In this paper, we study the statistical performance of robust tensor decomposition with gross corruption.

Tensor Decomposition

High Dimensional Expectation-Maximization Algorithm: Statistical Optimization and Asymptotic Normality

no code implementations • 30 Dec 2014 • Zhaoran Wang, Quanquan Gu, Yang Ning, Han Liu

We provide a general theory of the expectation-maximization (EM) algorithm for inferring high dimensional latent variable models.

Vocal Bursts Intensity Prediction

Local and Global Inference for High Dimensional Nonparanormal Graphical Models

no code implementations • 9 Feb 2015 • Quanquan Gu, Yuan Cao, Yang Ning, Han Liu

Due to the presence of unknown marginal transformations, we propose a pseudo likelihood based inferential approach.

Vocal Bursts Intensity Prediction

Statistical Limits of Convex Relaxations

no code implementations • 4 Mar 2015 • Zhaoran Wang, Quanquan Gu, Han Liu

Many high dimensional sparse learning problems are formulated as nonconvex optimization.

Sparse Learning • Stochastic Block Model

Towards Faster Rates and Oracle Property for Low-Rank Matrix Estimation

no code implementations • 18 May 2015 • Huan Gui, Quanquan Gu

Moreover, we rigorously show that under a certain condition on the magnitude of the nonzero singular values, the proposed estimator enjoys the oracle property (i.e., exactly recovers the true rank of the matrix), besides attaining a faster rate.

Matrix Completion

High Dimensional EM Algorithm: Statistical Optimization and Asymptotic Normality

no code implementations • NeurIPS 2015 • Zhaoran Wang, Quanquan Gu, Yang Ning, Han Liu

We provide a general theory of the expectation-maximization (EM) algorithm for inferring high dimensional latent variable models.

Vocal Bursts Intensity Prediction

Sharp Computational-Statistical Phase Transitions via Oracle Computational Model

no code implementations • 30 Dec 2015 • Zhaoran Wang, Quanquan Gu, Han Liu

Based upon an oracle model of computation, which captures the interactions between algorithms and data, we establish a general lower bound that explicitly connects the minimum testing risk under computational budget constraints with the intrinsic probabilistic and combinatorial structures of statistical problems.

Two-sample testing

High Dimensional Multivariate Regression and Precision Matrix Estimation via Nonconvex Optimization

no code implementations • 2 Jun 2016 • Jinghui Chen, Quanquan Gu

We propose a nonconvex estimator for joint multivariate regression and precision matrix estimation in the high dimensional regime, under sparsity constraints.

regression • Vocal Bursts Intensity Prediction

Communication-efficient Distributed Sparse Linear Discriminant Analysis

no code implementations • 15 Oct 2016 • Lu Tian, Quanquan Gu

We propose a communication-efficient distributed estimation method for sparse linear discriminant analysis (LDA) in the high dimensional regime.

Model Selection

A Unified Computational and Statistical Framework for Nonconvex Low-Rank Matrix Estimation

no code implementations • 17 Oct 2016 • Lingxiao Wang, Xiao Zhang, Quanquan Gu

In the general case with noisy observations, we show that our algorithm is guaranteed to linearly converge to the unknown low-rank matrix up to minimax optimal statistical error, provided an appropriate initial estimator.

Matrix Completion

Semiparametric Differential Graph Models

no code implementations • NeurIPS 2016 • Pan Xu, Quanquan Gu

In many cases of network analysis, it is more attractive to study how a network varies under different conditions than an individual static network.

Communication-efficient Distributed Estimation and Inference for Transelliptical Graphical Models

no code implementations • 29 Dec 2016 • Pan Xu, Lu Tian, Quanquan Gu

In detail, the proposed method distributes the $d$-dimensional data of size $N$ generated from a transelliptical graphical model into $m$ worker machines, and estimates the latent precision matrix on each worker machine based on the data of size $n=N/m$.

Stochastic Variance-reduced Gradient Descent for Low-rank Matrix Recovery from Linear Measurements

no code implementations • 2 Jan 2017 • Xiao Zhang, Lingxiao Wang, Quanquan Gu

And in the noiseless setting, our algorithm is guaranteed to linearly converge to the unknown low-rank matrix and achieves exact recovery with optimal sample complexity.

A Universal Variance Reduction-Based Catalyst for Nonconvex Low-Rank Matrix Recovery

no code implementations • 9 Jan 2017 • Lingxiao Wang, Xiao Zhang, Quanquan Gu

We propose a generic framework based on a new stochastic variance-reduced gradient descent algorithm for accelerating nonconvex low-rank matrix recovery.

A Unified Framework for Low-Rank plus Sparse Matrix Recovery

no code implementations • 21 Feb 2017 • Xiao Zhang, Lingxiao Wang, Quanquan Gu

We propose a unified framework to solve general low-rank plus sparse matrix recovery problems based on matrix factorization, which covers a broad family of objective functions satisfying the restricted strong convexity and smoothness conditions.

Speeding Up Latent Variable Gaussian Graphical Model Estimation via Nonconvex Optimizations

no code implementations • NeurIPS 2017 • Pan Xu, Jian Ma, Quanquan Gu

In order to speed up the estimation of the sparse plus low-rank components, we propose a sparsity constrained maximum likelihood estimator based on matrix factorization, and an efficient alternating gradient descent algorithm with hard thresholding to solve it.

Robust Wirtinger Flow for Phase Retrieval with Arbitrary Corruption

no code implementations • 20 Apr 2017 • Jinghui Chen, Lingxiao Wang, Xiao Zhang, Quanquan Gu

We consider the robust phase retrieval problem of recovering the unknown signal from the magnitude-only measurements, where the measurements can be contaminated by both sparse arbitrary corruption and bounded random noise.

Retrieval

Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization

no code implementations • NeurIPS 2018 • Pan Xu, Jinghui Chen, Difan Zou, Quanquan Gu

Furthermore, for the first time we prove the global convergence guarantee for variance reduced stochastic gradient Langevin dynamics (SVRG-LD) to the almost minimizer within $\tilde O\big(\sqrt{n}d^5/(\lambda^4\epsilon^{5/2})\big)$ stochastic gradient evaluations, which outperforms the gradient complexities of GLD and SGLD in a wide regime.

A Unified Variance Reduction-Based Framework for Nonconvex Low-Rank Matrix Recovery

no code implementations • ICML 2017 • Lingxiao Wang, Xiao Zhang, Quanquan Gu

We propose a generic framework based on a new stochastic variance-reduced gradient descent algorithm for accelerating nonconvex low-rank matrix recovery.

Robust Gaussian Graphical Model Estimation with Arbitrary Corruption

no code implementations • ICML 2017 • Lingxiao Wang, Quanquan Gu

In particular, we show that provided that the number of corrupted samples $n_2$ for each variable satisfies $n_2 \lesssim \sqrt{n}/\sqrt{\log d}$, where $n$ is the sample size and $d$ is the number of variables, the proposed robust precision matrix estimator attains the same statistical rate as the standard estimator for Gaussian graphical models.

Model Selection • Two-sample testing

High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm

no code implementations • ICML 2017 • Rongda Zhu, Lingxiao Wang, ChengXiang Zhai, Quanquan Gu

We apply our generic algorithm to two illustrative latent variable models: Gaussian mixture model and mixture of linear regression, and demonstrate the advantages of our algorithm by both theoretical analysis and numerical experiments.

Vocal Bursts Intensity Prediction

Uncertainty Assessment and False Discovery Rate Control in High-Dimensional Granger Causal Inference

no code implementations • ICML 2017 • Aditya Chaudhry, Pan Xu, Quanquan Gu

Causal inference among high-dimensional time series data proves an important research problem in many fields.

Causal Inference • Time Series +1

Speeding Up Latent Variable Gaussian Graphical Model Estimation via Nonconvex Optimization

no code implementations • NeurIPS 2017 • Pan Xu, Jian Ma, Quanquan Gu

In order to speed up the estimation of the sparse plus low-rank components, we propose a sparsity constrained maximum likelihood estimator based on matrix factorization and an efficient alternating gradient descent algorithm with hard thresholding to solve it.

Saving Gradient and Negative Curvature Computations: Finding Local Minima More Efficiently

no code implementations • 11 Dec 2017 • Yaodong Yu, Difan Zou, Quanquan Gu

We propose a family of nonconvex optimization algorithms that are able to save gradient and negative curvature computations to a large extent, and are guaranteed to find an approximate local minimum with improved runtime complexity.

Third-order Smoothness Helps: Even Faster Stochastic Optimization Algorithms for Finding Local Minima

no code implementations • 18 Dec 2017 • Yaodong Yu, Pan Xu, Quanquan Gu

We propose stochastic optimization algorithms that can find local minima faster than existing algorithms for nonconvex optimization problems, by exploiting the third-order smoothness to escape non-degenerate saddle points more efficiently.

Stochastic Optimization

Stochastic Variance-Reduced Hamilton Monte Carlo Methods

no code implementations • ICML 2018 • Difan Zou, Pan Xu, Quanquan Gu

We propose a fast stochastic Hamilton Monte Carlo (HMC) method, for sampling from a smooth and strongly log-concave distribution.

Stochastic Optimization

Stochastic Variance-Reduced Cubic Regularized Newton Method

no code implementations • ICML 2018 • Dongruo Zhou, Pan Xu, Quanquan Gu

At the core of our algorithm is a novel semi-stochastic gradient along with a semi-stochastic Hessian, which are specifically designed for cubic regularization method.

Fast and Sample Efficient Inductive Matrix Completion via Multi-Phase Procrustes Flow

1 code implementation • ICML 2018 • Xiao Zhang, Simon S. Du, Quanquan Gu

We revisit the inductive matrix completion problem that aims to recover a rank-$r$ matrix with ambient dimension $d$ given $n$ features as the side prior information.

Matrix Completion

Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks

2 code implementations • 18 Jun 2018 • Jinghui Chen, Dongruo Zhou, Yiqi Tang, Ziyan Yang, Yuan Cao, Quanquan Gu

Experiments on standard benchmarks show that our proposed algorithm can maintain a convergence rate as fast as Adam/AMSGrad while generalizing as well as SGD in training deep neural networks.

Learning One-hidden-layer ReLU Networks via Gradient Descent

no code implementations • 20 Jun 2018 • Xiao Zhang, Yaodong Yu, Lingxiao Wang, Quanquan Gu

We study the problem of learning one-hidden-layer neural networks with Rectified Linear Unit (ReLU) activation function, where the inputs are sampled from standard Gaussian distribution and the outputs are generated from a noisy teacher network.

Stochastic Nested Variance Reduction for Nonconvex Optimization

no code implementations • NeurIPS 2018 • Dongruo Zhou, Pan Xu, Quanquan Gu

We study finite-sum nonconvex optimization problems, where the objective function is an average of $n$ nonconvex functions.

Finding Local Minima via Stochastic Nested Variance Reduction

no code implementations • 22 Jun 2018 • Dongruo Zhou, Pan Xu, Quanquan Gu

For general stochastic optimization problems, the proposed $\text{SNVRG}^{+}+\text{Neon2}^{\text{online}}$ achieves $\tilde{O}(\epsilon^{-3}+\epsilon_H^{-5}+\epsilon^{-2}\epsilon_H^{-3})$ gradient complexity, which is better than both $\text{SVRG}+\text{Neon2}^{\text{online}}$ (Allen-Zhu and Li, 2017) and Natasha2 (Allen-Zhu, 2017) in certain regimes.

Stochastic Optimization

A Primal-Dual Analysis of Global Optimality in Nonconvex Low-Rank Matrix Recovery

no code implementations • ICML 2018 • Xiao Zhang, Lingxiao Wang, Yaodong Yu, Quanquan Gu

We propose a primal-dual based framework for analyzing the global optimality of nonconvex low-rank matrix recovery.

Matrix Completion

Covariate Adjusted Precision Matrix Estimation via Nonconvex Optimization

no code implementations • ICML 2018 • Jinghui Chen, Pan Xu, Lingxiao Wang, Jian Ma, Quanquan Gu

We propose a nonconvex estimator for the covariate adjusted precision matrix estimation problem in the high dimensional regime, under sparsity constraints.

Continuous and Discrete-time Accelerated Stochastic Mirror Descent for Strongly Convex Functions

no code implementations • ICML 2018 • Pan Xu, Tianhao Wang, Quanquan Gu

We provide a second-order stochastic differential equation (SDE), which characterizes the continuous-time dynamics of accelerated stochastic mirror descent (ASMD) for strongly convex functions.

Stochastic Optimization

On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization

no code implementations • 16 Aug 2018 • Dongruo Zhou, Jinghui Chen, Yuan Cao, Yiqi Tang, Ziyan Yang, Quanquan Gu

In this paper, we provide a fine-grained convergence analysis for a general class of adaptive gradient methods including AMSGrad, RMSProp and AdaGrad.

Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks

no code implementations • 21 Nov 2018 • Difan Zou, Yuan Cao, Dongruo Zhou, Quanquan Gu

In particular, we study the binary classification problem and show that for a broad family of loss functions, with proper random weight initialization, both gradient descent and stochastic gradient descent can find the global minima of the training loss for an over-parameterized deep ReLU network, under mild assumption on the training data.

Binary Classification

A Frank-Wolfe Framework for Efficient and Effective Adversarial Attacks

2 code implementations • ICLR 2019 • Jinghui Chen, Dongruo Zhou, Jin-Feng Yi, Quanquan Gu

Depending on how much information an adversary can access, adversarial attacks can be classified as white-box attacks or black-box attacks.

Adversarial Attack

Sample Efficient Stochastic Variance-Reduced Cubic Regularization Method

no code implementations • 29 Nov 2018 • Dongruo Zhou, Pan Xu, Quanquan Gu

The proposed algorithm achieves a lower sample complexity of Hessian matrix computation than existing cubic regularization based methods.

Third-order Smoothness Helps: Faster Stochastic Optimization Algorithms for Finding Local Minima

no code implementations • NeurIPS 2018 • Yaodong Yu, Pan Xu, Quanquan Gu

Stochastic Optimization

Distributed Learning without Distress: Privacy-Preserving Empirical Risk Minimization

1 code implementation • NeurIPS 2018 • Bargav Jayaraman, Lingxiao Wang, David Evans, Quanquan Gu

We explore two popular methods of differential privacy, output perturbation and gradient perturbation, and advance the state-of-the-art for both methods in the distributed learning setting.

Privacy Preserving

Stochastic Nested Variance Reduced Gradient Descent for Nonconvex Optimization

no code implementations • NeurIPS 2018 • Dongruo Zhou, Pan Xu, Quanquan Gu

We study finite-sum nonconvex optimization problems, where the objective function is an average of $n$ nonconvex functions.

Lower Bounds for Smooth Nonconvex Finite-Sum Optimization

no code implementations • 31 Jan 2019 • Dongruo Zhou, Quanquan Gu

We prove tight lower bounds for the complexity of finding $\epsilon$-suboptimal point and $\epsilon$-approximate stationary point in different settings, for a wide regime of the smallest eigenvalue of the Hessian of the objective function (or each component function).

Stochastic Recursive Variance-Reduced Cubic Regularization Methods

no code implementations • 31 Jan 2019 • Dongruo Zhou, Quanquan Gu

Built upon SRVRC, we further propose a Hessian-free SRVRC algorithm, namely SRVRC$_{\text{free}}$, which only requires stochastic gradient and Hessian-vector product computations, and achieves $\tilde O(dn\epsilon^{-2} \land d\epsilon^{-3})$ runtime complexity, where $n$ is the number of component functions in the finite-sum structure, $d$ is the problem dimension, and $\epsilon$ is the optimization precision.

Generalization Error Bounds of Gradient Descent for Learning Over-parameterized Deep ReLU Networks

no code implementations • 4 Feb 2019 • Yuan Cao, Quanquan Gu

However, existing generalization error bounds are unable to explain the good generalization performance of over-parameterized DNNs.

Generalization Bounds

An Improved Convergence Analysis of Stochastic Variance-Reduced Policy Gradient

no code implementations • 29 May 2019 • Pan Xu, Felicia Gao, Quanquan Gu

We revisit the stochastic variance-reduced policy gradient (SVRPG) method proposed by Papini et al. (2018) for reinforcement learning.

reinforcement-learning • Reinforcement Learning (RL)

Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks

no code implementations • NeurIPS 2019 • Yuan Cao, Quanquan Gu

We study the training and generalization of deep neural networks (DNNs) in the over-parameterized regime, where the network width (i.e., number of hidden nodes per layer) is much larger than the number of training data points.

Generalization Bounds

An Improved Analysis of Training Over-parameterized Deep Neural Networks

no code implementations • NeurIPS 2019 • Difan Zou, Quanquan Gu

A recent line of research has shown that gradient-based algorithms with random initialization can converge to the global minima of the training loss for over-parameterized (i.e., sufficiently wide) deep neural networks.

DP-LSSGD: A Stochastic Optimization Method to Lift the Utility in Privacy-Preserving ERM

1 code implementation • 28 Jun 2019 • Bao Wang, Quanquan Gu, March Boedihardjo, Farzin Barekat, Stanley J. Osher

At the core of DP-LSSGD is the Laplacian smoothing, which smooths out the Gaussian noise used in the Gaussian mechanism.

Privacy Preserving • Stochastic Optimization

A Knowledge Transfer Framework for Differentially Private Sparse Learning

no code implementations • 13 Sep 2019 • Lingxiao Wang, Quanquan Gu

We study the problem of estimating high dimensional models with underlying sparse structures while preserving the privacy of each training example.

regression • Sparse Learning +1

Sample Efficient Policy Gradient Methods with Recursive Variance Reduction

1 code implementation • ICLR 2020 • Pan Xu, Felicia Gao, Quanquan Gu

Improving the sample efficiency in reinforcement learning has been a long-standing research problem.

Policy Gradient Methods • reinforcement-learning +1

NeuralUCB: Contextual Bandits with Neural Network-Based Exploration

no code implementations • 25 Sep 2019 • Dongruo Zhou, Lihong Li, Quanquan Gu

To the best of our knowledge, our algorithm is the first neural network-based contextual bandit algorithm with near-optimal regret guarantee.

Efficient Exploration • Multi-Armed Bandits

Training Deep Neural Networks with Partially Adaptive Momentum

no code implementations • 25 Sep 2019 • Jinghui Chen, Dongruo Zhou, Yiqi Tang, Ziyan Yang, Yuan Cao, Quanquan Gu

Experiments on standard benchmarks show that our proposed algorithm can maintain a convergence rate as fast as Adam/AMSGrad while generalizing as well as SGD in training deep neural networks.

On the Dynamics and Convergence of Weight Normalization for Training Neural Networks

no code implementations • 25 Sep 2019 • Yonatan Dukler, Quanquan Gu, Guido Montufar

We present a proof of convergence for ReLU networks trained with weight normalization.

Algorithm-Dependent Generalization Bounds for Overparameterized Deep Residual Networks

no code implementations • NeurIPS 2019 • Spencer Frei, Yuan Cao, Quanquan Gu

The skip-connections used in residual networks have become a standard architecture choice in deep learning due to the increased training stability and generalization performance with this architecture, although there has been limited theoretical understanding for this improvement.

Generalization Bounds

Efficient Privacy-Preserving Stochastic Nonconvex Optimization

no code implementations • 30 Oct 2019 • Lingxiao Wang, Bargav Jayaraman, David Evans, Quanquan Gu

While many solutions for privacy-preserving convex empirical risk minimization (ERM) have been developed, privacy-preserving nonconvex ERM remains a challenge.

Privacy Preserving

Laplacian Smoothing Stochastic Gradient Markov Chain Monte Carlo

1 code implementation • 2 Nov 2019 • Bao Wang, Difan Zou, Quanquan Gu, Stanley Osher

As an important Markov Chain Monte Carlo (MCMC) method, stochastic gradient Langevin dynamics (SGLD) algorithm has achieved great success in Bayesian learning and posterior sampling.

Neural Contextual Bandits with UCB-based Exploration

4 code implementations • ICML 2020 • Dongruo Zhou, Lihong Li, Quanquan Gu

To the best of our knowledge, it is the first neural network-based contextual bandit algorithm with a near-optimal regret guarantee.

Efficient Exploration • Multi-Armed Bandits

Tight Sample Complexity of Learning One-hidden-layer Convolutional Neural Networks

no code implementations • NeurIPS 2019 • Yuan Cao, Quanquan Gu

We study the sample complexity of learning one-hidden-layer convolutional neural networks (CNNs) with non-overlapping filters.

Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks

1 code implementation • NeurIPS 2019 • Difan Zou, Ziniu Hu, Yewen Wang, Song Jiang, Yizhou Sun, Quanquan Gu

Original full-batch GCN training requires calculating the representation of all the nodes in the graph per GCN layer, which brings in high computation and memory costs.

Node Classification

How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks?

no code implementations • ICLR 2021 • Zixiang Chen, Yuan Cao, Difan Zou, Quanquan Gu

A recent line of research on deep learning focuses on the extremely over-parameterized setting, and shows that when the network width is larger than a high degree polynomial of the training sample size $n$ and the inverse of the target error $\epsilon^{-1}$, deep neural networks learned by (stochastic) gradient descent enjoy nice optimization and generalization guarantees.

Open-Ended Question Answering

Stochastic Gradient Hamiltonian Monte Carlo Methods with Recursive Variance Reduction

1 code implementation • NeurIPS 2019 • Difan Zou, Pan Xu, Quanquan Gu

Stochastic Gradient Hamiltonian Monte Carlo (SGHMC) algorithms have received increasing attention in both theory and practice.

Rank Aggregation via Heterogeneous Thurstone Preference Models

1 code implementation • 3 Dec 2019 • Tao Jin, Pan Xu, Quanquan Gu, Farzad Farnoud

By allowing different noise distributions, the proposed HTM model maintains the generality of Thurstone's original framework, and as such, also extends the Bradley-Terry-Luce (BTL) model for pairwise comparisons to heterogeneous populations of users.

Towards Understanding the Spectral Bias of Deep Learning

no code implementations • 3 Dec 2019 • Yuan Cao, Zhiying Fang, Yue Wu, Ding-Xuan Zhou, Quanquan Gu

An intriguing phenomenon observed during training neural networks is the spectral bias, which states that neural networks are biased towards learning less complex functions.

A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation

no code implementations • 10 Dec 2019 • Pan Xu, Quanquan Gu

Q-learning with neural network function approximation (neural Q-learning for short) is among the most prevalent deep reinforcement learning algorithms.

Q-Learning • Reinforcement Learning (RL)

A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks

no code implementations • NeurIPS 2020 • Zixiang Chen, Yuan Cao, Quanquan Gu, Tong Zhang

In this paper, we provide a generalized neural tangent kernel analysis and show that noisy gradient descent with weight decay can still exhibit a "kernel-like" behavior.

Learning Theory • Vocal Bursts Valence Prediction

Double Explore-then-Commit: Asymptotic Optimality and Beyond

no code implementations • 21 Feb 2020 • Tianyuan Jin, Pan Xu, Xiaokui Xiao, Quanquan Gu

In this paper, we show that a variant of the ETC algorithm can actually achieve asymptotic optimality for multi-armed bandit problems, as UCB-type algorithms do, and extend it to the batched bandit setting.

Understanding the Intrinsic Robustness of Image Distributions using Conditional Generative Models

1 code implementation • 1 Mar 2020 • Xiao Zhang, Jinghui Chen, Quanquan Gu, David Evans

Starting with Gilmer et al. (2018), several works have demonstrated the inevitability of adversarial examples based on different assumptions about the underlying input probability space.

Adversarial Robustness

On the Global Convergence of Training Deep Linear ResNets

no code implementations • ICLR 2020 • Difan Zou, Philip M. Long, Quanquan Gu

We further propose modified identity input and output transformations, and show that a $(d+k)$-wide neural network is sufficient to guarantee the global convergence of GD/SGD, where $d, k$ are the input and output dimensions respectively.

MOTS: Minimax Optimal Thompson Sampling

no code implementations • 3 Mar 2020 • Tianyuan Jin, Pan Xu, Jieming Shi, Xiaokui Xiao, Quanquan Gu

Thompson sampling is one of the most widely used algorithms for many online decision problems, due to its simplicity in implementation and superior empirical performance over other state-of-the-art methods.

Thompson Sampling

Improving Adversarial Robustness Requires Revisiting Misclassified Examples

1 code implementation • ICLR 2020 • Yisen Wang, Difan Zou, Jin-Feng Yi, James Bailey, Xingjun Ma, Quanquan Gu

In this paper, we investigate the distinctive influence of misclassified and correctly classified examples on the final robustness of adversarial training.

Adversarial Robustness

Differentially Private Federated Learning with Laplacian Smoothing

no code implementations • 1 May 2020 • Zhicong Liang, Bao Wang, Quanquan Gu, Stanley Osher, Yuan Yao

Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.

Federated Learning

Improving Neural Language Generation with Spectrum Control

no code implementations • ICLR 2020 • Lingxiao Wang, Jing Huang, Kevin Huang, Ziniu Hu, Guangtao Wang, Quanquan Gu

Recent Transformer-based models such as Transformer-XL and BERT have achieved huge success on various natural language processing tasks.

Language Modelling • Machine Translation +2

A Finite Time Analysis of Two Time-Scale Actor Critic Methods

no code implementations • 4 May 2020 • Yue Wu, Weitong Zhang, Pan Xu, Quanquan Gu

In this work, we provide a non-asymptotic analysis for two time-scale actor-critic methods under non-i.i.d.

Vocal Bursts Valence Prediction

Revisiting Membership Inference Under Realistic Assumptions

1 code implementation • 21 May 2020 • Bargav Jayaraman, Lingxiao Wang, Katherine Knipmeyer, Quanquan Gu, David Evans

Since previous inference attacks fail in the imbalanced prior setting, we develop a new inference attack based on the intuition that inputs corresponding to training set members will be near a local minimum in the loss function, and show that an attack that combines this with thresholds on the per-instance loss can achieve high PPV even in settings where other attacks appear to be ineffective.

Inference Attack

Agnostic Learning of a Single Neuron with Gradient Descent

no code implementations • NeurIPS 2020 • Spencer Frei, Yuan Cao, Quanquan Gu

In the agnostic PAC learning setting, where no assumption on the relationship between the labels $y$ and the input $x$ is made, if the optimal population risk is $\mathsf{OPT}$, we show that gradient descent achieves population risk $O(\mathsf{OPT})+\epsilon$ in polynomial time and sample complexity when $\sigma$ is strictly increasing.

PAC learning

Optimization Theory for ReLU Neural Networks Trained with Normalization Layers

no code implementations • ICML 2020 • Yonatan Dukler, Quanquan Gu, Guido Montúfar

The success of deep neural networks is in part due to the use of normalization layers.

Learning Theory

Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping

no code implementations • 23 Jun 2020 • Dongruo Zhou, Jiafan He, Quanquan Gu

We propose a novel algorithm that makes use of the feature mapping and obtains a $\tilde O(d\sqrt{T}/(1-\gamma)^2)$ regret, where $d$ is the dimension of the feature space, $T$ is the time horizon and $\gamma$ is the discount factor of the MDP.

reinforcement-learning • Reinforcement Learning (RL)

RayS: A Ray Searching Method for Hard-label Adversarial Attack

1 code implementation • 23 Jun 2020 • Jinghui Chen, Quanquan Gu

Deep neural networks are vulnerable to adversarial attacks.

Ranked #1 on Hard-label Attack on MNIST

Adversarial Attack • Hard-label Attack

Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs

no code implementations • NeurIPS 2021 • Jiafan He, Dongruo Zhou, Quanquan Gu

We study the reinforcement learning problem for discounted Markov Decision Processes (MDPs) under the tabular setting.

reinforcement-learning • Reinforcement Learning (RL)

Agnostic Learning of Halfspaces with Gradient Descent via Soft Margins

no code implementations • 1 Oct 2020 • Spencer Frei, Yuan Cao, Quanquan Gu

We analyze the properties of gradient descent on convex surrogates for the zero-one loss for the agnostic learning of linear halfspaces.

General Classification

Neural Thompson Sampling

2 code implementations • ICLR 2021 • Weitong Zhang, Dongruo Zhou, Lihong Li, Quanquan Gu

Thompson Sampling (TS) is one of the most effective algorithms for solving contextual multi-armed bandit problems.

Multi-Armed Bandits • Thompson Sampling

Efficient Robust Training via Backward Smoothing

1 code implementation • 3 Oct 2020 • Jinghui Chen, Yu Cheng, Zhe Gan, Quanquan Gu, Jingjing Liu

In this work, we develop a new understanding towards Fast Adversarial Training, by viewing random initialization as performing randomized smoothing for better optimization of the inner maximization problem.

Paper
Code

Do Wider Neural Networks Really Help Adversarial Robustness?

1 code implementation • NeurIPS 2021 • Boxi Wu, Jinghui Chen, Deng Cai, Xiaofei He, Quanquan Gu

Previous empirical results suggest that adversarial training requires wider networks for better performances.

Adversarial Robustness

Paper
Code

Faster Convergence of Stochastic Gradient Langevin Dynamics for Non-Log-Concave Sampling

no code implementations • 19 Oct 2020 • Difan Zou, Pan Xu, Quanquan Gu

We provide a new convergence analysis of stochastic gradient Langevin dynamics (SGLD) for sampling from a class of distributions that can be non-log-concave.

Paper
Add Code

Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate

no code implementations • ICLR 2021 • Jingfeng Wu, Difan Zou, Vladimir Braverman, Quanquan Gu

Understanding the algorithmic bias of stochastic gradient descent (SGD) is one of the key challenges in modern machine learning and deep learning theory.

Learning Theory

Paper
Add Code

Provable Multi-Objective Reinforcement Learning with Generative Models

no code implementations • 19 Nov 2020 • Dongruo Zhou, Jiahao Chen, Quanquan Gu

Multi-objective reinforcement learning (MORL) is an extension of ordinary, single-objective reinforcement learning (RL) that is applicable to many real-world tasks where multiple objectives exist without known relative costs.

Multi-Objective Reinforcement Learning Q-Learning +1

Paper
Add Code

Logarithmic Regret for Reinforcement Learning with Linear Function Approximation

no code implementations • 23 Nov 2020 • Jiafan He, Dongruo Zhou, Quanquan Gu

Reinforcement learning (RL) with linear function approximation has received increasing attention recently.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

A Finite-Time Analysis of Two Time-Scale Actor-Critic Methods

no code implementations • NeurIPS 2020 • Yue Wu, Weitong Zhang, Pan Xu, Quanquan Gu

In this work, we provide a non-asymptotic analysis for two time-scale actor-critic methods under non-i.i.d.

Vocal Bursts Valence Prediction

Paper
Add Code

Neural Contextual Bandits with Deep Representation and Shallow Exploration

no code implementations • NeurIPS 2021 • Pan Xu, Zheng Wen, Handong Zhao, Quanquan Gu

We study a general class of contextual bandits, where each context-action pair is associated with a raw feature vector, but the reward generating function is unknown.

Multi-Armed Bandits Representation Learning

Paper
Add Code

Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes

no code implementations • 15 Dec 2020 • Dongruo Zhou, Quanquan Gu, Csaba Szepesvari

Based on the new inequality, we propose a new, computationally efficient algorithm with linear function approximation named $\text{UCRL-VTR}^{+}$ for the aforementioned linear mixture MDPs in the episodic undiscounted setting.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise

1 code implementation • 4 Jan 2021 • Spencer Frei, Yuan Cao, Quanquan Gu

We consider a one-hidden-layer leaky ReLU network of arbitrary width trained by stochastic gradient descent (SGD) following an arbitrary initialization.

Paper
Code

Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints

no code implementations • NeurIPS 2021 • Tianhao Wang, Dongruo Zhou, Quanquan Gu

Specifically, for the batch learning model, our proposed LSVI-UCB-Batch algorithm achieves an $\tilde O(\sqrt{d^3H^3T} + dHT/B)$ regret, where $d$ is the dimension of the feature mapping, $H$ is the episode length, $T$ is the number of interactions and $B$ is the number of batches.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Almost Optimal Algorithms for Two-player Zero-Sum Linear Mixture Markov Games

no code implementations • 15 Feb 2021 • Zixiang Chen, Dongruo Zhou, Quanquan Gu

To assess the optimality of our algorithm, we also prove an $\tilde{\Omega}( dH\sqrt{T})$ lower bound on the regret.

Paper
Add Code

Nearly Minimax Optimal Regret for Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation

no code implementations • 15 Feb 2021 • Yue Wu, Dongruo Zhou, Quanquan Gu

We study reinforcement learning in an infinite-horizon average-reward setting with linear function approximation, where the transition probability function of the underlying Markov Decision Process (MDP) admits a linear form over a feature mapping of the current state, action, and next state.

Paper
Add Code

Near-optimal Policy Optimization Algorithms for Learning Adversarial Linear Mixture MDPs

no code implementations • 17 Feb 2021 • Jiafan He, Dongruo Zhou, Quanquan Gu

In this paper, we study RL in episodic MDPs with adversarial reward and full information feedback, where the unknown transition probability function is a linear function of a given feature mapping, and the reward function can change arbitrarily episode by episode.

Reinforcement Learning (RL)

Paper
Add Code

Batched Neural Bandits

no code implementations • 25 Feb 2021 • Quanquan Gu, Amin Karbasi, Khashayar Khosravi, Vahab Mirrokni, Dongruo Zhou

In many sequential decision-making problems, the individuals are split into several batches and the decision-maker is only allowed to change her policy at the end of batches.

Decision Making

Paper
Add Code

Benign Overfitting of Constant-Stepsize SGD for Linear Regression

no code implementations • 23 Mar 2021 • Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Sham M. Kakade

More specifically, for SGD with iterate averaging, we demonstrate the sharpness of the established excess risk bound by proving a matching lower bound (up to constant factors).

regression

Paper
Add Code

Provable Robustness of Adversarial Training for Learning Halfspaces with Noise

no code implementations • 19 Apr 2021 • Difan Zou, Spencer Frei, Quanquan Gu

To the best of our knowledge, this is the first work to show that adversarial training provably yields robust classifiers in the presence of noise.

Classification General Classification +1

Paper
Add Code

Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures

no code implementations • NeurIPS 2021 • Yuan Cao, Quanquan Gu, Mikhail Belkin

In this paper, we study this "benign overfitting" phenomenon of the maximum margin classifier for linear classification problems.

Classification General Classification +1

Paper
Add Code

Variance-reduced First-order Meta-learning for Natural Language Processing Tasks

no code implementations • NAACL 2021 • Lingxiao Wang, Kevin Huang, Tengyu Ma, Quanquan Gu, Jing Huang

The core of our algorithm is to introduce a novel variance reduction term to the gradient estimation when performing the task adaptation.

dialog state tracking Few-Shot Text Classification +2

Paper
Add Code

Provably Efficient Representation Selection in Low-rank Markov Decision Processes: From Online to Offline RL

no code implementations • 22 Jun 2021 • Weitong Zhang, Jiafan He, Dongruo Zhou, Amy Zhang, Quanquan Gu

For the offline counterpart, ReLEX-LCB, we show that the algorithm can find the optimal policy if the representation class can cover the state-action space and achieves gap-dependent sample complexity.

Offline RL reinforcement-learning +2

Paper
Add Code

Variance-Aware Off-Policy Evaluation with Linear Function Approximation

no code implementations • NeurIPS 2021 • Yifei Min, Tianhao Wang, Dongruo Zhou, Quanquan Gu

We study the off-policy evaluation (OPE) problem in reinforcement learning with linear function approximation, which aims to estimate the value function of a target policy based on the offline data collected by a behavior policy.

Off-policy evaluation

Paper
Add Code

Uniform-PAC Bounds for Reinforcement Learning with Linear Function Approximation

no code implementations • NeurIPS 2021 • Jiafan He, Dongruo Zhou, Quanquan Gu

The uniform-PAC guarantee is the strongest possible guarantee for reinforcement learning in the literature, which can directly imply both PAC and high probability regret bounds, making our algorithm superior to all existing algorithms with linear function approximation.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Pure Exploration in Kernel and Neural Bandits

no code implementations • NeurIPS 2021 • Yinglun Zhu, Dongruo Zhou, Ruoxi Jiang, Quanquan Gu, Rebecca Willett, Robert Nowak

To overcome the curse of dimensionality, we propose to adaptively embed the feature representation of each arm into a lower-dimensional space and carefully deal with the induced model misspecification.

Paper
Add Code

Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent

no code implementations • NeurIPS 2021 • Spencer Frei, Quanquan Gu

We further show that many existing guarantees for neural networks trained by gradient descent can be unified through proxy convexity and proxy PL inequalities.

Paper
Add Code

Self-training Converts Weak Learners to Strong Learners in Mixture Models

no code implementations • 25 Jun 2021 • Spencer Frei, Difan Zou, Zixiang Chen, Quanquan Gu

We show that there exists a universal constant $C_{\mathrm{err}}>0$ such that if a pseudolabeler $\boldsymbol{\beta}_{\mathrm{pl}}$ can achieve classification error at most $C_{\mathrm{err}}$, then for any $\varepsilon>0$, an iterative self-training algorithm initialized at $\boldsymbol{\beta}_0 := \boldsymbol{\beta}_{\mathrm{pl}}$ using pseudolabels $\hat y = \mathrm{sgn}(\langle \boldsymbol{\beta}_t, \mathbf{x}\rangle)$ and using at most $\tilde O(d/\varepsilon^2)$ unlabeled examples suffices to learn the Bayes-optimal classifier up to $\varepsilon$ error, where $d$ is the ambient dimension.

Binary Classification

Paper
Add Code
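
The self-training loop described in the entry above can be illustrated with a minimal sketch. It is not the authors' exact algorithm: the logistic loss, the step size, the batch size, the renormalization step, and the symmetric two-component Gaussian mixture are all assumptions chosen for illustration, and the names `self_train`, `sample_unlabeled`, and `demo` are hypothetical. The sketch only shows the general pattern of starting from a weak pseudolabeler $\boldsymbol{\beta}_{\mathrm{pl}}$, labeling fresh unlabeled data with $\hat y = \mathrm{sgn}(\langle \boldsymbol{\beta}_t, \mathbf{x}\rangle)$, and updating on those pseudolabels.

```python
import numpy as np


def self_train(beta_pl, sample_unlabeled, steps=300, batch=512, lr=0.5):
    """Iterative self-training sketch (assumed update rule, not the paper's)."""
    beta = beta_pl / np.linalg.norm(beta_pl)
    for _ in range(steps):
        x = sample_unlabeled(batch)                # (batch, d) unlabeled examples
        scores = x @ beta
        y_hat = np.where(scores >= 0, 1.0, -1.0)   # pseudolabels sgn(<beta_t, x>)
        margins = y_hat * scores                   # equals |scores|
        # Gradient of mean log(1 + exp(-margin)) with the pseudolabels held fixed.
        grad = -(y_hat / (1.0 + np.exp(margins)))[:, None] * x
        beta = beta - lr * grad.mean(axis=0)
        beta /= np.linalg.norm(beta)               # keep the iterate on the unit sphere
    return beta


def demo(seed=0, d=20, sep=2.0, cos0=0.5):
    """Return the cosine between the classifier and the true mean direction,
    before and after self-training, on an assumed mixture x = y * mu + noise."""
    rng = np.random.default_rng(seed)
    mu_dir = np.zeros(d)
    mu_dir[0] = 1.0
    mu = sep * mu_dir

    def sample_unlabeled(n):
        y = rng.choice([-1.0, 1.0], size=n)        # hidden labels, never shown to the learner
        return y[:, None] * mu + rng.standard_normal((n, d))

    # Weak pseudolabeler: unit vector whose cosine with mu_dir is cos0.
    beta_pl = np.zeros(d)
    beta_pl[0] = cos0
    beta_pl[1] = np.sqrt(1.0 - cos0 ** 2)

    beta = self_train(beta_pl, sample_unlabeled)
    return float(beta_pl @ mu_dir), float(beta @ mu_dir)
```

Calling `demo()` returns the alignment with the Bayes-optimal direction before and after training. In this toy mixture the alignment should increase, because the expected update always has a positive component along the true mean whenever the current iterate is positively correlated with it.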

The Benefits of Implicit Regularization from SGD in Least Squares Problems

no code implementations • NeurIPS 2021 • Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Dean P. Foster, Sham M. Kakade

Stochastic gradient descent (SGD) exhibits strong algorithmic regularization effects in practice, which has been hypothesized to play an important role in the generalization of modern machine learning approaches.

regression

Paper
Add Code

Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization

no code implementations • 25 Aug 2021 • Difan Zou, Yuan Cao, Yuanzhi Li, Quanquan Gu

In this paper, we provide a theoretical explanation for this phenomenon: we show that in the nonconvex setting of learning over-parameterized two-layer convolutional neural networks starting from the same random initialization, for a class of data distributions (inspired from image data), Adam and gradient descent (GD) can converge to different global solutions of the training objective with provably different generalization errors, even with weight decay regularization.

Image Classification

Paper
Add Code

Iterative Teacher-Aware Learning

1 code implementation • NeurIPS 2021 • Luyao Yuan, Dongruo Zhou, Junhong Shen, Jingdong Gao, Jeffrey L. Chen, Quanquan Gu, Ying Nian Wu, Song-Chun Zhu

Recently, the benefits of integrating this cooperative pedagogy into machine concept learning in discrete spaces have been proved by multiple works.

Paper
Code

Exploring Architectural Ingredients of Adversarially Robust Deep Neural Networks

1 code implementation • NeurIPS 2021 • Hanxun Huang, Yisen Wang, Sarah Monazam Erfani, Quanquan Gu, James Bailey, Xingjun Ma

Specifically, we make the following key observations: 1) more parameters (higher model capacity) does not necessarily help adversarial robustness; 2) reducing capacity at the last stage (the last group of blocks) of the network can actually improve adversarial robustness; and 3) under the same parameter budget, there exists an optimal architectural configuration for adversarial robustness.

Adversarial Robustness

Paper
Code

Adaptive Sampling for Heterogeneous Rank Aggregation from Noisy Pairwise Comparisons

1 code implementation • 8 Oct 2021 • Yue Wu, Tao Jin, Hao Lou, Pan Xu, Farzad Farnoud, Quanquan Gu

In heterogeneous rank aggregation problems, users often exhibit various accuracy levels when comparing pairs of items.

Paper
Code

Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression

no code implementations • 12 Oct 2021 • Jingfeng Wu, Difan Zou, Vladimir Braverman, Quanquan Gu, Sham M. Kakade

In this paper, we provide a problem-dependent analysis on the last iterate risk bounds of SGD with decaying stepsize, for (overparameterized) linear regression problems.

regression

Paper
Add Code

Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation

no code implementations • NeurIPS 2021 • Weitong Zhang, Dongruo Zhou, Quanquan Gu

By constructing a special class of linear mixture MDPs, we also prove that any reward-free algorithm needs to sample at least $\tilde \Omega(H^2d\epsilon^{-2})$ episodes to obtain an $\epsilon$-optimal policy.

Model-based Reinforcement Learning reinforcement-learning +1

Paper
Add Code

Adaptive Differentially Private Empirical Risk Minimization

no code implementations • 14 Oct 2021 • Xiaoxia Wu, Lingxiao Wang, Irina Cristali, Quanquan Gu, Rebecca Willett

We propose an adaptive (stochastic) gradient perturbation method for differentially private empirical risk minimization.

Paper
Add Code

Locally Differentially Private Reinforcement Learning for Linear Mixture Markov Decision Processes

no code implementations • 19 Oct 2021 • Chonghua Liao, Jiafan He, Quanquan Gu

To the best of our knowledge, this is the first provable privacy-preserving RL algorithm with linear function approximation.

Privacy Preserving reinforcement-learning +1

Paper
Add Code

Faster Perturbed Stochastic Gradient Methods for Finding Local Minima

no code implementations • NeurIPS 2021 • Zixiang Chen, Dongruo Zhou, Quanquan Gu

In this paper, we propose LENA (Last stEp shriNkAge), a faster perturbed stochastic gradient framework for finding local minima.

Paper
Add Code

Learning Stochastic Shortest Path with Linear Function Approximation

no code implementations • 25 Oct 2021 • Yifei Min, Jiafan He, Tianhao Wang, Quanquan Gu

To the best of our knowledge, this is the first algorithm with a sublinear regret guarantee for learning linear mixture SSP.

Paper
Add Code

Linear Contextual Bandits with Adversarial Corruptions

no code implementations • NeurIPS 2021 • Heyang Zhao, Dongruo Zhou, Quanquan Gu

We study the linear contextual bandit problem in the presence of adversarial corruption, where the interaction between the player and a possibly infinite decision set is contaminated by an adversary that can corrupt the reward up to a corruption level $C$ measured by the sum of the largest alteration on rewards in each round.

Multi-Armed Bandits

Paper
Add Code

On the Convergence and Robustness of Adversarial Training

no code implementations • 15 Dec 2021 • Yisen Wang, Xingjun Ma, James Bailey, Jinfeng Yi, Bowen Zhou, Quanquan Gu

In this paper, we propose such a criterion, namely First-Order Stationary Condition for constrained optimization (FOSC), to quantitatively evaluate the convergence quality of adversarial examples found in the inner maximization.

Paper
Add Code

Benign Overfitting in Adversarially Robust Linear Classification

no code implementations • 31 Dec 2021 • Jinghui Chen, Yuan Cao, Quanquan Gu

Our result suggests that under moderate perturbations, adversarially trained linear classifiers can achieve the near-optimal standard and adversarial risks, despite overfitting the noisy training data.

Classification

Paper
Add Code

Learning Neural Contextual Bandits Through Perturbed Rewards

no code implementations • ICLR 2022 • Yiling Jia, Weitong Zhang, Dongruo Zhou, Quanquan Gu, Hongning Wang

Thanks to the power of representation learning, neural contextual bandit algorithms demonstrate remarkable performance improvement against their classical counterparts.

Computational Efficiency Multi-Armed Bandits +1

Paper
Add Code

Benign Overfitting in Two-layer Convolutional Neural Networks

no code implementations • 14 Feb 2022 • Yuan Cao, Zixiang Chen, Mikhail Belkin, Quanquan Gu

In this paper, we study the benign overfitting phenomenon in training a two-layer convolutional neural network (CNN).

Vocal Bursts Valence Prediction

Paper
Add Code

Optimal Online Generalized Linear Regression with Stochastic Noise and Its Application to Heteroscedastic Bandits

no code implementations • 28 Feb 2022 • Heyang Zhao, Dongruo Zhou, Jiafan He, Quanquan Gu

We study the problem of online generalized linear regression in the stochastic setting, where the label is generated from a generalized linear model with possibly unbounded additive noise.

regression

Paper
Add Code

Risk Bounds of Multi-Pass SGD for Least Squares in the Interpolation Regime

no code implementations • 7 Mar 2022 • Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Sham M. Kakade

Stochastic gradient descent (SGD) has achieved great success due to its superior performance in both optimization and generalization.

Paper
Add Code

On the Convergence of Certified Robust Training with Interval Bound Propagation

no code implementations • ICLR 2022 • Yihan Wang, Zhouxing Shi, Quanquan Gu, Cho-Jui Hsieh

Interval Bound Propagation (IBP) is so far the base of state-of-the-art methods for training neural networks with certifiable robustness guarantees when potential adversarial perturbations are present, while the convergence of IBP training remains unknown in the existing literature.

Paper
Add Code

Nearly Optimal Algorithms for Linear Contextual Bandits with Adversarial Corruptions

no code implementations • 13 May 2022 • Jiafan He, Dongruo Zhou, Tong Zhang, Quanquan Gu

We show that for both known $C$ and unknown $C$ cases, our algorithm with proper choice of hyperparameter achieves a regret that nearly matches the lower bounds.

Multi-Armed Bandits

Paper
Add Code

Computationally Efficient Horizon-Free Reinforcement Learning for Linear Mixture MDPs

no code implementations • 23 May 2022 • Dongruo Zhou, Quanquan Gu

When applying our weighted least square estimator to heterogeneous linear bandits, we can obtain an $\tilde O(d\sqrt{\sum_{k=1}^K \sigma_k^2} +d)$ regret in the first $K$ rounds, where $d$ is the dimension of the context and $\sigma_k^2$ is the variance of the reward in the $k$-th round.

Multi-Armed Bandits reinforcement-learning +1

Paper
Add Code

A Simple and Provably Efficient Algorithm for Asynchronous Federated Contextual Linear Bandits

no code implementations • 7 Jul 2022 • Jiafan He, Tianhao Wang, Yifei Min, Quanquan Gu

To the best of our knowledge, this is the first provably efficient algorithm that allows fully asynchronous communication for federated contextual linear bandits, while achieving the same regret guarantee as in the single-agent setting.

Paper
Add Code

The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift

no code implementations • 3 Aug 2022 • Jingfeng Wu, Difan Zou, Vladimir Braverman, Quanquan Gu, Sham M. Kakade

Our bounds suggest that for a large class of linear regression instances, transfer learning with $O(N^2)$ source data (and scarce or no target data) is as effective as supervised learning with $N$ target data.

regression Transfer Learning

Paper
Add Code

Towards Understanding Mixture of Experts in Deep Learning

2 code implementations • 4 Aug 2022 • Zixiang Chen, Yihe Deng, Yue Wu, Quanquan Gu, Yuanzhi Li

To our knowledge, this is the first result towards formally understanding the mechanism of the MoE layer for deep learning.

Paper
Code

Learning Two-Player Mixture Markov Games: Kernel Function Approximation and Correlated Equilibrium

no code implementations • 10 Aug 2022 • Chris Junchi Li, Dongruo Zhou, Quanquan Gu, Michael I. Jordan

We consider learning Nash equilibria in two-player zero-sum Markov Games with nonlinear function approximation, where the action-value function is approximated by a function in a Reproducing Kernel Hilbert Space (RKHS).

Paper
Add Code

A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning

no code implementations • 30 Sep 2022 • Zixiang Chen, Chris Junchi Li, Angela Yuan, Quanquan Gu, Michael I. Jordan

With the increasing need for handling large state and action spaces, general function approximation has become a key technique in reinforcement learning (RL).

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Nesterov Meets Optimism: Rate-Optimal Separable Minimax Optimization

no code implementations • 31 Oct 2022 • Chris Junchi Li, Angela Yuan, Gauthier Gidel, Quanquan Gu, Michael I. Jordan

AG-OG is the first single-call algorithm with optimal convergence rates in both deterministic and stochastic settings for bilinearly coupled minimax optimization problems.

Paper
Add Code

Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes

no code implementations • 12 Dec 2022 • Chenlu Ye, Wei Xiong, Quanquan Gu, Tong Zhang

In this paper, we consider the contextual bandit with general function approximation and propose a computationally efficient algorithm to achieve a regret of $\tilde{O}(\sqrt{T}+\zeta)$.

Multi-Armed Bandits Reinforcement Learning (RL)

Paper
Add Code

Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes

no code implementations • 12 Dec 2022 • Jiafan He, Heyang Zhao, Dongruo Zhou, Quanquan Gu

We study reinforcement learning (RL) with linear function approximation.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Structure-informed Language Models Are Protein Designers

1 code implementation • 3 Feb 2023 • Zaixiang Zheng, Yifan Deng, Dongyu Xue, Yi Zhou, Fei Ye, Quanquan Gu

This paper demonstrates that language models are strong structure-based protein designers.

Paper
Code

Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency

no code implementations • 21 Feb 2023 • Heyang Zhao, Jiafan He, Dongruo Zhou, Tong Zhang, Quanquan Gu

We propose a variance-adaptive algorithm for linear mixture MDPs, which achieves a problem-dependent horizon-free regret bound that can gracefully reduce to a nearly constant regret for deterministic MDPs.

Computational Efficiency Decision Making +1

Paper
Add Code

Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron

no code implementations • 3 Mar 2023 • Jingfeng Wu, Difan Zou, Zixiang Chen, Vladimir Braverman, Quanquan Gu, Sham M. Kakade

On the other hand, we provide some negative results for stochastic gradient descent (SGD) for ReLU regression with symmetric Bernoulli data: if the model is well-specified, the excess risk of SGD is provably no better than that of GLM-tron ignoring constant factors, for each problem instance; and in the noiseless case, GLM-tron can achieve a small risk while SGD unavoidably suffers from a constant risk in expectation.

regression Vocal Bursts Intensity Prediction

Paper
Add Code

Benign Overfitting for Two-layer ReLU Convolutional Neural Networks

1 code implementation • 7 Mar 2023 • Yiwen Kou, Zixiang Chen, Yuanzhou Chen, Quanquan Gu

We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.

Vocal Bursts Valence Prediction

Paper
Code

The Benefits of Mixup for Feature Learning

no code implementations • 15 Mar 2023 • Difan Zou, Yuan Cao, Yuanzhi Li, Quanquan Gu

We consider a feature-noise data model and show that Mixup training can effectively learn the rare features (appearing in a small fraction of data) from its mixture with the common features (appearing in a large fraction of data).

Data Augmentation

Paper
Add Code

Borda Regret Minimization for Generalized Linear Dueling Bandits

no code implementations • 15 Mar 2023 • Yue Wu, Tao Jin, Hao Lou, Farzad Farnoud, Quanquan Gu

To attain this lower bound, we propose an explore-then-commit type algorithm for the stochastic setting, which has a nearly matching regret upper bound $\tilde{O}(d^{2/3} T^{2/3})$.

Recommendation Systems

Paper
Add Code

On the Interplay Between Misspecification and Sub-optimality Gap in Linear Contextual Bandits

no code implementations • 16 Mar 2023 • Weitong Zhang, Jiafan He, Zhiyuan Fan, Quanquan Gu

We show that, when the misspecification level $\zeta$ is dominated by $\tilde O (\Delta / \sqrt{d})$ with $\Delta$ being the minimal sub-optimality gap and $d$ being the dimension of the contextual vectors, our algorithm enjoys the same gap-dependent regret bound $\tilde O (d^2/\Delta)$ as in the well-specified setting up to logarithmic factors.

Multi-Armed Bandits

Paper
Add Code

Optimal Horizon-Free Reward-Free Exploration for Linear Mixture MDPs

no code implementations • 17 Mar 2023 • Junkai Zhang, Weitong Zhang, Quanquan Gu

The sample complexity of our algorithm only has a polylogarithmic dependence on the planning horizon and therefore is "horizon-free".

Reinforcement Learning (RL)

Paper
Add Code

Personalized Federated Learning under Mixture of Distributions

1 code implementation • 1 May 2023 • Yue Wu, Shuaicheng Zhang, Wenchao Yu, Yanchi Liu, Quanquan Gu, Dawei Zhou, Haifeng Chen, Wei Cheng

The recent trend towards Personalized Federated Learning (PFL) has garnered significant attention as it allows for the training of models that are tailored to each client while maintaining data privacy.

Personalized Federated Learning Uncertainty Quantification

Paper
Code

Cooperative Multi-Agent Reinforcement Learning: Asynchronous Communication and Linear Function Approximation

no code implementations • 10 May 2023 • Yifei Min, Jiafan He, Tianhao Wang, Quanquan Gu

We study multi-agent reinforcement learning in the setting of episodic Markov decision processes, where multiple agents cooperate via communication through a central server.

Multi-agent Reinforcement Learning reinforcement-learning

Paper
Add Code

Uniform-PAC Guarantees for Model-Based RL with Bounded Eluder Dimension

no code implementations • 15 May 2023 • Yue Wu, Jiafan He, Quanquan Gu

Recently, there has been remarkable progress in reinforcement learning (RL) with general function approximation.

Open-Ended Question Answering Reinforcement Learning (RL)

Paper
Add Code

Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs

no code implementations • 15 May 2023 • Kaixuan Ji, Qingyue Zhao, Jiafan He, Weitong Zhang, Quanquan Gu

Recent studies have shown that episodic reinforcement learning (RL) is no harder than bandits when the total reward is bounded by $1$, and proved regret bounds that have a polylogarithmic dependence on the planning horizon $H$.

Open-Ended Question Answering reinforcement-learning +1

Paper
Add Code

Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey

no code implementations • 30 May 2023 • Chen Ling, Xujiang Zhao, Jiaying Lu, Chengyuan Deng, Can Zheng, Junxiang Wang, Tanmoy Chowdhury, Yun Li, Hejie Cui, Xuchao Zhang, Tianjiao Zhao, Amit Panalkar, Dhagash Mehta, Stefano Pasquali, Wei Cheng, Haoyu Wang, Yanchi Liu, Zhengzhang Chen, Haifeng Chen, Chris White, Quanquan Gu, Jian Pei, Carl Yang, Liang Zhao

In this article, we present a comprehensive survey on domain specialization techniques for large language models, an emerging direction critical for large language model applications.

Chatbot Language Modelling +1

Paper
Add Code

The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks

no code implementations • 20 Jun 2023 • Yuan Cao, Difan Zou, Yuanzhi Li, Quanquan Gu

We show that when learning a linear model with batch normalization for binary classification, gradient descent converges to a uniform margin classifier on the training data with an $\exp(-\Omega(\log^2 t))$ convergence rate.

Binary Classification

Paper
Add Code

Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning

1 code implementation • 23 Aug 2023 • Jiasheng Ye, Zaixiang Zheng, Yu Bao, Lihua Qian, Quanquan Gu

We then reprogram pretrained masked language models into diffusion language models via diffusive adaptation, wherein task-specific finetuning and instruction finetuning are explored to unlock their versatility in solving general language tasks.

In-Context Learning Language Modelling +1

Paper
Code

Understanding Transferable Representation Learning and Zero-shot Transfer in CLIP

no code implementations • 2 Oct 2023 • Zixiang Chen, Yihe Deng, Yuanzhi Li, Quanquan Gu

Multi-modal learning has become increasingly popular due to its ability to leverage information from different data sources (e.g., text and images) to improve the model performance.

Image Generation Representation Learning +1

Paper
Add Code

Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning

no code implementations • 2 Oct 2023 • Qiwei Di, Heyang Zhao, Jiafan He, Quanquan Gu

However, few works on offline RL with non-linear function approximation have instance-dependent regret guarantees.

Offline RL reinforcement-learning +1

Paper
Add Code

Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits

no code implementations • 2 Oct 2023 • Qiwei Di, Tao Jin, Yue Wu, Heyang Zhao, Farzad Farnoud, Quanquan Gu

Dueling bandits is a prominent framework for decision-making involving preferential feedback, a valuable feature that fits various applications involving human interaction, such as ranking, information retrieval, and recommendation systems.

Computational Efficiency Decision Making +2

Paper
Add Code

How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?

no code implementations • 12 Oct 2023 • Jingfeng Wu, Difan Zou, Zixiang Chen, Vladimir Braverman, Quanquan Gu, Peter L. Bartlett

Transformers pretrained on diverse tasks exhibit remarkable in-context learning (ICL) capabilities, enabling them to solve unseen tasks solely based on input contexts without adjusting model parameters.

In-Context Learning regression

Paper
Add Code

Pure Exploration in Asynchronous Federated Bandits

no code implementations • 17 Oct 2023 • Zichen Wang, Chuanhao Li, Chenyu Song, Lianghui Wang, Quanquan Gu, Huazheng Wang

We study the federated pure exploration problem of multi-armed bandits and linear bandits, where $M$ agents cooperatively identify the best arm via communicating with the central server.

Multi-Armed Bandits

Paper
Add Code

Corruption-Robust Offline Reinforcement Learning with General Function Approximation

1 code implementation • NeurIPS 2023 • Chenlu Ye, Rui Yang, Quanquan Gu, Tong Zhang

Notably, under the assumption of single policy coverage and the knowledge of $\zeta$, our proposed algorithm achieves a suboptimality bound that is worsened by an additive factor of $\mathcal{O}(\zeta (C(\widehat{\mathcal{F}},\mu)n)^{-1})$ due to the corruption.

Offline RL reinforcement-learning +1

Paper
Code

Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves

3 code implementations • 7 Nov 2023 • Yihe Deng, Weitong Zhang, Zixiang Chen, Quanquan Gu

While it is widely acknowledged that the quality of a prompt, such as a question, significantly impacts the quality of the response provided by LLMs, a systematic method for crafting questions that LLMs can better comprehend is still underdeveloped.

Paper
Code

Risk Bounds of Accelerated SGD for Overparameterized Linear Regression

no code implementations • 23 Nov 2023 • Xuheng Li, Yihe Deng, Jingfeng Wu, Dongruo Zhou, Quanquan Gu

Additionally, when our analysis is specialized to linear regression in the strongly convex setting, it yields a tighter bound for bias error than the best-known result.

regression

Paper
Add Code

A Nearly Optimal and Low-Switching Algorithm for Reinforcement Learning with General Function Approximation

no code implementations • 26 Nov 2023 • Heyang Zhao, Jiafan He, Quanquan Gu

The exploration-exploitation dilemma has been a central challenge in reinforcement learning (RL) with complex model classes.

Q-Learning Reinforcement Learning (RL)

Paper
Add Code

Fast Sampling via De-randomization for Discrete Diffusion Models

no code implementations • 14 Dec 2023 • Zixiang Chen, Huizhuo Yuan, Yongqian Li, Yiwen Kou, Junkai Zhang, Quanquan Gu

Despite its success in continuous spaces, discrete diffusion models, which apply to domains such as texts and natural languages, remain under-studied and often suffer from slow generation speed.

Image Generation Machine Translation +1

Paper
Add Code

Sparse PCA with Oracle Property

no code implementations • NeurIPS 2014 • Quanquan Gu, Zhaoran Wang, Han Liu

In particular, under a weak assumption on the magnitude of the population projection matrix, one estimator within this family exactly recovers the true support with high probability, has exact rank-$k$, and attains a $\sqrt{s/n}$ statistical rate of convergence with $s$ being the subspace sparsity level and $n$ the sample size.

Paper
Add Code

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

2 code implementations • 2 Jan 2024 • Zixiang Chen, Yihe Deng, Huizhuo Yuan, Kaixuan Ji, Quanquan Gu

In this paper, we delve into the prospect of growing a strong LLM out of a weak one without the need for acquiring additional human-annotated data.

1,227

Paper
Code

TrustLLM: Trustworthiness in Large Language Models

1 code implementation • 10 Jan 2024 • Lichao Sun, Yue Huang, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bertie Vidgen, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao Liu, Heng Ji, Hongyi Wang, Huan Zhang, Huaxiu Yao, Manolis Kellis, Marinka Zitnik, Meng Jiang, Mohit Bansal, James Zou, Jian Pei, Jian Liu, Jianfeng Gao, Jiawei Han, Jieyu Zhao, Jiliang Tang, Jindong Wang, Joaquin Vanschoren, John Mitchell, Kai Shu, Kaidi Xu, Kai-Wei Chang, Lifang He, Lifu Huang, Michael Backes, Neil Zhenqiang Gong, Philip S. Yu, Pin-Yu Chen, Quanquan Gu, Ran Xu, Rex Ying, Shuiwang Ji, Suman Jana, Tianlong Chen, Tianming Liu, Tianyi Zhou, William Wang, Xiang Li, Xiangliang Zhang, Xiao Wang, Xing Xie, Xun Chen, Xuyu Wang, Yan Liu, Yanfang Ye, Yinzhi Cao, Yong Chen, Yue Zhao

This paper introduces TrustLLM, a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions.

Ethics Fairness

265

Paper
Code

Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance

no code implementations • 13 Feb 2024 • Linxi Zhao, Yihe Deng, Weitong Zhang, Quanquan Gu

The advancement of Large Vision-Language Models (LVLMs) has increasingly highlighted the critical issue of their tendency to hallucinate non-existing objects in the images.

Hallucination

Paper
Add Code

Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path

no code implementations • 14 Feb 2024 • Qiwei Di, Jiafan He, Dongruo Zhou, Quanquan Gu

Our algorithm achieves an $\tilde{\mathcal O}(dB_*\sqrt{K})$ regret bound, where $d$ is the dimension of the feature mapping in the linear transition kernel, $B_*$ is the upper bound of the total cumulative cost for the optimal policy, and $K$ is the number of episodes.

Paper
Add Code

Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption

no code implementations • 14 Feb 2024 • Chenlu Ye, Jiafan He, Quanquan Gu, Tong Zhang

We also prove a lower bound to show that the additive dependence on $C$ is optimal.

Model-based Reinforcement Learning reinforcement-learning +1

Paper
Add Code

Reinforcement Learning from Human Feedback with Active Queries

no code implementations • 14 Feb 2024 • Kaixuan Ji, Jiafan He, Quanquan Gu

Aligning large language models (LLMs) with human preference plays a key role in building modern generative models and can be achieved by reinforcement learning from human feedback (RLHF).

Active Learning reinforcement-learning

Paper
Add Code

Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation

no code implementations • 15 Feb 2024 • Huizhuo Yuan, Zixiang Chen, Kaixuan Ji, Quanquan Gu

Fine-tuning Diffusion Models remains an underexplored frontier in generative artificial intelligence (GenAI), especially when compared with the remarkable progress made in fine-tuning Large Language Models (LLMs).

Reinforcement Learning (RL) Text-to-Image Generation

Paper
Add Code

DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design

1 code implementation • 26 Feb 2024 • Jiaqi Guan, Xiangxin Zhou, Yuwei Yang, Yu Bao, Jian Peng, Jianzhu Ma, Qiang Liu, Liang Wang, Quanquan Gu

Designing 3D ligands within a target binding site is a fundamental task in drug discovery.

Avg Drug Discovery

Paper
Code

Diffusion Language Models Are Versatile Protein Learners

no code implementations • 28 Feb 2024 • Xinyou Wang, Zaixiang Zheng, Fei Ye, Dongyu Xue, Shujian Huang, Quanquan Gu

This paper introduces diffusion protein language model (DPLM), a versatile protein language model that demonstrates strong generative and predictive capabilities for protein sequences.

Protein Language Model

Paper
Add Code

Causal Graph ODE: Continuous Treatment Effect Modeling in Multi-agent Dynamical Systems

no code implementations • 29 Feb 2024 • Zijie Huang, Jeehyun Hwang, Junkai Zhang, Jinwoo Baik, Weitong Zhang, Dominik Wodarz, Yizhou Sun, Quanquan Gu, Wei Wang

Real-world multi-agent systems are often dynamic and continuous, where the agents co-evolve and undergo changes in their trajectories and interactions over time.

counterfactual Decision Making

Paper
Add Code

DecompOpt: Controllable and Decomposed Diffusion Models for Structure-based Molecular Optimization

no code implementations • 7 Mar 2024 • Xiangxin Zhou, Xiwei Cheng, Yuwei Yang, Yu Bao, Liang Wang, Quanquan Gu

DecompOpt presents a new generation paradigm which combines optimization with conditional diffusion models to achieve desired properties while adhering to the molecular grammar.

Drug Discovery

Paper
Add Code

Protein Conformation Generation via Force-Guided SE(3) Diffusion Models

no code implementations • 21 Mar 2024 • Yan Wang, Lihao Wang, Yuning Shen, Yiqun Wang, Huizhuo Yuan, Yue Wu, Quanquan Gu

The conformational landscape of proteins is crucial to understanding their functionality in complex biological processes.

Paper
Add Code

Antigen-Specific Antibody Design via Direct Energy-based Preference Optimization

no code implementations • 25 Mar 2024 • Xiangxin Zhou, Dongyu Xue, Ruizhe Chen, Zaixiang Zheng, Liang Wang, Quanquan Gu

Antibody design, a crucial task with significant implications across various disciplines such as therapeutics and biology, presents considerable challenges due to its intricate nature.

Total Energy

Paper
Add Code

Feel-Good Thompson Sampling for Contextual Dueling Bandits

no code implementations • 9 Apr 2024 • Xuheng Li, Heyang Zhao, Quanquan Gu

In this paper, we propose a Thompson sampling algorithm, named FGTS.CDB, for linear contextual dueling bandits.

Decision Making Multi-Armed Bandits +1

Paper
Add Code

Settling Constant Regrets in Linear Markov Decision Processes

no code implementations • 16 Apr 2024 • Weitong Zhang, Zhiyuan Fan, Jiafan He, Quanquan Gu

To the best of our knowledge, Cert-LSVI-UCB is the first algorithm to achieve a constant, instance-dependent, high-probability regret bound in RL with linear function approximation for infinite runs without relying on prior distribution assumptions.

Reinforcement Learning (RL)

Paper
Add Code

Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback

no code implementations • 16 Apr 2024 • Qiwei Di, Jiafan He, Quanquan Gu

Learning from human feedback plays an important role in aligning generative models, such as large language models (LLMs).

Paper
Add Code

Padam: Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks

no code implementations • ICLR 2019 • Jinghui Chen, Quanquan Gu

Experiments on standard benchmarks show that Padam can maintain a convergence rate as fast as Adam/Amsgrad while generalizing as well as SGD in training deep neural networks.

Paper
Add Code
