Search Results for author: Niao He

Found 69 papers, 12 papers with code

Primal Methods for Variational Inequality Problems with Functional Constraints

no code implementations19 Mar 2024 Liang Zhang, Niao He, Michael Muehlebach

In this work, we propose a simple primal method, termed the Constrained Gradient Method (CGM), for variational inequality problems with functional constraints; it requires no information about the optimal Lagrange multipliers.
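
As context for the entry above, the classical primal template for VIs is a projected-gradient step. The sketch below shows only that baseline, not the paper's CGM, and the monotone operator, box constraint, and stepsize are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of a projected-gradient iteration for a monotone
# variational inequality VI(F, C): find x* in C with <F(x*), x - x*> >= 0
# for all x in C. This is the classical baseline, not the paper's CGM;
# the operator F and the box constraint C are illustrative assumptions.

A = np.array([[0.0, 1.0], [-1.0, 0.0]])  # skew-symmetric (monotone) part
F = lambda x: A @ x + 0.1 * x            # add 0.1*I to make F strongly monotone

def project_box(x, lo=-1.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^d."""
    return np.clip(x, lo, hi)

x = np.array([0.9, -0.5])
eta = 0.1
for _ in range(2000):
    x = project_box(x - eta * F(x))

print(x)  # converges to the solution x* = 0 of this toy VI
```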


Taming Nonconvex Stochastic Mirror Descent with General Bregman Divergence

no code implementations27 Feb 2024 Ilyas Fatkhullin, Niao He

This paper revisits the convergence of Stochastic Mirror Descent (SMD) in the contemporary nonconvex optimization setting.
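
As a concrete instance of the setting above: with the negative-entropy mirror map on the simplex, the Bregman divergence is the KL divergence and SMD becomes the exponentiated-gradient update. A minimal sketch, assuming a toy linear objective and a fixed stepsize (not the paper's nonconvex setting):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stochastic mirror descent on the probability simplex with the
# negative-entropy mirror map (Bregman divergence = KL), i.e. the
# exponentiated-gradient update. The linear objective and stepsize
# are toy assumptions.

d = 5
c = rng.normal(size=d)

def stochastic_grad(x):
    # noisy gradient of f(x) = <c, x>
    return c + 0.1 * rng.normal(size=d)

x = np.full(d, 1.0 / d)  # start at the uniform distribution
eta = 0.1
for _ in range(2000):
    g = stochastic_grad(x)
    x = x * np.exp(-eta * g)   # mirror step in the dual (log) space
    x = x / x.sum()            # Bregman projection back onto the simplex

print(x.round(3))  # mass concentrates on argmin_i c_i
```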

Independent Learning in Constrained Markov Potential Games

1 code implementation27 Feb 2024 Philip Jordan, Anas Barakat, Niao He

We propose an independent policy gradient algorithm for learning approximate constrained Nash equilibria: Each agent observes their own actions and rewards, along with a shared state.

Multi-agent Reinforcement Learning

Truly No-Regret Learning in Constrained MDPs

no code implementations24 Feb 2024 Adrian Müller, Pragnya Alatur, Volkan Cevher, Giorgia Ramponi, Niao He

As Efroni et al. (2020) pointed out, it is an open question whether primal-dual algorithms can provably achieve sublinear regret if we do not allow error cancellations.

Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL

no code implementations8 Feb 2024 Jiawei Huang, Niao He, Andreas Krause

We study the sample complexity of reinforcement learning (RL) in Mean-Field Games (MFGs) with model-based function approximation that requires strategic exploration to find a Nash Equilibrium policy.

Computational Efficiency Reinforcement Learning (RL)

Efficiently Escaping Saddle Points for Non-Convex Policy Optimization

no code implementations15 Nov 2023 Sadegh Khorasani, Saber Salehkaleybar, Negar Kiyavash, Niao He, Matthias Grossglauser

Policy gradient (PG) is widely used in reinforcement learning due to its scalability and good performance.

Parameter-Agnostic Optimization under Relaxed Smoothness

no code implementations6 Nov 2023 Florian Hübler, Junchi Yang, Xiang Li, Niao He

However, as the assumption is relaxed to the more realistic $(L_0, L_1)$-smoothness, all existing convergence results still necessitate tuning of the stepsize.

Optimal Guarantees for Algorithmic Reproducibility and Gradient Complexity in Convex Optimization

no code implementations NeurIPS 2023 Liang Zhang, Junchi Yang, Amin Karbasi, Niao He

Particularly, given the inexact initialization oracle, our regularization-based algorithms achieve the best of both worlds - optimal reproducibility and near-optimal gradient complexity - for minimization and minimax optimization.

DPZero: Private Fine-Tuning of Language Models without Backpropagation

no code implementations14 Oct 2023 Liang Zhang, Bingcong Li, Kiran Koshy Thekumparampil, Sewoong Oh, Niao He

The widespread practice of fine-tuning large language models (LLMs) on domain-specific data faces two major challenges in memory and privacy.
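
The memory saving referred to above comes from replacing backpropagation with gradient estimates built from forward passes only. A minimal sketch of a generic two-point zeroth-order estimator, with a toy quadratic loss standing in for an LLM objective; this is not DPZero's exact (differentially private) scheme:

```python
import numpy as np

rng = np.random.default_rng(0)

# Generic two-point zeroth-order gradient estimator: only forward
# evaluations of the loss are needed, no backpropagation. The quadratic
# loss stands in for an LLM objective; mu and the update rule are toy
# assumptions, not DPZero's exact private scheme.

def loss(theta):
    return 0.5 * np.sum((theta - 1.0) ** 2)

def zo_grad(theta, mu=1e-3):
    u = rng.normal(size=theta.shape)  # random perturbation direction
    return (loss(theta + mu * u) - loss(theta - mu * u)) / (2 * mu) * u

theta = np.zeros(10)
for _ in range(3000):
    theta -= 0.05 * zo_grad(theta)

print(theta.round(2))  # approaches the minimizer at all-ones
```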

A Convex Framework for Confounding Robust Inference

1 code implementation21 Sep 2023 Kei Ishikawa, Niao He, Takafumi Kanamori

We study policy evaluation of offline contextual bandits subject to unobserved confounders.

Model Selection Multi-Armed Bandits

Learning Zero-Sum Linear Quadratic Games with Improved Sample Complexity and Last-Iterate Convergence

1 code implementation8 Sep 2023 Jiduan Wu, Anas Barakat, Ilyas Fatkhullin, Niao He

Our main results are two-fold: (i) in the deterministic setting, we establish the first global last-iterate linear convergence result for the nested algorithm that seeks NE of zero-sum LQ games; (ii) in the model-free setting, we establish a $\widetilde{\mathcal{O}}(\epsilon^{-2})$ sample complexity using a single-point ZO estimator.

Multi-agent Reinforcement Learning Policy Gradient Methods

Provably Convergent Policy Optimization via Metric-aware Trust Region Methods

no code implementations25 Jun 2023 Jun Song, Niao He, Lijun Ding, Chaoyue Zhao

Trust-region methods based on Kullback-Leibler divergence are pervasively used to stabilize policy optimization in reinforcement learning.

Continuous Control Policy Gradient Methods

Provably Learning Nash Policies in Constrained Markov Potential Games

no code implementations13 Jun 2023 Pragnya Alatur, Giorgia Ramponi, Niao He, Andreas Krause

Multi-agent reinforcement learning (MARL) addresses sequential decision-making problems with multiple agents, where each agent optimizes its own objective.

Decision Making Multi-agent Reinforcement Learning +1

Cancellation-Free Regret Bounds for Lagrangian Approaches in Constrained Markov Decision Processes

no code implementations12 Jun 2023 Adrian Müller, Pragnya Alatur, Giorgia Ramponi, Niao He

Unlike existing Lagrangian approaches, our algorithm achieves this regret without the need for the cancellation of errors.

Safe Reinforcement Learning

Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space

no code implementations2 Jun 2023 Anas Barakat, Ilyas Fatkhullin, Niao He

We consider the reinforcement learning (RL) problem with general utilities which consists in maximizing a function of the state-action occupancy measure.

Reinforcement Learning (RL)

On the Statistical Efficiency of Mean Field Reinforcement Learning with General Function Approximation

no code implementations18 May 2023 Jiawei Huang, Batuhan Yardim, Niao He

In this paper, we study the fundamental statistical efficiency of Reinforcement Learning in Mean-Field Control (MFC) and Mean-Field Game (MFG) with general model-based function approximation.

Kernel Conditional Moment Constraints for Confounding Robust Inference

2 code implementations26 Feb 2023 Kei Ishikawa, Niao He

It can be shown that our estimator contains the recently proposed sharp estimator by Dorn and Guo (2022) as a special case, and our method enables a novel extension of the classical marginal sensitivity model using f-divergence.

Multi-Armed Bandits

Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies

no code implementations3 Feb 2023 Ilyas Fatkhullin, Anas Barakat, Anastasia Kireeva, Niao He

Recently, the impressive empirical success of policy gradient (PG) methods has catalyzed the development of their theoretical foundations.

Policy Gradient Methods

Policy Mirror Ascent for Efficient and Independent Learning in Mean Field Games

no code implementations29 Dec 2022 Batuhan Yardim, Semih Cayci, Matthieu Geist, Niao He

Instead, we show that $N$ agents running policy mirror ascent converge to the Nash equilibrium of the regularized game within $\widetilde{\mathcal{O}}(\varepsilon^{-2})$ samples from a single sample trajectory without a population generative model, up to a standard $\mathcal{O}(\frac{1}{\sqrt{N}})$ error due to the mean field.
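
The policy update behind this result can be illustrated in isolation: with a softmax parametrization and KL Bregman divergence, one mirror-ascent step is multiplicative in the Q-values. A single-state sketch with assumed Q-values and learning rate; the mean-field interaction and $N$-agent sampling are omitted:

```python
import numpy as np

# One mirror-ascent policy step with the KL Bregman divergence:
# pi_{t+1}(a) ∝ pi_t(a) * exp(eta * Q_t(a)). Single state, fixed
# Q-values; the mean-field and multi-agent structure is omitted.

Q = np.array([1.0, 0.5, -0.2])   # assumed action values at some state
pi = np.full(3, 1.0 / 3)         # uniform initial policy
eta = 0.5

for _ in range(50):
    pi = pi * np.exp(eta * Q)    # multiplicative (exponentiated) ascent step
    pi = pi / pi.sum()           # renormalize onto the simplex

print(pi.round(3))  # concentrates on the highest-Q action
```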

TiAda: A Time-scale Adaptive Algorithm for Nonconvex Minimax Optimization

no code implementations31 Oct 2022 Xiang Li, Junchi Yang, Niao He

Adaptive gradient methods have shown their ability to adjust the stepsizes on the fly in a parameter-agnostic manner, and empirically achieve faster convergence for solving minimization problems.
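
The adaptivity in question can be illustrated with an AdaGrad-norm stepsize, which shrinks automatically as squared gradient norms accumulate. A minimization-only sketch under an assumed quadratic objective; TiAda's two-time-scale coupling between the min and max players is not reproduced here:

```python
import numpy as np

# AdaGrad-norm stepsize: eta_t = eta0 / sqrt(v_t), where v_t is the
# running sum of squared gradient norms. No knowledge of smoothness or
# noise level is needed. Minimization only; TiAda's time-scale
# separation for minimax problems is not shown.

def grad(x):
    return 2.0 * x  # gradient of f(x) = ||x||^2

x = np.ones(4)
eta0, v = 1.0, 0.0
for _ in range(1000):
    g = grad(x)
    v += np.dot(g, g)            # accumulate squared gradient norms
    x -= eta0 / np.sqrt(v) * g   # stepsize adapts on the fly

print(np.linalg.norm(x))  # decays toward 0 without any tuning
```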

Finite-Time Analysis of Entropy-Regularized Neural Natural Actor-Critic Algorithm

no code implementations2 Jun 2022 Semih Cayci, Niao He, R. Srikant

Natural actor-critic (NAC) and its variants, equipped with the representation power of neural networks, have demonstrated impressive empirical success in solving Markov decision problems with large state spaces.

Nest Your Adaptive Algorithm for Parameter-Agnostic Nonconvex Minimax Optimization

no code implementations1 Jun 2022 Junchi Yang, Xiang Li, Niao He

Adaptive algorithms like AdaGrad and AMSGrad are successful in nonconvex optimization owing to their parameter-agnostic ability: they require neither a priori knowledge of problem-specific parameters nor tuning of learning rates.

Bring Your Own Algorithm for Optimal Differentially Private Stochastic Minimax Optimization

no code implementations1 Jun 2022 Liang Zhang, Kiran Koshy Thekumparampil, Sewoong Oh, Niao He

We provide a general framework for solving differentially private stochastic minimax optimization (DP-SMO) problems, which enables the practitioners to bring their own base optimization algorithm and use it as a black-box to obtain the near-optimal privacy-loss trade-off.

Generalization Bounds of Nonconvex-(Strongly)-Concave Stochastic Minimax Optimization

no code implementations28 May 2022 Siqi Zhang, Yifan Hu, Liang Zhang, Niao He

We further study the algorithm-dependent generalization bounds via stability arguments of algorithms.

Generalization Bounds

Stochastic Second-Order Methods Improve Best-Known Sample Complexity of SGD for Gradient-Dominated Function

no code implementations25 May 2022 Saeed Masiha, Saber Salehkaleybar, Niao He, Negar Kiyavash, Patrick Thiran

We prove that the total sample complexity of SCRN in achieving $\epsilon$-global optimum is $\mathcal{O}(\epsilon^{-7/(2\alpha)+1})$ for $1\le\alpha< 3/2$ and $\widetilde{\mathcal{O}}(\epsilon^{-2/\alpha})$ for $3/2\le\alpha\le 2$.

Policy Gradient Methods Reinforcement Learning (RL) +1

Momentum-Based Policy Gradient with Second-Order Information

no code implementations17 May 2022 Saber Salehkaleybar, Sadegh Khorasani, Negar Kiyavash, Niao He, Patrick Thiran

The SHARP algorithm is parameter-free, reaching an $\epsilon$-approximate first-order stationary point with $O(\epsilon^{-3})$ trajectories while using a batch size of $O(1)$ at each iteration.

Policy Gradient Methods

Finite-Time Analysis of Natural Actor-Critic for POMDPs

no code implementations20 Feb 2022 Semih Cayci, Niao He, R. Srikant

We consider the reinforcement learning problem for partially observed Markov decision processes (POMDPs) with large or even countably infinite state spaces, where the controller has access to only noisy observations of the underlying controlled Markov chain.

Lifted Primal-Dual Method for Bilinearly Coupled Smooth Minimax Optimization

no code implementations19 Jan 2022 Kiran Koshy Thekumparampil, Niao He, Sewoong Oh

We also provide a direct single-loop algorithm, using the LPD method, that achieves the iteration complexity of $O(\sqrt{\frac{L_x}{\varepsilon}} + \frac{\|A\|}{\sqrt{\mu_y \varepsilon}} + \sqrt{\frac{L_y}{\varepsilon}})$.

Faster Single-loop Algorithms for Minimax Optimization without Strong Concavity

1 code implementation10 Dec 2021 Junchi Yang, Antonio Orvieto, Aurelien Lucchi, Niao He

Gradient descent ascent (GDA), the simplest single-loop algorithm for nonconvex minimax optimization, is widely used in practical applications such as generative adversarial networks (GANs) and adversarial training.
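
GDA itself is a one-line pair of updates: simultaneous gradient descent on $x$ and ascent on $y$. A sketch on an assumed strongly-convex-strongly-concave quadratic, with a larger stepsize on $y$ as is common in two-timescale analyses:

```python
# Gradient descent ascent on f(x, y) = 0.5*x^2 + x*y - 0.5*y^2:
# descend in x, ascend in y, simultaneously. The objective and the
# unequal stepsizes are toy assumptions.

x, y = 1.0, 1.0
eta_x, eta_y = 0.05, 0.2
for _ in range(2000):
    gx = x + y      # df/dx
    gy = x - y      # df/dy
    x -= eta_x * gx
    y += eta_y * gy

print(x, y)  # converges to the saddle point (0, 0)
```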

On the Bias-Variance-Cost Tradeoff of Stochastic Optimization

no code implementations NeurIPS 2021 Yifan Hu, Xin Chen, Niao He

We consider stochastic optimization when one only has access to biased stochastic oracles of the objective, and obtaining stochastic gradients with low biases comes at high costs.

Bilevel Optimization Stochastic Optimization

Efficient Wasserstein and Sinkhorn Policy Optimization

no code implementations29 Sep 2021 Jun Song, Chaoyue Zhao, Niao He

Trust-region methods based on Kullback-Leibler divergence are pervasively used to stabilize policy optimization in reinforcement learning.

Policy Gradient Methods Reinforcement Learning (RL)

Sample-efficient actor-critic algorithms with an etiquette for zero-sum Markov games

no code implementations29 Sep 2021 Ahmet Alacaoglu, Luca Viano, Niao He, Volkan Cevher

Our sample complexities also match the best-known results for global convergence of policy gradient and two time-scale actor-critic algorithms in the single agent setting.

Policy Gradient Methods

Linear Convergence of Entropy-Regularized Natural Policy Gradient with Linear Function Approximation

no code implementations8 Jun 2021 Semih Cayci, Niao He, R. Srikant

Furthermore, under mild regularity conditions on the concentrability coefficient and basis vectors, we prove that entropy-regularized NPG exhibits \emph{linear convergence} up to a function approximation error.

The Complexity of Nonconvex-Strongly-Concave Minimax Optimization

no code implementations29 Mar 2021 Siqi Zhang, Junchi Yang, Cristóbal Guzmán, Negar Kiyavash, Niao He

In the averaged smooth finite-sum setting, our proposed algorithm improves over previous algorithms by providing a nearly-tight dependence on the condition number.

Simulation Studies on Deep Reinforcement Learning for Building Control with Human Interaction

no code implementations14 Mar 2021 Donghwan Lee, Niao He, Seungjae Lee, Panagiota Karava, Jianghai Hu

The building sector is the largest energy consumer in the world, and there has been considerable research interest in the energy consumption and comfort management of buildings.

Management reinforcement-learning +1

Sample Complexity and Overparameterization Bounds for Temporal Difference Learning with Neural Network Approximation

no code implementations2 Mar 2021 Semih Cayci, Siddhartha Satpathi, Niao He, R. Srikant

In this paper, we study the dynamics of temporal difference learning with neural network-based value function approximation over a general state space, namely, \emph{Neural TD learning}.
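
The core update underlying Neural TD is semi-gradient TD(0). A sketch with linear features on an assumed two-state Markov reward process; the neural network is replaced by one-hot features for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)

# Semi-gradient TD(0) with linear value approximation V(s) = <w, phi(s)>,
# the update underlying Neural TD once the network is linearized. The
# two-state Markov reward process and features are toy assumptions.

P = np.array([[0.9, 0.1], [0.2, 0.8]])   # transition matrix
r = np.array([1.0, 0.0])                  # reward for leaving each state
phi = np.eye(2)                           # tabular one-hot features
gamma, alpha = 0.9, 0.05

w = np.zeros(2)
s = 0
for _ in range(20000):
    s_next = rng.choice(2, p=P[s])
    td_error = r[s] + gamma * w @ phi[s_next] - w @ phi[s]
    w += alpha * td_error * phi[s]        # semi-gradient: no grad through target
    s = s_next

print(w)  # approximates the true value function of the chain
```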

A Discrete-Time Switching System Analysis of Q-learning

no code implementations17 Feb 2021 Donghwan Lee, Jianghai Hu, Niao He

Based on these two systems, we derive a new finite-time error bound of asynchronous Q-learning when a constant stepsize is used.

Q-Learning

Global Convergence and Variance Reduction for a Class of Nonconvex-Nonconcave Minimax Problems

no code implementations NeurIPS 2020 Junchi Yang, Negar Kiyavash, Niao He

Nonconvex minimax problems appear frequently in emerging machine learning applications, such as generative adversarial networks and adversarial learning.

A Unified Switching System Perspective and Convergence Analysis of Q-Learning Algorithms

no code implementations NeurIPS 2020 Donghwan Lee, Niao He

This paper develops a novel and unified framework to analyze the convergence of a large family of Q-learning algorithms from the switching system perspective.

Q-Learning

The Devil is in the Detail: A Framework for Macroscopic Prediction via Microscopic Models

no code implementations NeurIPS 2020 Yingxiang Yang, Negar Kiyavash, Le Song, Niao He

Macroscopic data aggregated from microscopic events are pervasive in machine learning, such as country-level COVID-19 infection statistics based on city-level data.

Stochastic Optimization

A Catalyst Framework for Minimax Optimization

no code implementations NeurIPS 2020 Junchi Yang, Siqi Zhang, Negar Kiyavash, Niao He

We introduce a generic \emph{two-loop} scheme for smooth minimax optimization with strongly-convex-concave objectives.

The Mean-Squared Error of Double Q-Learning

1 code implementation NeurIPS 2020 Wentao Weng, Harsh Gupta, Niao He, Lei Ying, R. Srikant

In this paper, we establish a theoretical comparison between the asymptotic mean-squared error of Double Q-learning and Q-learning.

Q-Learning
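
For reference, the mechanism being analyzed: Double Q-learning keeps two tables and lets one select the greedy action while the other evaluates it, which removes the overestimation bias of a single estimator. A tabular sketch on an assumed two-state MDP with assumed stepsize and exploration rate:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tabular double Q-learning: two tables QA, QB; one selects the argmax
# action, the other evaluates it. The toy 2-state MDP, epsilon-greedy
# exploration, and stepsize are illustrative assumptions.

n_actions, gamma, alpha, eps = 2, 0.9, 0.1, 0.2
P = np.array([[0, 1], [1, 0]])        # P[s, a] = next state
R = np.array([[1.0, 0.0], [0.0, 1.0]])  # mean reward, noisy at sampling

QA = np.zeros((2, n_actions))
QB = np.zeros((2, n_actions))
s = 0
for _ in range(50000):
    Q = QA + QB
    a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
    s2 = P[s, a]
    rwd = R[s, a] + 0.1 * rng.normal()
    if rng.random() < 0.5:            # update one table, evaluate with the other
        a_star = int(np.argmax(QA[s2]))
        QA[s, a] += alpha * (rwd + gamma * QB[s2, a_star] - QA[s, a])
    else:
        a_star = int(np.argmax(QB[s2]))
        QB[s, a] += alpha * (rwd + gamma * QA[s2, a_star] - QB[s, a])
    s = s2

print(((QA + QB) / 2).round(2))
```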

Biased Stochastic Gradient Descent for Conditional Stochastic Optimization

no code implementations25 Feb 2020 Yifan Hu, Siqi Zhang, Xin Chen, Niao He

Conditional Stochastic Optimization (CSO) covers a variety of applications ranging from meta-learning and causal inference to invariant learning.

Causal Inference Meta-Learning +2

Periodic Q-Learning

no code implementations L4DC 2020 Donghwan Lee, Niao He

The use of target networks is a common practice in deep reinforcement learning for stabilizing the training; however, theoretical understanding of this technique is still limited.

Q-Learning
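
The stabilizing device in question, a target estimate that stays frozen and is synchronized only every $K$ steps, is easy to exhibit in the tabular case. A sketch under an assumed toy MDP, stepsize, and period $K$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tabular Q-learning with a periodically synchronized target table:
# bootstrapping uses the frozen copy Q_target, re-copied from Q only
# every K steps, mirroring target networks in deep RL. The toy MDP,
# stepsize, and period K are illustrative assumptions.

P = np.array([[0, 1], [1, 0]])          # deterministic 2-state transitions
R = np.array([[1.0, 0.0], [0.0, 1.0]])
gamma, alpha, K = 0.9, 0.1, 100

Q = np.zeros((2, 2))
Q_target = Q.copy()
s = 0
for t in range(20000):
    a = rng.integers(2)                          # uniform exploration
    s2, rwd = P[s, a], R[s, a]
    target = rwd + gamma * Q_target[s2].max()    # bootstrap on frozen copy
    Q[s, a] += alpha * (target - Q[s, a])
    if (t + 1) % K == 0:
        Q_target = Q.copy()                      # periodic synchronization
    s = s2

print(Q.round(2))
```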

Global Convergence and Variance-Reduced Optimization for a Class of Nonconvex-Nonconcave Minimax Problems

no code implementations22 Feb 2020 Junchi Yang, Negar Kiyavash, Niao He

Nonconvex minimax problems appear frequently in emerging machine learning applications, such as generative adversarial networks and adversarial learning.

A Unified Switching System Perspective and O.D.E. Analysis of Q-Learning Algorithms

no code implementations4 Dec 2019 Donghwan Lee, Niao He

In this paper, we introduce a unified framework for analyzing a large family of Q-learning algorithms, based on switching system perspectives and ODE-based stochastic approximation.

Q-Learning

Learning Positive Functions with Pseudo Mirror Descent

no code implementations NeurIPS 2019 Yingxiang Yang, Haoxiang Wang, Negar Kiyavash, Niao He

The nonparametric learning of positive-valued functions appears widely in machine learning, especially in the context of estimating intensity functions of point processes.

Computational Efficiency Point Processes

Optimization for Reinforcement Learning: From Single Agent to Cooperative Agents

no code implementations1 Dec 2019 Donghwan Lee, Niao He, Parameswaran Kamalaruban, Volkan Cevher

This article reviews recent advances in multi-agent reinforcement learning algorithms for large-scale control systems and communication networks, which learn to communicate and cooperate.

Distributed Optimization Multi-agent Reinforcement Learning +2

Sample Complexity of Sample Average Approximation for Conditional Stochastic Optimization

no code implementations28 May 2019 Yifan Hu, Xin Chen, Niao He

In this paper, we study a class of stochastic optimization problems, referred to as the \emph{Conditional Stochastic Optimization} (CSO), in the form of $\min_{x \in \mathcal{X}} \mathbb{E}_{\xi} f_\xi\Big(\mathbb{E}_{\eta|\xi}[g_\eta(x,\xi)]\Big)$, which finds a wide spectrum of applications including portfolio selection, reinforcement learning, robust learning, causal inference and so on.

Causal Inference Stochastic Optimization
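
What makes CSO hard is the nested expectation: a plug-in gradient that replaces the inner expectation with an $m$-sample average is biased whenever $f'$ is nonlinear, with the bias vanishing as $m$ grows. A sketch with assumed choices $f(u)=\sqrt{1+u^2}$ and $g_\eta(x,\xi)=x\xi+\eta$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Plug-in gradient for CSO, min_x E_xi[ f( E_{eta|xi}[g_eta(x, xi)] ) ],
# with toy choices f(u) = sqrt(1 + u^2) and g_eta(x, xi) = x*xi + eta.
# The inner expectation is replaced by an m-sample average, so the
# gradient estimate is biased (f' is nonlinear), with bias O(1/m).

def cso_grad(x, m):
    xi = rng.normal(loc=1.0)                   # outer sample
    eta = rng.normal(size=m)                   # m inner samples given xi
    u_hat = np.mean(x * xi + eta)              # inner-batch estimate of E[g]
    return u_hat / np.sqrt(1 + u_hat**2) * xi  # chain rule: f'(u_hat) * dg/dx

x = 2.0
for _ in range(10000):
    x -= 0.05 * cso_grad(x, m=20)

print(x)  # drifts toward the minimizer x = 0 of E_xi[f(x * xi)]
```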

Exponential Family Estimation via Adversarial Dynamics Embedding

1 code implementation NeurIPS 2019 Bo Dai, Zhen Liu, Hanjun Dai, Niao He, Arthur Gretton, Le Song, Dale Schuurmans

We present an efficient algorithm for maximum likelihood estimation (MLE) of exponential family models, with a general parametrization of the energy function that includes neural networks.

Target-Based Temporal Difference Learning

no code implementations24 Apr 2019 Donghwan Lee, Niao He

The use of target networks has been a popular and key component of recent deep Q-learning algorithms for reinforcement learning, yet little is known from the theory side.

Q-Learning

Quadratic Decomposable Submodular Function Minimization: Theory and Practice (Computation and Analysis of PageRank over Hypergraphs)

no code implementations26 Feb 2019 Pan Li, Niao He, Olgica Milenkovic

We introduce a new convex optimization problem, termed quadratic decomposable submodular function minimization (QDSFM), which makes it possible to model a number of learning tasks on graphs and hypergraphs.

hypergraph partitioning

Predictive Approximate Bayesian Computation via Saddle Points

no code implementations NeurIPS 2018 Yingxiang Yang, Bo Dai, Negar Kiyavash, Niao He

Approximate Bayesian computation (ABC) is an important methodology for Bayesian inference when the likelihood function is intractable.

Bayesian Inference regression

Coupled Variational Bayes via Optimization Embedding

1 code implementation NeurIPS 2018 Bo Dai, Hanjun Dai, Niao He, Weiyang Liu, Zhen Liu, Jianshu Chen, Lin Xiao, Le Song

This flexible function class couples the variational distribution with the original parameters in the graphical models, allowing end-to-end learning of the graphical models by back-propagation through the variational distribution.

Variational Inference

Kernel Exponential Family Estimation via Doubly Dual Embedding

1 code implementation6 Nov 2018 Bo Dai, Hanjun Dai, Arthur Gretton, Le Song, Dale Schuurmans, Niao He

We investigate penalized maximum log-likelihood estimation for exponential family distributions whose natural parameter resides in a reproducing kernel Hilbert space.

Quadratic Decomposable Submodular Function Minimization

1 code implementation NeurIPS 2018 Pan Li, Niao He, Olgica Milenkovic

The problem is closely related to decomposable submodular function minimization and arises in many learning on graphs and hypergraphs settings, such as graph-based semi-supervised learning and PageRank.

Nonparametric Hawkes Processes: Online Estimation and Generalization Bounds

no code implementations25 Jan 2018 Yingxiang Yang, Jalal Etesami, Niao He, Negar Kiyavash

In this paper, we design a nonparametric online algorithm for estimating the triggering functions of multivariate Hawkes processes.

Generalization Bounds

SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation

no code implementations ICML 2018 Bo Dai, Albert Shaw, Lihong Li, Lin Xiao, Niao He, Zhen Liu, Jianshu Chen, Le Song

When function approximation is used, solving the Bellman optimality equation with stability guarantees has remained a major open problem in reinforcement learning for decades.

Q-Learning reinforcement-learning +1

Boosting the Actor with Dual Critic

no code implementations ICLR 2018 Bo Dai, Albert Shaw, Niao He, Lihong Li, Le Song

This paper proposes a new actor-critic-style algorithm called Dual Actor-Critic or Dual-AC.

Online Learning for Multivariate Hawkes Processes

no code implementations NeurIPS 2017 Yingxiang Yang, Jalal Etesami, Niao He, Negar Kiyavash

We develop a nonparametric and online learning algorithm that estimates the triggering functions of a multivariate Hawkes process (MHP).

Stochastic Generative Hashing

2 code implementations ICML 2017 Bo Dai, Ruiqi Guo, Sanjiv Kumar, Niao He, Le Song

Learning-based binary hashing has become a powerful paradigm for fast search and retrieval in massive databases.

Retrieval

Fast and Simple Optimization for Poisson Likelihood Models

no code implementations3 Aug 2016 Niao He, Zaid Harchaoui, Yichen Wang, Le Song

Since almost all gradient-based optimization algorithms rely on Lipschitz continuity, optimizing Poisson likelihood models with a guarantee of convergence can be challenging, especially for large-scale problems.

Time Series Time Series Analysis

Learning from Conditional Distributions via Dual Embeddings

no code implementations15 Jul 2016 Bo Dai, Niao He, Yunpeng Pan, Byron Boots, Le Song

In such problems, each sample $x$ itself is associated with a conditional distribution $p(z|x)$ represented by samples $\{z_i\}_{i=1}^M$, and the goal is to learn a function $f$ that links these conditional distributions to target values $y$.

Time-Sensitive Recommendation From Recurrent User Activities

no code implementations NeurIPS 2015 Nan Du, Yichen Wang, Niao He, Jimeng Sun, Le Song

By making personalized suggestions, a recommender system plays a crucial role in improving the engagement of users in modern web services.

Point Processes Recommendation Systems

Semi-proximal Mirror-Prox for Nonsmooth Composite Minimization

no code implementations NeurIPS 2015 Niao He, Zaid Harchaoui

We propose a new first-order optimisation algorithm to solve high-dimensional non-smooth composite minimisation problems.

Provable Bayesian Inference via Particle Mirror Descent

no code implementations9 Jun 2015 Bo Dai, Niao He, Hanjun Dai, Le Song

Bayesian methods are appealing in their flexibility in modeling complex data and ability in capturing uncertainty in parameters.

Bayesian Inference Gaussian Processes

Scalable Kernel Methods via Doubly Stochastic Gradients

1 code implementation NeurIPS 2014 Bo Dai, Bo Xie, Niao He, Yingyu Liang, Anant Raj, Maria-Florina Balcan, Le Song

The general perception is that kernel methods are not scalable, and neural nets are the methods of choice for nonlinear learning problems.
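
"Doubly stochastic" refers to two sources of randomness: sampling data points and sampling random (Fourier) features of the kernel. A least-squares sketch using a fixed random-feature map in place of the paper's functional updates; the synthetic data, feature count, and stepsize are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stochastic gradients over data combined with random Fourier features
# approximating the RBF kernel k(x, x') = exp(-||x - x'||^2 / 2). A
# fixed-dimension feature map stands in for the paper's functional
# updates; data, feature count, and stepsize are toy assumptions.

n, d, D = 500, 2, 200
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)   # noisy nonlinear target

W = rng.normal(size=(D, d))       # frequencies ~ spectral density of RBF
b = rng.uniform(0, 2 * np.pi, D)  # random phases

def features(x):
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

w = np.zeros(D)
for _ in range(20000):
    i = rng.integers(n)                 # stochastic data sample
    phi = features(X[i])
    w -= 0.05 * (phi @ w - y[i]) * phi  # SGD step on the squared loss

pred = np.array([features(x) @ w for x in X])
print(np.mean((pred - y) ** 2))  # small training MSE
```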
