1 code implementation • 26 Feb 2023 • Kei Ishikawa, Niao He
It can be shown that our estimator contains the recently proposed sharp estimator by Dorn and Guo (2022) as a special case, and our method enables a novel extension of the classical marginal sensitivity model using f-divergence.
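As a quick reference, the generic f-divergence that such a sensitivity model builds on can be written as follows (standard notation, assumed rather than taken from the paper):

```latex
% Generic f-divergence between distributions P and Q (f convex, f(1) = 0):
\[
  D_f(P \,\|\, Q) \;=\; \mathbb{E}_{Q}\!\left[ f\!\left(\frac{\mathrm{d}P}{\mathrm{d}Q}\right) \right],
  \qquad f \text{ convex},\; f(1) = 0 .
\]
% KL divergence is recovered with f(t) = t log t; the classical marginal
% sensitivity model instead bounds the relevant likelihood (odds) ratio pointwise.
```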
no code implementations • 10 Feb 2023 • Jiawei Huang, Niao He
In this paper, we study the Tiered Reinforcement Learning setting, a parallel transfer learning framework, where the goal is to transfer knowledge from the low-tier (source) task to the high-tier (target) task to reduce the exploration risk of the latter while solving the two tasks in parallel.
no code implementations • 3 Feb 2023 • Ilyas Fatkhullin, Anas Barakat, Anastasia Kireeva, Niao He
Recently, the impressive empirical success of policy gradient (PG) methods has catalyzed the development of their theoretical foundations.
no code implementations • 29 Dec 2022 • Batuhan Yardim, Semih Cayci, Matthieu Geist, Niao He
Instead, we show that $N$ agents running policy mirror ascent converge to the Nash equilibrium of the regularized game within $\tilde{\mathcal{O}}(\varepsilon^{-2})$ samples from a single sample trajectory without a population generative model, up to a standard $\mathcal{O}(\frac{1}{\sqrt{N}})$ error due to the mean field.
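For intuition, here is a minimal, generic sketch of one entropy-regularized policy mirror ascent step over a finite state-action space (assumed notation and a single-agent view, not the paper's algorithm):

```python
import numpy as np

def mirror_ascent_step(policy, q_values, eta, tau):
    """One generic entropy-regularized policy mirror ascent step.

    policy:   (S, A) array of current action probabilities per state
    q_values: (S, A) array of estimated Q-values for the current policy
    eta:      step size (assume eta * tau < 1)
    tau:      entropy-regularization temperature
    """
    # Multiplicative-weights / softmax update: the exponent mixes the old
    # log-policy (shrunk by the regularization) with the current Q-values.
    logits = (1.0 - eta * tau) * np.log(policy + 1e-12) + eta * q_values
    new_policy = np.exp(logits - logits.max(axis=1, keepdims=True))
    return new_policy / new_policy.sum(axis=1, keepdims=True)
```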
no code implementations • 14 Nov 2022 • Hanjun Dai, Yuan Xue, Niao He, Bethany Wang, Na Li, Dale Schuurmans, Bo Dai
In real-world decision-making, uncertainty is important yet difficult to handle.
no code implementations • 31 Oct 2022 • Xiang Li, Junchi Yang, Niao He
Adaptive gradient methods have shown their ability to adjust the stepsizes on the fly in a parameter-agnostic manner, and empirically achieve faster convergence for solving minimization problems.
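As a concrete illustration of adjusting stepsizes "on the fly", here is a minimal AdaGrad sketch (a generic textbook version, not tied to this paper):

```python
import numpy as np

def adagrad(grad, x0, base_lr=1.0, eps=1e-8, steps=1000):
    """Minimal AdaGrad: per-coordinate stepsizes shrink with accumulated
    squared gradients, so no problem-specific smoothness constant is needed."""
    x = np.asarray(x0, dtype=float)
    accum = np.zeros_like(x)          # running sum of squared gradients
    for _ in range(steps):
        g = grad(x)
        accum += g * g
        x -= base_lr * g / (np.sqrt(accum) + eps)
    return x

# Example: minimize the quadratic f(x) = 0.5 * ||x||^2, whose gradient is x.
x_star = adagrad(lambda x: x.copy(), x0=[5.0, -3.0])
```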
no code implementations • 2 Jun 2022 • Semih Cayci, Niao He, R. Srikant
Natural actor-critic (NAC) and its variants, equipped with the representation power of neural networks, have demonstrated impressive empirical success in solving Markov decision problems with large state spaces.
no code implementations • 1 Jun 2022 • Junchi Yang, Xiang Li, Niao He
Adaptive algorithms like AdaGrad and AMSGrad are successful in nonconvex optimization owing to their parameter-agnostic nature -- they require no a priori knowledge of problem-specific parameters and no tuning of learning rates.
no code implementations • 1 Jun 2022 • Liang Zhang, Kiran Koshy Thekumparampil, Sewoong Oh, Niao He
We provide a general framework for solving differentially private stochastic minimax optimization (DP-SMO) problems, which enables practitioners to bring their own base optimization algorithm and use it as a black box to obtain the near-optimal privacy-loss trade-off.
no code implementations • 28 May 2022 • Siqi Zhang, Yifan Hu, Liang Zhang, Niao He
We further study the algorithm-dependent generalization bounds via stability arguments of algorithms.
no code implementations • 25 May 2022 • Saeed Masiha, Saber Salehkaleybar, Niao He, Negar Kiyavash, Patrick Thiran
We prove that the total sample complexity of SCRN for achieving an $\epsilon$-global optimum is $\mathcal{O}(\epsilon^{-7/(2\alpha)+1})$ for $1\le\alpha< 3/2$ and $\tilde{\mathcal{O}}(\epsilon^{-2/\alpha})$ for $3/2\le\alpha\le 2$.
no code implementations • 17 May 2022 • Saber Salehkaleybar, Sadegh Khorasani, Negar Kiyavash, Niao He, Patrick Thiran
The SHARP algorithm is parameter-free, achieving an $\epsilon$-approximate first-order stationary point with $O(\epsilon^{-3})$ trajectories while using a batch size of $O(1)$ at each iteration.
no code implementations • 20 Feb 2022 • Semih Cayci, Niao He, R. Srikant
We consider the reinforcement learning problem for partially observed Markov decision processes (POMDPs) with large or even countably infinite state spaces, where the controller has access to only noisy observations of the underlying controlled Markov chain.
no code implementations • 19 Jan 2022 • Kiran Koshy Thekumparampil, Niao He, Sewoong Oh
We also provide a direct single-loop algorithm, using the LPD method, that achieves the iteration complexity of $O(\sqrt{\frac{L_x}{\varepsilon}} + \frac{\|A\|}{\sqrt{\mu_y \varepsilon}} + \sqrt{\frac{L_y}{\varepsilon}})$.
1 code implementation • 10 Dec 2021 • Junchi Yang, Antonio Orvieto, Aurelien Lucchi, Niao He
Gradient descent ascent (GDA), the simplest single-loop algorithm for nonconvex minimax optimization, is widely used in practical applications such as generative adversarial networks (GANs) and adversarial training.
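A minimal illustration of the single-loop GDA update on a toy saddle-point problem (a generic sketch; the stepsize choices are illustrative only):

```python
import numpy as np

def gda(grad_x, grad_y, x0, y0, eta_x=0.01, eta_y=0.1, steps=5000):
    """Plain gradient descent ascent: descend in x and ascend in y simultaneously.
    Using a smaller stepsize for the min player (a two-timescale ratio) is a
    common heuristic for nonconvex-(strongly-)concave problems."""
    x, y = np.asarray(x0, float), np.asarray(y0, float)
    for _ in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x, y = x - eta_x * gx, y + eta_y * gy   # simultaneous update
    return x, y

# Toy example: f(x, y) = 0.5*x^2 + x*y - 0.5*y^2, with saddle point at (0, 0).
x, y = gda(lambda x, y: x + y, lambda x, y: x - y, [1.0], [1.0])
```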
no code implementations • NeurIPS 2021 • Yifan Hu, Xin Chen, Niao He
We consider stochastic optimization when one only has access to biased stochastic oracles of the objective, and obtaining stochastic gradients with low bias comes at a high cost.
no code implementations • 29 Sep 2021 • Ahmet Alacaoglu, Luca Viano, Niao He, Volkan Cevher
Our sample complexities also match the best-known results for global convergence of policy gradient and two time-scale actor-critic algorithms in the single agent setting.
no code implementations • 29 Sep 2021 • Jun Song, Chaoyue Zhao, Niao He
Trust-region methods based on Kullback-Leibler divergence are pervasively used to stabilize policy optimization in reinforcement learning.
no code implementations • 8 Jun 2021 • Semih Cayci, Niao He, R. Srikant
Furthermore, under mild regularity conditions on the concentrability coefficient and basis vectors, we prove that entropy-regularized NPG exhibits \emph{linear convergence} up to a function approximation error.
no code implementations • 29 Mar 2021 • Siqi Zhang, Junchi Yang, Cristóbal Guzmán, Negar Kiyavash, Niao He
In the averaged smooth finite-sum setting, our proposed algorithm improves over previous algorithms by providing a nearly-tight dependence on the condition number.
no code implementations • 14 Mar 2021 • Donghwan Lee, Niao He, Seungjae Lee, Panagiota Karava, Jianghai Hu
The building sector is the largest energy consumer in the world, and there has been considerable research interest in the energy consumption and comfort management of buildings.
no code implementations • 2 Mar 2021 • Semih Cayci, Siddhartha Satpathi, Niao He, R. Srikant
In this paper, we study the dynamics of temporal difference learning with neural network-based value function approximation over a general state space, namely, \emph{Neural TD learning}.
no code implementations • 17 Feb 2021 • Donghwan Lee, Jianghai Hu, Niao He
Based on these two systems, we derive a new finite-time error bound of asynchronous Q-learning when a constant stepsize is used.
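For context, the asynchronous Q-learning iterate with a constant stepsize that such bounds describe looks, in its generic tabular form, like this (a standard sketch, not the paper's notation):

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One asynchronous tabular Q-learning update with a constant stepsize alpha.
    Only the visited (s, a) entry is updated, which is what makes the
    iteration 'asynchronous'."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q
```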
no code implementations • NeurIPS 2020 • Junchi Yang, Negar Kiyavash, Niao He
Nonconvex minimax problems appear frequently in emerging machine learning applications, such as generative adversarial networks and adversarial learning.
no code implementations • NeurIPS 2020 • Donghwan Lee, Niao He
This paper develops a novel and unified framework to analyze the convergence of a large family of Q-learning algorithms from the switching system perspective.
no code implementations • NeurIPS 2020 • Yingxiang Yang, Negar Kiyavash, Le Song, Niao He
Macroscopic data aggregated from microscopic events are pervasive in machine learning, such as country-level COVID-19 infection statistics based on city-level data.
no code implementations • NeurIPS 2020 • Junchi Yang, Siqi Zhang, Negar Kiyavash, Niao He
We introduce a generic \emph{two-loop} scheme for smooth minimax optimization with strongly-convex-concave objectives.
no code implementations • NeurIPS 2020 • Yifan Hu, Siqi Zhang, Xin Chen, Niao He
Conditional stochastic optimization covers a variety of applications ranging from invariant learning and causal inference to meta-learning.
1 code implementation • NeurIPS 2020 • Wentao Weng, Harsh Gupta, Niao He, Lei Ying, R. Srikant
In this paper, we establish a theoretical comparison between the asymptotic mean-squared error of Double Q-learning and Q-learning.
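For reference, the tabular Double Q-learning update being compared with standard Q-learning can be sketched as follows (the standard form; details such as the coin flip are conventional rather than taken from the paper):

```python
import numpy as np

def double_q_update(Qa, Qb, s, a, r, s_next, alpha=0.1, gamma=0.99, rng=np.random):
    """One tabular Double Q-learning step: two estimators, one selects the
    argmax action and the other evaluates it, which reduces the overestimation
    bias of standard Q-learning (at the cost of different variance behavior)."""
    if rng.random() < 0.5:                      # update Qa, using Qb as evaluator
        a_star = np.argmax(Qa[s_next])
        Qa[s, a] += alpha * (r + gamma * Qb[s_next, a_star] - Qa[s, a])
    else:                                       # symmetric update of Qb
        a_star = np.argmax(Qb[s_next])
        Qb[s, a] += alpha * (r + gamma * Qa[s_next, a_star] - Qb[s, a])
    return Qa, Qb
```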
no code implementations • 25 Feb 2020 • Yifan Hu, Siqi Zhang, Xin Chen, Niao He
Conditional Stochastic Optimization (CSO) covers a variety of applications ranging from meta-learning and causal inference to invariant learning.
no code implementations • L4DC 2020 • Donghwan Lee, Niao He
The use of target networks is a common practice in deep reinforcement learning for stabilizing the training; however, theoretical understanding of this technique is still limited.
no code implementations • 22 Feb 2020 • Junchi Yang, Negar Kiyavash, Niao He
Nonconvex minimax problems appear frequently in emerging machine learning applications, such as generative adversarial networks and adversarial learning.
no code implementations • 4 Dec 2019 • Donghwan Lee, Niao He
In this paper, we introduce a unified framework for analyzing a large family of Q-learning algorithms, based on switching system perspectives and ODE-based stochastic approximation.
no code implementations • NeurIPS 2019 • Yingxiang Yang, Haoxiang Wang, Negar Kiyavash, Niao He
The nonparametric learning of positive-valued functions appears widely in machine learning, especially in the context of estimating intensity functions of point processes.
no code implementations • 1 Dec 2019 • Donghwan Lee, Niao He, Parameswaran Kamalaruban, Volkan Cevher
This article reviews recent advances in multi-agent reinforcement learning algorithms that learn to communicate and cooperate, with applications to large-scale control systems and communication networks.
no code implementations • 28 May 2019 • Yifan Hu, Xin Chen, Niao He
In this paper, we study a class of stochastic optimization problems, referred to as \emph{Conditional Stochastic Optimization} (CSO), of the form $\min_{x \in \mathcal{X}} \mathbb{E}_{\xi}\, f_\xi\big(\mathbb{E}_{\eta|\xi}[g_\eta(x,\xi)]\big)$, which finds a wide spectrum of applications including portfolio selection, reinforcement learning, robust learning, and causal inference.
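A natural (and necessarily biased) way to estimate the gradient of this nested objective is to average an inner batch before applying the chain rule; the sketch below is a generic illustration with hypothetical sampler and derivative arguments, not necessarily the estimator analyzed in the paper:

```python
import numpy as np

def biased_cso_gradient(x, sample_xi, sample_eta_given_xi, g, grad_g, f_prime, m=16):
    """Hypothetical nested estimator of the CSO gradient at x: draw one outer
    sample xi, estimate the inner expectation E[g | xi] with an i.i.d. batch of
    size m, then apply the chain rule (scalar inner value assumed). The inner
    average makes the estimator biased, with bias shrinking as m grows."""
    xi = sample_xi()
    etas = [sample_eta_given_xi(xi) for _ in range(m)]
    inner = np.mean([g(x, xi, eta) for eta in etas], axis=0)            # ~ E[g_eta(x, xi)]
    inner_grad = np.mean([grad_g(x, xi, eta) for eta in etas], axis=0)  # ~ E[grad_x g_eta]
    return f_prime(inner, xi) * inner_grad                              # chain rule
```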
1 code implementation • NeurIPS 2019 • Bo Dai, Zhen Liu, Hanjun Dai, Niao He, Arthur Gretton, Le Song, Dale Schuurmans
We present an efficient algorithm for maximum likelihood estimation (MLE) of exponential family models, with a general parametrization of the energy function that includes neural networks.
no code implementations • 24 Apr 2019 • Donghwan Lee, Niao He
The use of target networks has been a popular and key component of recent deep Q-learning algorithms for reinforcement learning, yet little is known from the theory side.
no code implementations • 26 Feb 2019 • Pan Li, Niao He, Olgica Milenkovic
We introduce a new convex optimization problem, termed quadratic decomposable submodular function minimization (QDSFM), which can model a number of learning tasks on graphs and hypergraphs.
no code implementations • NeurIPS 2018 • Yingxiang Yang, Bo Dai, Negar Kiyavash, Niao He
Approximate Bayesian computation (ABC) is an important methodology for Bayesian inference when the likelihood function is intractable.
1 code implementation • NeurIPS 2018 • Bo Dai, Hanjun Dai, Niao He, Weiyang Liu, Zhen Liu, Jianshu Chen, Lin Xiao, Le Song
This flexible function class couples the variational distribution with the original parameters in the graphical models, allowing end-to-end learning of the graphical models by back-propagation through the variational distribution.
1 code implementation • 6 Nov 2018 • Bo Dai, Hanjun Dai, Arthur Gretton, Le Song, Dale Schuurmans, Niao He
We investigate penalized maximum log-likelihood estimation for exponential family distributions whose natural parameter resides in a reproducing kernel Hilbert space.
1 code implementation • NeurIPS 2018 • Pan Li, Niao He, Olgica Milenkovic
The problem is closely related to decomposable submodular function minimization and arises in many learning settings on graphs and hypergraphs, such as graph-based semi-supervised learning and PageRank.
no code implementations • 25 Jan 2018 • Yingxiang Yang, Jalal Etesami, Niao He, Negar Kiyavash
In this paper, we design a nonparametric online algorithm for estimating the triggering functions of multivariate Hawkes processes.
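For readers unfamiliar with the model, the conditional intensity of a multivariate Hawkes process, whose triggering functions $\phi_{ij}$ are the objects being estimated, takes the standard form:

```latex
% Baseline rate plus excitation from all past events, where \phi_{ij} is the
% triggering function from dimension j to dimension i.
\[
  \lambda_i(t) \;=\; \mu_i \;+\; \sum_{j=1}^{d} \sum_{t_{j,k} < t} \phi_{ij}\!\left(t - t_{j,k}\right).
\]
```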
no code implementations • ICLR 2018 • Bo Dai, Albert Shaw, Niao He, Lihong Li, Le Song
This paper proposes a new actor-critic-style algorithm called Dual Actor-Critic or Dual-AC.
no code implementations • ICML 2018 • Bo Dai, Albert Shaw, Lihong Li, Lin Xiao, Niao He, Zhen Liu, Jianshu Chen, Le Song
When function approximation is used, solving the Bellman optimality equation with stability guarantees has remained a major open problem in reinforcement learning for decades.
no code implementations • NeurIPS 2017 • Yingxiang Yang, Jalal Etesami, Niao He, Negar Kiyavash
We develop a nonparametric and online learning algorithm that estimates the triggering functions of a multivariate Hawkes process (MHP).
2 code implementations • ICML 2017 • Bo Dai, Ruiqi Guo, Sanjiv Kumar, Niao He, Le Song
Learning-based binary hashing has become a powerful paradigm for fast search and retrieval in massive databases.
no code implementations • 3 Aug 2016 • Niao He, Zaid Harchaoui, Yichen Wang, Le Song
Since almost all gradient-based optimization algorithms rely on Lipschitz continuity, optimizing Poisson likelihood models with a guarantee of convergence can be challenging, especially for large-scale problems.
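To see the difficulty, consider a generic Poisson log-likelihood with a linear intensity (an illustrative form, not necessarily the paper's exact objective):

```latex
\[
  \ell(w) \;=\; \sum_{i} \Big( y_i \log\!\big(w^{\top} x_i\big) \;-\; w^{\top} x_i \Big),
  \qquad
  \nabla \ell(w) \;=\; \sum_{i} \Big( \frac{y_i}{w^{\top} x_i} - 1 \Big)\, x_i .
\]
% The gradient blows up as w^T x_i -> 0, so it is not globally Lipschitz,
% which is the difficulty alluded to above.
```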
no code implementations • 15 Jul 2016 • Bo Dai, Niao He, Yunpeng Pan, Byron Boots, Le Song
In such problems, each sample $x$ itself is associated with a conditional distribution $p(z|x)$ represented by samples $\{z_i\}_{i=1}^M$, and the goal is to learn a function $f$ that links these conditional distributions to target values $y$.
no code implementations • NeurIPS 2015 • Nan Du, Yichen Wang, Niao He, Jimeng Sun, Le Song
By making personalized suggestions, a recommender system plays a crucial role in improving the engagement of users in modern web services.
no code implementations • NeurIPS 2015 • Niao He, Zaid Harchaoui
We propose a new first-order optimization algorithm to solve high-dimensional non-smooth composite minimization problems.
no code implementations • 9 Jun 2015 • Bo Dai, Niao He, Hanjun Dai, Le Song
Bayesian methods are appealing for their flexibility in modeling complex data and their ability to capture uncertainty in parameters.
1 code implementation • NeurIPS 2014 • Bo Dai, Bo Xie, Niao He, YIngyu Liang, Anant Raj, Maria-Florina Balcan, Le Song
The general perception is that kernel methods are not scalable, and neural nets are the methods of choice for nonlinear learning problems.
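Random features are a common ingredient in such scalable kernel methods; the following is a generic random Fourier feature sketch for the RBF kernel (illustrative only, not the paper's exact procedure):

```python
import numpy as np

def random_fourier_features(X, n_features=256, gamma=1.0, seed=0):
    """Random Fourier features approximating the RBF kernel
    k(x, y) = exp(-gamma * ||x - y||^2): the inner product of the features
    approximates the kernel, letting linear solvers stand in for kernel
    machines at scale."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))  # spectral samples
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)                # random phases
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Usage: phi = random_fourier_features(X); then phi @ phi.T approximates the kernel matrix.
```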