Search Results for author: Chi Jin

Found 66 papers, 6 papers with code

Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition

no code implementations ICML 2020 Chi Jin, Tiancheng Jin, Haipeng Luo, Suvrit Sra, Tiancheng Yu

We consider the task of learning in episodic finite-horizon Markov decision processes with an unknown transition function, bandit feedback, and adversarial losses.

DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method

no code implementations 25 May 2023 Ahmed Khaled, Konstantin Mishchenko, Chi Jin

It is also the first parameter-free AdaGrad style algorithm that adapts to smooth optimization.

Optimistic Natural Policy Gradient: a Simple Efficient Policy Optimization Framework for Online RL

no code implementations 18 May 2023 Qinghua Liu, Gellért Weisz, András György, Chi Jin, Csaba Szepesvári

While policy optimization algorithms have played an important role in the recent empirical success of Reinforcement Learning (RL), the existing theoretical understanding of policy optimization remains rather limited -- they are either restricted to tabular MDPs or suffer from highly suboptimal sample complexity, especially in online RL where exploration is necessary.

Reinforcement Learning (RL)

Learning a Universal Human Prior for Dexterous Manipulation from Human Preference

no code implementations 10 Apr 2023 Zihan Ding, Yuanpei Chen, Allen Z. Ren, Shixiang Shane Gu, Hao Dong, Chi Jin

Generating human-like behavior on robots is a great challenge especially in dexterous manipulation tasks with robotic hands.

Robot Manipulation

On the Provable Advantage of Unsupervised Pretraining

no code implementations 2 Mar 2023 Jiawei Ge, Shange Tang, Jianqing Fan, Chi Jin

Unsupervised pretraining, which learns a useful representation using a large amount of unlabeled data to facilitate the learning of downstream tasks, is a critical component of modern large-scale machine learning systems.

Contrastive Learning Representation Learning

Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation

no code implementations 13 Feb 2023 Yuanhao Wang, Qinghua Liu, Yu Bai, Chi Jin

A unique challenge in Multi-Agent Reinforcement Learning (MARL) is the curse of multiagency, where the description length of the game as well as the complexity of many existing learning algorithms scale exponentially with the number of agents.

Multi-agent Reinforcement Learning

Efficient displacement convex optimization with particle gradient descent

no code implementations 9 Feb 2023 Hadi Daneshmand, Jason D. Lee, Chi Jin

Particle gradient descent, which uses particles to represent a probability measure and performs gradient descent on particles in parallel, is widely used to optimize functions of probability measures.

Representation Learning for General-sum Low-rank Markov Games

no code implementations 30 Oct 2022 Chengzhuo Ni, Yuda Song, Xuezhou Zhang, Chi Jin, Mengdi Wang

To the best of our knowledge, this is the first sample-efficient algorithm for multi-agent general-sum Markov games that incorporates (non-linear) function approximation.

Representation Learning

Provable Sim-to-real Transfer in Continuous Domain with Partial Observations

no code implementations 27 Oct 2022 Jiachen Hu, Han Zhong, Chi Jin, LiWei Wang

Sim-to-real transfer trains RL agents in simulated environments and then deploys them in the real world.

Learning Rationalizable Equilibria in Multiplayer Games

no code implementations 20 Oct 2022 Yuanhao Wang, Dingwen Kong, Yu Bai, Chi Jin

This paper develops the first line of efficient algorithms for learning rationalizable Coarse Correlated Equilibria (CCE) and Correlated Equilibria (CE) whose sample complexities are polynomial in all problem parameters including the number of players.

Optimistic MLE -- A Generic Model-based Algorithm for Partially Observable Sequential Decision Making

no code implementations 29 Sep 2022 Qinghua Liu, Praneeth Netrapalli, Csaba Szepesvári, Chi Jin

We prove that OMLE learns the near-optimal policies of an enormously rich class of sequential decision making problems in a polynomial number of samples.

Algorithm Decision Making +2

Faster federated optimization under second-order similarity

no code implementations 6 Sep 2022 Ahmed Khaled, Chi Jin

Federated learning (FL) is a subfield of machine learning where multiple clients try to collaboratively learn a model over a network under communication constraints.

Federated Learning

A Deep Reinforcement Learning Approach for Finding Non-Exploitable Strategies in Two-Player Atari Games

2 code implementations 18 Jul 2022 Zihan Ding, DiJia Su, Qinghua Liu, Chi Jin

This paper proposes new, end-to-end deep reinforcement learning algorithms for learning two-player zero-sum Markov games.

Atari Games Q-Learning

Sample-Efficient Reinforcement Learning of Partially Observable Markov Games

no code implementations 2 Jun 2022 Qinghua Liu, Csaba Szepesvári, Chi Jin

This paper considers the challenging tasks of Multi-Agent Reinforcement Learning (MARL) under partial observability, where each agent only sees her own individual observations and actions that reveal incomplete information about the underlying state of the system.

Multi-agent Reinforcement Learning reinforcement-learning +1

Efficient Phi-Regret Minimization in Extensive-Form Games via Online Mirror Descent

no code implementations 30 May 2022 Yu Bai, Chi Jin, Song Mei, Ziang Song, Tiancheng Yu

A conceptually appealing approach for learning Extensive-Form Games (EFGs) is to convert them to Normal-Form Games (NFGs).

When Is Partially Observable Reinforcement Learning Not Scary?

no code implementations 19 Apr 2022 Qinghua Liu, Alan Chung, Csaba Szepesvári, Chi Jin

Applications of Reinforcement Learning (RL), in which agents learn to make a sequence of decisions despite lacking complete information about the latent states of the controlled system, that is, they act under partial observability of the states, are ubiquitous.

Partially Observable Reinforcement Learning reinforcement-learning +1

Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits

no code implementations 14 Mar 2022 Qinghua Liu, Yuanhao Wang, Chi Jin

When the policies of the opponents are not revealed, we prove a statistical hardness result even in the most favorable scenario when both above conditions are true.

Provable Reinforcement Learning with a Short-Term Memory

no code implementations 8 Feb 2022 Yonathan Efroni, Chi Jin, Akshay Krishnamurthy, Sobhan Miryoosefi

Real-world sequential decision making problems commonly involve partial observability, which requires the agent to maintain a memory of history in order to infer the latent states, plan and make good decisions.

Decision Making reinforcement-learning +1

Near-Optimal Learning of Extensive-Form Games with Imperfect Information

no code implementations 3 Feb 2022 Yu Bai, Chi Jin, Song Mei, Tiancheng Yu

This improves upon the best known sample complexity of $\widetilde{\mathcal{O}}((X^2A+Y^2B)/\varepsilon^2)$ by a factor of $\widetilde{\mathcal{O}}(\max\{X, Y\})$, and matches the information-theoretic lower bound up to logarithmic factors.

Globally convergent visual-feature range estimation with biased inertial measurements

no code implementations 23 Dec 2021 Bowen Yi, Chi Jin, Ian R. Manchester

The design of a globally convergent position observer for feature points from visual information is a challenging problem, especially for the case with only inertial measurements and without assumptions of uniform observability, which remained open for a long time.

V-Learning -- A Simple, Efficient, Decentralized Algorithm for Multiagent RL

no code implementations 27 Oct 2021 Chi Jin, Qinghua Liu, Yuanhao Wang, Tiancheng Yu

We design a new class of fully decentralized algorithms -- V-learning, which provably learns Nash equilibria (in the two-player zero-sum setting), correlated equilibria and coarse correlated equilibria (in the multiplayer general-sum setting) in a number of samples that only scales with $\max_{i\in[m]} A_i$, where $A_i$ is the number of actions for the $i^{\rm th}$ player.

Algorithm Medical Visual Question Answering +1

A Simple Reward-free Approach to Constrained Reinforcement Learning

no code implementations 12 Jul 2021 Sobhan Miryoosefi, Chi Jin

In constrained reinforcement learning (RL), a learning agent seeks to not only optimize the overall reward but also satisfy the additional safety, diversity, or budget constraints.

reinforcement-learning Reinforcement Learning (RL)

The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces

no code implementations 7 Jun 2021 Chi Jin, Qinghua Liu, Tiancheng Yu

Modern reinforcement learning (RL) commonly engages practical problems with large state spaces, where function approximation must be deployed to approximate either the value function or the policy.

Reinforcement Learning (RL)

Minimax Optimization with Smooth Algorithmic Adversaries

1 code implementation ICLR 2022 Tanner Fiez, Chi Jin, Praneeth Netrapalli, Lillian J. Ratliff

This paper considers minimax optimization $\min_x \max_y f(x, y)$ in the challenging setting where $f$ can be both nonconvex in $x$ and nonconcave in $y$.

Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning

no code implementations 25 Mar 2021 Yaqi Duan, Chi Jin, Zhiyuan Li

Concretely, we view the Bellman error as a surrogate loss for the optimality gap, and prove the following: (1) In the double sampling regime, the excess risk of the Empirical Risk Minimizer (ERM) is bounded by the Rademacher complexity of the function class.

Learning Theory reinforcement-learning +1

Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games

no code implementations NeurIPS 2021 Yu Bai, Chi Jin, Huan Wang, Caiming Xiong

Real world applications such as economics and policy making often involve solving multi-agent games with two unique features: (1) The agents are inherently asymmetric and partitioned into leaders and followers; (2) The agents have different reward functions, thus the game is general-sum.

Near-optimal Representation Learning for Linear Bandits and Linear RL

no code implementations 8 Feb 2021 Jiachen Hu, Xiaoyu Chen, Chi Jin, Lihong Li, LiWei Wang

This paper studies representation learning for multi-task linear bandits and multi-task episodic RL with linear value function approximation.

Representation Learning

A Local Convergence Theory for Mildly Over-Parameterized Two-Layer Neural Network

no code implementations 4 Feb 2021 Mo Zhou, Rong Ge, Chi Jin

We show that as long as the loss is already lower than a threshold (polynomial in relevant parameters), all student neurons in an over-parameterized two-layer neural network will converge to one of the teacher neurons, and the loss will go to 0.

Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms

no code implementations NeurIPS 2021 Chi Jin, Qinghua Liu, Sobhan Miryoosefi

Finding the minimal structural assumptions that empower sample-efficient learning is one of the most important research directions in Reinforcement Learning (RL).

Reinforcement Learning (RL)

Provable Rich Observation Reinforcement Learning with Combinatorial Latent States

no code implementations ICLR 2021 Dipendra Misra, Qinghua Liu, Chi Jin, John Langford

We propose a novel setting for reinforcement learning that combines two common real-world difficulties: presence of observations (such as camera images) and factored states (such as location of objects).

Contrastive Learning reinforcement-learning +1

Provably Efficient Reinforcement Learning with Kernel and Neural Function Approximations

no code implementations NeurIPS 2020 Zhuoran Yang, Chi Jin, Zhaoran Wang, Mengdi Wang, Michael Jordan

Reinforcement learning (RL) algorithms combined with modern function approximators such as kernel functions and deep neural networks have achieved significant empirical successes in large-scale application problems with a massive number of states.

reinforcement-learning Reinforcement Learning (RL)

On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces

no code implementations 9 Nov 2020 Zhuoran Yang, Chi Jin, Zhaoran Wang, Mengdi Wang, Michael I. Jordan

The classical theory of reinforcement learning (RL) has focused on tabular and linear representations of value functions.

Reinforcement Learning (RL)

A Sharp Analysis of Model-based Reinforcement Learning with Self-Play

no code implementations 4 Oct 2020 Qinghua Liu, Tiancheng Yu, Yu Bai, Chi Jin

However, for multi-agent reinforcement learning in Markov games, the current best known sample complexity for model-based algorithms is rather suboptimal and compares unfavorably against recent model-free approaches.

Model-based Reinforcement Learning Multi-agent Reinforcement Learning +2

Sample-Efficient Reinforcement Learning of Undercomplete POMDPs

no code implementations NeurIPS 2020 Chi Jin, Sham M. Kakade, Akshay Krishnamurthy, Qinghua Liu

Partial observability is a common challenge in many reinforcement learning applications, which requires an agent to maintain memory, infer latent states, and integrate this past information into exploration.

reinforcement-learning Reinforcement Learning (RL)

Near-Optimal Reinforcement Learning with Self-Play

no code implementations NeurIPS 2020 Yu Bai, Chi Jin, Tiancheng Yu

This paper considers the problem of designing optimal algorithms for reinforcement learning in two-player zero-sum games.

Q-Learning reinforcement-learning +1

On the Theory of Transfer Learning: The Importance of Task Diversity

no code implementations NeurIPS 2020 Nilesh Tripuraneni, Michael I. Jordan, Chi Jin

Formally, we consider $t+1$ tasks parameterized by functions of the form $f_j \circ h$ in a general function class $\mathcal{F} \circ \mathcal{H}$, where each $f_j$ is a task-specific function in $\mathcal{F}$ and $h$ is the shared representation in $\mathcal{H}$.

Representation Learning Transfer Learning

Provable Meta-Learning of Linear Representations

1 code implementation 26 Feb 2020 Nilesh Tripuraneni, Chi Jin, Michael I. Jordan

In this paper, we focus on the problem of multi-task linear regression -- in which multiple linear regression models share a common, low-dimensional linear representation.

Meta-Learning regression +1

Provable Self-Play Algorithms for Competitive Reinforcement Learning

no code implementations ICML 2020 Yu Bai, Chi Jin

We introduce a self-play algorithm---Value Iteration with Upper/Lower Confidence Bound (VI-ULCB)---and show that it achieves regret $\tilde{\mathcal{O}}(\sqrt{T})$ after playing $T$ steps of the game, where the regret is measured by the agent's performance against a \emph{fully adversarial} opponent who can exploit the agent's strategy at \emph{any} step.

reinforcement-learning Reinforcement Learning (RL)

Reward-Free Exploration for Reinforcement Learning

no code implementations ICML 2020 Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu

We give an efficient algorithm that conducts $\tilde{\mathcal{O}}(S^2A\mathrm{poly}(H)/\epsilon^2)$ episodes of exploration and returns $\epsilon$-suboptimal policies for an arbitrary number of reward functions.

reinforcement-learning Reinforcement Learning (RL)

Near-Optimal Algorithms for Minimax Optimization

no code implementations 5 Feb 2020 Tianyi Lin, Chi Jin, Michael I. Jordan

This paper presents the first algorithm with $\tilde{O}(\sqrt{\kappa_{\mathbf x}\kappa_{\mathbf y}})$ gradient complexity, matching the lower bound up to logarithmic factors.

Provably Efficient Exploration in Policy Optimization

no code implementations ICML 2020 Qi Cai, Zhuoran Yang, Chi Jin, Zhaoran Wang

While policy-based reinforcement learning (RL) achieves tremendous successes in practice, it is significantly less understood in theory, especially compared with value-based RL.

Efficient Exploration Reinforcement Learning (RL)

Learning Adversarial MDPs with Bandit Feedback and Unknown Transition

no code implementations 3 Dec 2019 Chi Jin, Tiancheng Jin, Haipeng Luo, Suvrit Sra, Tiancheng Yu

We consider the problem of learning in episodic finite-horizon Markov decision processes with an unknown transition function, bandit feedback, and adversarial losses.

Provably Efficient Reinforcement Learning with Linear Function Approximation

2 code implementations 11 Jul 2019 Chi Jin, Zhuoran Yang, Zhaoran Wang, Michael I. Jordan

Modern Reinforcement Learning (RL) is commonly applied to practical problems with an enormous number of states, where function approximation must be deployed to approximate either the value function or the policy.

reinforcement-learning Reinforcement Learning (RL)

On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems

no code implementations ICML 2020 Tianyi Lin, Chi Jin, Michael I. Jordan

We consider nonconvex-concave minimax problems, $\min_{\mathbf{x}} \max_{\mathbf{y} \in \mathcal{Y}} f(\mathbf{x}, \mathbf{y})$, where $f$ is nonconvex in $\mathbf{x}$ but concave in $\mathbf{y}$ and $\mathcal{Y}$ is a convex and bounded set.

On Nonconvex Optimization for Machine Learning: Gradients, Stochasticity, and Saddle Points

no code implementations 13 Feb 2019 Chi Jin, Praneeth Netrapalli, Rong Ge, Sham M. Kakade, Michael I. Jordan

More recent theory has shown that GD and SGD can avoid saddle points, but the dependence on dimension in these analyses is polynomial.

BIG-bench Machine Learning

A Short Note on Concentration Inequalities for Random Vectors with SubGaussian Norm

no code implementations 11 Feb 2019 Chi Jin, Praneeth Netrapalli, Rong Ge, Sham M. Kakade, Michael I. Jordan

In this note, we derive concentration inequalities for random vectors with subGaussian norm (a generalization of both subGaussian random vectors and norm bounded random vectors), which are tight up to logarithmic factors.

What is Local Optimality in Nonconvex-Nonconcave Minimax Optimization?

1 code implementation ICML 2020 Chi Jin, Praneeth Netrapalli, Michael I. Jordan

Minimax optimization has found extensive applications in modern machine learning, in settings such as generative adversarial networks (GANs), adversarial training and multi-agent reinforcement learning.

BIG-bench Machine Learning Multi-agent Reinforcement Learning

Sampling Can Be Faster Than Optimization

no code implementations 20 Nov 2018 Yi-An Ma, Yuansi Chen, Chi Jin, Nicolas Flammarion, Michael I. Jordan

Optimization algorithms and Monte Carlo sampling algorithms have provided the computational foundations for the rapid growth in applications of statistical machine learning in recent years.

Is Q-learning Provably Efficient?

no code implementations NeurIPS 2018 Chi Jin, Zeyuan Allen-Zhu, Sebastien Bubeck, Michael I. Jordan

We prove that, in an episodic MDP setting, Q-learning with UCB exploration achieves regret $\tilde{O}(\sqrt{H^3 SAT})$, where $S$ and $A$ are the numbers of states and actions, $H$ is the number of steps per episode, and $T$ is the total number of steps.

Q-Learning Reinforcement Learning (RL)

Stability and Convergence Trade-off of Iterative Optimization Algorithms

no code implementations 4 Apr 2018 Yuansi Chen, Chi Jin, Bin Yu

Applying existing stability upper bounds for the gradient methods in our trade-off framework, we obtain lower bounds matching the well-established convergence upper bounds up to constants for these algorithms and conjecture similar lower bounds for NAG and HB.

On the Local Minima of the Empirical Risk

no code implementations NeurIPS 2018 Chi Jin, Lydia T. Liu, Rong Ge, Michael I. Jordan

Our objective is to find the $\epsilon$-approximate local minima of the underlying function $F$ while avoiding the shallow local minima---arising because of the tolerance $\nu$---which exist only in $f$.

Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent

no code implementations 28 Nov 2017 Chi Jin, Praneeth Netrapalli, Michael I. Jordan

Nesterov's accelerated gradient descent (AGD), an instance of the general family of "momentum methods", provably achieves a faster convergence rate than gradient descent (GD) in the convex setting.

Stochastic Cubic Regularization for Fast Nonconvex Optimization

no code implementations NeurIPS 2018 Nilesh Tripuraneni, Mitchell Stern, Chi Jin, Jeffrey Regier, Michael I. Jordan

This paper proposes a stochastic variant of a classic algorithm---the cubic-regularized Newton method [Nesterov and Polyak 2006].

Gradient Descent Can Take Exponential Time to Escape Saddle Points

no code implementations NeurIPS 2017 Simon S. Du, Chi Jin, Jason D. Lee, Michael I. Jordan, Barnabas Poczos, Aarti Singh

Although gradient descent (GD) almost always escapes saddle points asymptotically [Lee et al., 2016], this paper shows that even with fairly natural random initialization schemes and non-pathological functions, GD can be significantly slowed down by saddle points, taking exponential time to escape.

No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis

no code implementations ICML 2017 Rong Ge, Chi Jin, Yi Zheng

In this paper we develop a new framework that captures the common landscape underlying the common non-convex low-rank matrix problems including matrix sensing, matrix completion and robust PCA.

Matrix Completion

How to Escape Saddle Points Efficiently

no code implementations ICML 2017 Chi Jin, Rong Ge, Praneeth Netrapalli, Sham M. Kakade, Michael I. Jordan

This paper shows that a perturbed form of gradient descent converges to a second-order stationary point in a number of iterations which depends only poly-logarithmically on dimension (i.e., it is almost "dimension-free").

Local Maxima in the Likelihood of Gaussian Mixture Models: Structural Results and Algorithmic Consequences

no code implementations NeurIPS 2016 Chi Jin, Yuchen Zhang, Sivaraman Balakrishnan, Martin J. Wainwright, Michael Jordan

Our first main result shows that the population likelihood function has bad local maxima even in the special case of equally-weighted mixtures of well-separated and spherical Gaussians.

Faster Eigenvector Computation via Shift-and-Invert Preconditioning

no code implementations 26 May 2016 Dan Garber, Elad Hazan, Chi Jin, Sham M. Kakade, Cameron Musco, Praneeth Netrapalli, Aaron Sidford

We give faster algorithms and improved sample complexities for estimating the top eigenvector of a matrix $\Sigma$, i.e., computing a unit vector $x$ such that $x^T \Sigma x \ge (1-\epsilon)\lambda_1(\Sigma)$. Offline Eigenvector Estimation: Given an explicit $A \in \mathbb{R}^{n \times d}$ with $\Sigma = A^TA$, we show how to compute an $\epsilon$-approximate top eigenvector in time $\tilde{O}\left(\left[\mathrm{nnz}(A) + \frac{d \cdot \mathrm{sr}(A)}{\mathrm{gap}^2}\right] \cdot \log 1/\epsilon\right)$ and $\tilde{O}\left(\left[\frac{\mathrm{nnz}(A)^{3/4} (d \cdot \mathrm{sr}(A))^{1/4}}{\sqrt{\mathrm{gap}}}\right] \cdot \log 1/\epsilon\right)$.

Stochastic Optimization

Provable Efficient Online Matrix Completion via Non-convex Stochastic Gradient Descent

no code implementations NeurIPS 2016 Chi Jin, Sham M. Kakade, Praneeth Netrapalli

While existing algorithms are efficient for the offline setting, they could be highly inefficient for the online setting.

Matrix Completion

Robust Shift-and-Invert Preconditioning: Faster and More Sample Efficient Algorithms for Eigenvector Computation

no code implementations 29 Oct 2015 Chi Jin, Sham M. Kakade, Cameron Musco, Praneeth Netrapalli, Aaron Sidford

Combining our algorithm with previous work to initialize $x_0$, we obtain a number of improved sample complexity and runtime results.

Stochastic Optimization

Escaping From Saddle Points --- Online Stochastic Gradient for Tensor Decomposition

1 code implementation 6 Mar 2015 Rong Ge, Furong Huang, Chi Jin, Yang Yuan

To the best of our knowledge, this is the first work that gives global convergence guarantees for stochastic gradient descent on non-convex functions with exponentially many local minima and saddle points.

Tensor Decomposition

Differentially Private Data Releasing for Smooth Queries with Synthetic Database Output

no code implementations 6 Jan 2014 Chi Jin, Ziteng Wang, Junliang Huang, Yiqiao Zhong, Li-Wei Wang

We develop an $\epsilon$-differentially private mechanism for the class of $K$-smooth queries.

Dimensionality Dependent PAC-Bayes Margin Bound

no code implementations NeurIPS 2012 Chi Jin, Li-Wei Wang

We show that our bound is strictly sharper than a previously well-known PAC-Bayes margin bound if the feature space is of finite dimension; and the two bounds tend to be equivalent as the dimension goes to infinity.

Model Selection
