no code implementations • ICML 2020 • Chi Jin, Tiancheng Jin, Haipeng Luo, Suvrit Sra, Tiancheng Yu

We consider the task of learning in episodic finite-horizon Markov decision processes with an unknown transition function, bandit feedback, and adversarial losses.

no code implementations • 6 Jun 2024 • Jiachen Hu, Qinghua Liu, Chi Jin

Despite the remarkable success of Transformer-based architectures in various sequential modeling tasks, such as natural language processing, computer vision, and robotics, their ability to learn basic sequential models, like Hidden Markov Models (HMMs), is still unclear.

no code implementations • 6 Jun 2024 • Jiawei Ge, Yuanhao Wang, Wenzhe Li, Chi Jin

Multiplayer games, when the number of players exceeds two, present unique challenges that fundamentally distinguish them from the extensively studied two-player zero-sum games.

no code implementations • 4 Jun 2024 • Wenzhe Li, Zihan Ding, Seth Karten, Chi Jin

Recent advances in reinforcement learning (RL) heavily rely on a variety of well-designed benchmarks, which provide environmental platforms and consistent criteria to evaluate existing and novel algorithms.

Multi-agent Reinforcement Learning
reinforcement-learning
**+1**

no code implementations • 12 Feb 2024 • Ahmed Khaled, Chi Jin

For the task of finding a stationary point of a smooth and potentially nonconvex function, we give a variant of SGD that matches the best-known high-probability convergence rate for tuned SGD at only an additional polylogarithmic cost.

no code implementations • 27 Nov 2023 • Jiawei Ge, Shange Tang, Jianqing Fan, Cong Ma, Chi Jin

This paper addresses this fundamental question by proving that, surprisingly, classical Maximum Likelihood Estimation (MLE) purely using source data (without any modification) achieves the minimax optimality for covariate shift under the well-specified setting.

no code implementations • 13 Oct 2023 • Viraj Nadkarni, Jiachen Hu, Ranvir Rana, Chi Jin, Sanjeev Kulkarni, Pramod Viswanath

This ensures that the market maker balances losses to informed traders with profits from noise traders.

1 code implementation • 29 Sep 2023 • Zihan Ding, Chi Jin

We propose to apply the consistency model as an efficient yet expressive policy representation, namely consistency policy, with an actor-critic style algorithm for three typical RL settings: offline, offline-to-online and online.

no code implementations • 25 Jun 2023 • Yuanhao Wang, Qinghua Liu, Chi Jin

This paper theoretically proves that, for a wide range of preference models, we can solve preference-based RL directly using existing algorithms and techniques for reward-based RL, with small or no extra costs.

no code implementations • 22 Jun 2023 • Bowen Yi, Chi Jin, Lei Wang, Guodong Shi, Viorela Ila, Ian R. Manchester

This paper introduces a new linear parameterization to the problem of visual inertial simultaneous localization and mapping (VI-SLAM) -- without any approximation -- for the case only using information from a single monocular camera and an inertial measurement unit.

1 code implementation • NeurIPS 2023 • Ahmed Khaled, Konstantin Mishchenko, Chi Jin

This paper proposes a new easy-to-implement parameter-free gradient-based optimizer: DoWG (Distance over Weighted Gradients).

no code implementations • NeurIPS 2023 • Qinghua Liu, Gellért Weisz, András György, Chi Jin, Csaba Szepesvári

While policy optimization algorithms have played an important role in recent empirical success of Reinforcement Learning (RL), the existing theoretical understanding of policy optimization remains rather limited -- they are either restricted to tabular MDPs or suffer from highly suboptimal sample complexity, especial in online RL where exploration is necessary.

no code implementations • 10 Apr 2023 • Zihan Ding, Yuanpei Chen, Allen Z. Ren, Shixiang Shane Gu, Qianxu Wang, Hao Dong, Chi Jin

Generating human-like behavior on robots is a great challenge especially in dexterous manipulation tasks with robotic hands.

no code implementations • 2 Mar 2023 • Jiawei Ge, Shange Tang, Jianqing Fan, Chi Jin

Unsupervised pretraining, which learns a useful representation using a large amount of unlabeled data to facilitate the learning of downstream tasks, is a critical component of modern large-scale machine learning systems.

no code implementations • 13 Feb 2023 • Yuanhao Wang, Qinghua Liu, Yu Bai, Chi Jin

A unique challenge in Multi-Agent Reinforcement Learning (MARL) is the curse of multiagency, where the description length of the game as well as the complexity of many existing learning algorithms scale exponentially with the number of agents.

no code implementations • 9 Feb 2023 • Hadi Daneshmand, Jason D. Lee, Chi Jin

Particle gradient descent, which uses particles to represent a probability measure and performs gradient descent on particles in parallel, is widely used to optimize functions of probability measures.

no code implementations • 30 Oct 2022 • Chengzhuo Ni, Yuda Song, Xuezhou Zhang, Chi Jin, Mengdi Wang

To our best knowledge, this is the first sample-efficient algorithm for multi-agent general-sum Markov games that incorporates (non-linear) function approximation.

no code implementations • 27 Oct 2022 • Jiachen Hu, Han Zhong, Chi Jin, LiWei Wang

Sim-to-real transfer trains RL agents in the simulated environments and then deploys them in the real world.

no code implementations • 20 Oct 2022 • Yuanhao Wang, Dingwen Kong, Yu Bai, Chi Jin

This paper develops the first line of efficient algorithms for learning rationalizable Coarse Correlated Equilibria (CCE) and Correlated Equilibria (CE) whose sample complexities are polynomial in all problem parameters including the number of players.

no code implementations • 29 Sep 2022 • Qinghua Liu, Praneeth Netrapalli, Csaba Szepesvári, Chi Jin

We prove that OMLE learns the near-optimal policies of an enormously rich class of sequential decision making problems in a polynomial number of samples.

no code implementations • 6 Sep 2022 • Ahmed Khaled, Chi Jin

Federated learning (FL) is a subfield of machine learning where multiple clients try to collaboratively learn a model over a network under communication constraints.

2 code implementations • 18 Jul 2022 • Zihan Ding, DiJia Su, Qinghua Liu, Chi Jin

This paper proposes new, end-to-end deep reinforcement learning algorithms for learning two-player zero-sum Markov games.

no code implementations • 2 Jun 2022 • Qinghua Liu, Csaba Szepesvári, Chi Jin

This paper considers the challenging tasks of Multi-Agent Reinforcement Learning (MARL) under partial observability, where each agent only sees her own individual observations and actions that reveal incomplete information about the underlying state of system.

Multi-agent Reinforcement Learning
reinforcement-learning
**+1**

no code implementations • 30 May 2022 • Yu Bai, Chi Jin, Song Mei, Ziang Song, Tiancheng Yu

A conceptually appealing approach for learning Extensive-Form Games (EFGs) is to convert them to Normal-Form Games (NFGs).

no code implementations • 19 Apr 2022 • Qinghua Liu, Alan Chung, Csaba Szepesvári, Chi Jin

Applications of Reinforcement Learning (RL), in which agents learn to make a sequence of decisions despite lacking complete information about the latent states of the controlled system, that is, they act under partial observability of the states, are ubiquitous.

Partially Observable Reinforcement Learning
reinforcement-learning
**+1**

no code implementations • 14 Mar 2022 • Qinghua Liu, Yuanhao Wang, Chi Jin

When the policies of the opponents are not revealed, we prove a statistical hardness result even in the most favorable scenario when both above conditions are true.

no code implementations • 8 Feb 2022 • Yonathan Efroni, Chi Jin, Akshay Krishnamurthy, Sobhan Miryoosefi

Real-world sequential decision making problems commonly involve partial observability, which requires the agent to maintain a memory of history in order to infer the latent states, plan and make good decisions.

no code implementations • 3 Feb 2022 • Yu Bai, Chi Jin, Song Mei, Tiancheng Yu

This improves upon the best known sample complexity of $\widetilde{\mathcal{O}}((X^2A+Y^2B)/\varepsilon^2)$ by a factor of $\widetilde{\mathcal{O}}(\max\{X, Y\})$, and matches the information-theoretic lower bound up to logarithmic factors.

no code implementations • 23 Dec 2021 • Bowen Yi, Chi Jin, Ian R. Manchester

The design of a globally convergent position observer for feature points from visual information is a challenging problem, especially for the case with only inertial measurements and without assumptions of uniform observability, which remained open for a long time.

no code implementations • 27 Oct 2021 • Chi Jin, Qinghua Liu, Yuanhao Wang, Tiancheng Yu

We design a new class of fully decentralized algorithms -- V-learning, which provably learns Nash equilibria (in the two-player zero-sum setting), correlated equilibria and coarse correlated equilibria (in the multiplayer general-sum setting) in a number of samples that only scales with $\max_{i\in[m]} A_i$, where $A_i$ is the number of actions for the $i^{\rm th}$ player.

no code implementations • ICLR 2022 • Xiaoyu Chen, Jiachen Hu, Chi Jin, Lihong Li, LiWei Wang

Reinforcement learning encounters many challenges when applied directly in the real world.

no code implementations • 12 Jul 2021 • Sobhan Miryoosefi, Chi Jin

In constrained reinforcement learning (RL), a learning agent seeks to not only optimize the overall reward but also satisfy the additional safety, diversity, or budget constraints.

no code implementations • 7 Jun 2021 • Chi Jin, Qinghua Liu, Tiancheng Yu

Modern reinforcement learning (RL) commonly engages practical problems with large state spaces, where function approximation must be deployed to approximate either the value function or the policy.

1 code implementation • ICLR 2022 • Tanner Fiez, Chi Jin, Praneeth Netrapalli, Lillian J. Ratliff

This paper considers minimax optimization $\min_x \max_y f(x, y)$ in the challenging setting where $f$ can be both nonconvex in $x$ and nonconcave in $y$.

no code implementations • 25 Mar 2021 • Yaqi Duan, Chi Jin, Zhiyuan Li

Concretely, we view the Bellman error as a surrogate loss for the optimality gap, and prove the followings: (1) In double sampling regime, the excess risk of Empirical Risk Minimizer (ERM) is bounded by the Rademacher complexity of the function class.

no code implementations • NeurIPS 2021 • Yu Bai, Chi Jin, Huan Wang, Caiming Xiong

Real world applications such as economics and policy making often involve solving multi-agent games with two unique features: (1) The agents are inherently asymmetric and partitioned into leaders and followers; (2) The agents have different reward functions, thus the game is general-sum.

no code implementations • 8 Feb 2021 • Jiachen Hu, Xiaoyu Chen, Chi Jin, Lihong Li, LiWei Wang

This paper studies representation learning for multi-task linear bandits and multi-task episodic RL with linear value function approximation.

no code implementations • 4 Feb 2021 • Mo Zhou, Rong Ge, Chi Jin

We show that as long as the loss is already lower than a threshold (polynomial in relevant parameters), all student neurons in an over-parameterized two-layer neural network will converge to one of teacher neurons, and the loss will go to 0.

no code implementations • NeurIPS 2021 • Chi Jin, Qinghua Liu, Sobhan Miryoosefi

Finding the minimal structural assumptions that empower sample-efficient learning is one of the most important research directions in Reinforcement Learning (RL).

no code implementations • ICLR 2021 • Dipendra Misra, Qinghua Liu, Chi Jin, John Langford

We propose a novel setting for reinforcement learning that combines two common real-world difficulties: presence of observations (such as camera images) and factored states (such as location of objects).

no code implementations • NeurIPS 2020 • Zhuoran Yang, Chi Jin, Zhaoran Wang, Mengdi Wang, Michael Jordan

Reinforcement learning (RL) algorithms combined with modern function approximators such as kernel functions and deep neural networks have achieved significant empirical successes in large-scale application problems with a massive number of states.

no code implementations • 9 Nov 2020 • Zhuoran Yang, Chi Jin, Zhaoran Wang, Mengdi Wang, Michael I. Jordan

The classical theory of reinforcement learning (RL) has focused on tabular and linear representations of value functions.

no code implementations • 4 Oct 2020 • Qinghua Liu, Tiancheng Yu, Yu Bai, Chi Jin

However, for multi-agent reinforcement learning in Markov games, the current best known sample complexity for model-based algorithms is rather suboptimal and compares unfavorably against recent model-free approaches.

Model-based Reinforcement Learning
Multi-agent Reinforcement Learning
**+2**

no code implementations • NeurIPS 2020 • Yu Bai, Chi Jin, Tiancheng Yu

This paper considers the problem of designing optimal algorithms for reinforcement learning in two-player zero-sum games.

no code implementations • NeurIPS 2020 • Chi Jin, Sham M. Kakade, Akshay Krishnamurthy, Qinghua Liu

Partial observability is a common challenge in many reinforcement learning applications, which requires an agent to maintain memory, infer latent states, and integrate this past information into exploration.

no code implementations • NeurIPS 2020 • Nilesh Tripuraneni, Michael. I. Jordan, Chi Jin

Formally, we consider $t+1$ tasks parameterized by functions of the form $f_j \circ h$ in a general function class $\mathcal{F} \circ \mathcal{H}$, where each $f_j$ is a task-specific function in $\mathcal{F}$ and $h$ is the shared representation in $\mathcal{H}$.

1 code implementation • 26 Feb 2020 • Nilesh Tripuraneni, Chi Jin, Michael. I. Jordan

In this paper, we focus on the problem of multi-task linear regression -- in which multiple linear regression models share a common, low-dimensional linear representation.

no code implementations • ICML 2020 • Yu Bai, Chi Jin

We introduce a self-play algorithm---Value Iteration with Upper/Lower Confidence Bound (VI-ULCB)---and show that it achieves regret $\tilde{\mathcal{O}}(\sqrt{T})$ after playing $T$ steps of the game, where the regret is measured by the agent's performance against a \emph{fully adversarial} opponent who can exploit the agent's strategy at \emph{any} step.

no code implementations • ICML 2020 • Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu

We give an efficient algorithm that conducts $\tilde{\mathcal{O}}(S^2A\mathrm{poly}(H)/\epsilon^2)$ episodes of exploration and returns $\epsilon$-suboptimal policies for an arbitrary number of reward functions.

no code implementations • 5 Feb 2020 • Tianyi Lin, Chi Jin, Michael. I. Jordan

This paper presents the first algorithm with $\tilde{O}(\sqrt{\kappa_{\mathbf x}\kappa_{\mathbf y}})$ gradient complexity, matching the lower bound up to logarithmic factors.

no code implementations • ICML 2020 • Qi Cai, Zhuoran Yang, Chi Jin, Zhaoran Wang

While policy-based reinforcement learning (RL) achieves tremendous successes in practice, it is significantly less understood in theory, especially compared with value-based RL.

no code implementations • 3 Dec 2019 • Chi Jin, Tiancheng Jin, Haipeng Luo, Suvrit Sra, Tiancheng Yu

We consider the problem of learning in episodic finite-horizon Markov decision processes with an unknown transition function, bandit feedback, and adversarial losses.

2 code implementations • 11 Jul 2019 • Chi Jin, Zhuoran Yang, Zhaoran Wang, Michael. I. Jordan

Modern Reinforcement Learning (RL) is commonly applied to practical problems with an enormous number of states, where function approximation must be deployed to approximate either the value function or the policy.

no code implementations • ICML 2020 • Tianyi Lin, Chi Jin, Michael. I. Jordan

We consider nonconvex-concave minimax problems, $\min_{\mathbf{x}} \max_{\mathbf{y} \in \mathcal{Y}} f(\mathbf{x}, \mathbf{y})$, where $f$ is nonconvex in $\mathbf{x}$ but concave in $\mathbf{y}$ and $\mathcal{Y}$ is a convex and bounded set.

no code implementations • 13 Feb 2019 • Chi Jin, Praneeth Netrapalli, Rong Ge, Sham M. Kakade, Michael. I. Jordan

More recent theory has shown that GD and SGD can avoid saddle points, but the dependence on dimension in these analyses is polynomial.

no code implementations • 11 Feb 2019 • Chi Jin, Praneeth Netrapalli, Rong Ge, Sham M. Kakade, Michael. I. Jordan

In this note, we derive concentration inequalities for random vectors with subGaussian norm (a generalization of both subGaussian random vectors and norm bounded random vectors), which are tight up to logarithmic factors.

1 code implementation • ICML 2020 • Chi Jin, Praneeth Netrapalli, Michael. I. Jordan

Minimax optimization has found extensive applications in modern machine learning, in settings such as generative adversarial networks (GANs), adversarial training and multi-agent reinforcement learning.

BIG-bench Machine Learning Multi-agent Reinforcement Learning

no code implementations • 20 Nov 2018 • Yi-An Ma, Yuansi Chen, Chi Jin, Nicolas Flammarion, Michael. I. Jordan

Optimization algorithms and Monte Carlo sampling algorithms have provided the computational foundations for the rapid growth in applications of statistical machine learning in recent years.

1 code implementation • NeurIPS 2018 • Chi Jin, Zeyuan Allen-Zhu, Sebastien Bubeck, Michael. I. Jordan

We prove that, in an episodic MDP setting, Q-learning with UCB exploration achieves regret $\tilde{O}(\sqrt{H^3 SAT})$, where $S$ and $A$ are the numbers of states and actions, $H$ is the number of steps per episode, and $T$ is the total number of steps.

no code implementations • 4 Apr 2018 • Yuansi Chen, Chi Jin, Bin Yu

Applying existing stability upper bounds for the gradient methods in our trade-off framework, we obtain lower bounds matching the well-established convergence upper bounds up to constants for these algorithms and conjecture similar lower bounds for NAG and HB.

no code implementations • NeurIPS 2018 • Chi Jin, Lydia T. Liu, Rong Ge, Michael. I. Jordan

Our objective is to find the $\epsilon$-approximate local minima of the underlying function $F$ while avoiding the shallow local minima---arising because of the tolerance $\nu$---which exist only in $f$.

no code implementations • 28 Nov 2017 • Chi Jin, Praneeth Netrapalli, Michael. I. Jordan

Nesterov's accelerated gradient descent (AGD), an instance of the general family of "momentum methods", provably achieves faster convergence rate than gradient descent (GD) in the convex setting.

no code implementations • NeurIPS 2018 • Nilesh Tripuraneni, Mitchell Stern, Chi Jin, Jeffrey Regier, Michael. I. Jordan

This paper proposes a stochastic variant of a classic algorithm---the cubic-regularized Newton method [Nesterov and Polyak 2006].

no code implementations • NeurIPS 2017 • Simon S. Du, Chi Jin, Jason D. Lee, Michael. I. Jordan, Barnabas Poczos, Aarti Singh

Although gradient descent (GD) almost always escapes saddle points asymptotically [Lee et al., 2016], this paper shows that even with fairly natural random initialization schemes and non-pathological functions, GD can be significantly slowed down by saddle points, taking exponential time to escape.

no code implementations • ICML 2017 • Rong Ge, Chi Jin, Yi Zheng

In this paper we develop a new framework that captures the common landscape underlying the common non-convex low-rank matrix problems including matrix sensing, matrix completion and robust PCA.

no code implementations • ICML 2017 • Chi Jin, Rong Ge, Praneeth Netrapalli, Sham M. Kakade, Michael. I. Jordan

This paper shows that a perturbed form of gradient descent converges to a second-order stationary point in a number iterations which depends only poly-logarithmically on dimension (i. e., it is almost "dimension-free").

no code implementations • NeurIPS 2016 • Chi Jin, Yuchen Zhang, Sivaraman Balakrishnan, Martin J. Wainwright, Michael Jordan

Our first main result shows that the population likelihood function has bad local maxima even in the special case of equally-weighted mixtures of well-separated and spherical Gaussians.

no code implementations • 26 May 2016 • Dan Garber, Elad Hazan, Chi Jin, Sham M. Kakade, Cameron Musco, Praneeth Netrapalli, Aaron Sidford

We give faster algorithms and improved sample complexities for estimating the top eigenvector of a matrix $\Sigma$ -- i. e. computing a unit vector $x$ such that $x^T \Sigma x \ge (1-\epsilon)\lambda_1(\Sigma)$: Offline Eigenvector Estimation: Given an explicit $A \in \mathbb{R}^{n \times d}$ with $\Sigma = A^TA$, we show how to compute an $\epsilon$ approximate top eigenvector in time $\tilde O([nnz(A) + \frac{d*sr(A)}{gap^2} ]* \log 1/\epsilon )$ and $\tilde O([\frac{nnz(A)^{3/4} (d*sr(A))^{1/4}}{\sqrt{gap}} ] * \log 1/\epsilon )$.

no code implementations • NeurIPS 2016 • Chi Jin, Sham M. Kakade, Praneeth Netrapalli

While existing algorithms are efficient for the offline setting, they could be highly inefficient for the online setting.

no code implementations • 13 Apr 2016 • Rong Ge, Chi Jin, Sham M. Kakade, Praneeth Netrapalli, Aaron Sidford

Our algorithm is linear in the input size and the number of components $k$ up to a $\log(k)$ factor.

no code implementations • 22 Feb 2016 • Prateek Jain, Chi Jin, Sham M. Kakade, Praneeth Netrapalli, Aaron Sidford

This work provides improved guarantees for streaming principle component analysis (PCA).

no code implementations • 29 Oct 2015 • Chi Jin, Sham M. Kakade, Cameron Musco, Praneeth Netrapalli, Aaron Sidford

Combining our algorithm with previous work to initialize $x_0$, we obtain a number of improved sample complexity and runtime results.

1 code implementation • 6 Mar 2015 • Rong Ge, Furong Huang, Chi Jin, Yang Yuan

To the best of our knowledge this is the first work that gives global convergence guarantees for stochastic gradient descent on non-convex functions with exponentially many local minima and saddle points.

no code implementations • 6 Jan 2014 • Chi Jin, Ziteng Wang, Junliang Huang, Yiqiao Zhong, Li-Wei Wang

We develop an $\epsilon$-differentially private mechanism for the class of $K$-smooth queries.

no code implementations • NeurIPS 2012 • Chi Jin, Li-Wei Wang

We show that our bound is strictly sharper than a previously well-known PAC-Bayes margin bound if the feature space is of finite dimension; and the two bounds tend to be equivalent as the dimension goes to infinity.

Cannot find the paper you are looking for? You can
Submit a new open access paper.

Contact us on:
hello@paperswithcode.com
.
Papers With Code is a free resource with all data licensed under CC-BY-SA.