Search Results for author: Qinghua Liu

Found 20 papers, 4 papers with code

Is RLHF More Difficult than Standard RL?

no code implementations25 Jun 2023 Yuanhao Wang, Qinghua Liu, Chi Jin

This paper theoretically proves that, for a wide range of preference models, preference-based RL can be solved directly using existing algorithms and techniques for reward-based RL, at little or no extra cost.
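
One standard preference model covered by reductions of this kind is the Bradley-Terry model, where pairwise preferences are generated from an underlying reward. The snippet below is an illustrative sketch of that link (my construction, not code from the paper):

```python
# Hedged illustration: under a Bradley-Terry preference model, the probability
# of preferring trajectory a over trajectory b depends only on the reward gap,
# which is the structure that lets reward-based RL machinery be reused.
import numpy as np

def preference_prob(r_a: float, r_b: float) -> float:
    # P(a preferred over b) = sigmoid(r_a - r_b)
    return 1.0 / (1.0 + np.exp(-(r_a - r_b)))

print(preference_prob(2.0, 1.0))  # ~0.73: the higher-reward trajectory wins more often
```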

reinforcement-learning Reinforcement Learning (RL)

Optimistic Natural Policy Gradient: a Simple Efficient Policy Optimization Framework for Online RL

no code implementations NeurIPS 2023 Qinghua Liu, Gellért Weisz, András György, Chi Jin, Csaba Szepesvári

While policy optimization algorithms have played an important role in the recent empirical success of Reinforcement Learning (RL), the existing theoretical understanding of policy optimization remains rather limited: known guarantees are either restricted to tabular MDPs or suffer from highly suboptimal sample complexity, especially in online RL where exploration is necessary.

Reinforcement Learning (RL)

Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation

no code implementations13 Feb 2023 Yuanhao Wang, Qinghua Liu, Yu Bai, Chi Jin

A unique challenge in Multi-Agent Reinforcement Learning (MARL) is the curse of multiagency, where both the description length of the game and the complexity of many existing learning algorithms scale exponentially with the number of agents.
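
A quick back-of-the-envelope calculation (illustrative numbers, not from the paper) makes the exponential blow-up concrete:

```python
# With m agents, each having A actions, a joint-action description has A**m
# entries, while decentralized approaches aim to scale with max_i A_i only.
m, A = 10, 5
print(A ** m)  # 9765625 joint actions for just 10 agents with 5 actions each
print(A)       # 5 = max_i A_i, the target scale for decentralized methods
```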

Multi-agent Reinforcement Learning

Optimistic MLE -- A Generic Model-based Algorithm for Partially Observable Sequential Decision Making

no code implementations29 Sep 2022 Qinghua Liu, Praneeth Netrapalli, Csaba Szepesvári, Chi Jin

We prove that OMLE learns near-optimal policies for an enormously rich class of sequential decision making problems in a polynomial number of samples.
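
As a toy, hedged illustration of the optimistic-MLE principle (my construction on a one-armed Bernoulli estimation problem; the paper targets far richer partially observable settings), one keeps every parameter whose log-likelihood is within a slack $\beta$ of the maximum and then acts under the most optimistic candidate:

```python
import numpy as np

def log_lik(theta, pulls, wins):
    # Bernoulli log-likelihood of observing `wins` successes in `pulls` trials.
    eps = 1e-9
    return wins * np.log(theta + eps) + (pulls - wins) * np.log(1.0 - theta + eps)

candidates = np.linspace(0.05, 0.95, 19)  # candidate success probabilities
pulls, wins = 20, 8                       # observed data
ll = log_lik(candidates, pulls, wins)
beta = 2.0                                # confidence slack
conf_set = candidates[ll >= ll.max() - beta]  # all near-maximum-likelihood models
print(conf_set.max())  # optimism: act as if the best candidate in the set is true
```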

Decision Making Model-based Reinforcement Learning +1

Dive into Big Model Training

1 code implementation25 Jul 2022 Qinghua Liu, Yuxiang Jiang

We summarize the existing training methodologies into three main categories: training parallelism, memory-saving technologies, and model sparsity design.
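
As one minimal example from the memory-saving category (my illustration, not code from the accompanying repository), activation checkpointing trades recomputation for lower peak memory:

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class CheckpointedMLP(nn.Module):
    def __init__(self, dim: int = 1024, depth: int = 8):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(depth)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for block in self.blocks:
            # Activations are recomputed during backward instead of stored,
            # cutting peak memory at the cost of extra compute.
            x = checkpoint(block, x, use_reentrant=False)
        return x

model = CheckpointedMLP()
loss = model(torch.randn(4, 1024, requires_grad=True)).sum()
loss.backward()
```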

Self-Supervised Learning

A Deep Reinforcement Learning Approach for Finding Non-Exploitable Strategies in Two-Player Atari Games

2 code implementations18 Jul 2022 Zihan Ding, DiJia Su, Qinghua Liu, Chi Jin

This paper proposes new, end-to-end deep reinforcement learning algorithms for learning two-player zero-sum Markov games.

Atari Games Q-Learning

Policy Optimization for Markov Games: Unified Framework and Faster Convergence

no code implementations6 Jun 2022 Runyu Zhang, Qinghua Liu, Huan Wang, Caiming Xiong, Na Li, Yu Bai

Next, we show that this framework, instantiated with the Optimistic Follow-The-Regularized-Leader (OFTRL) algorithm at each state (and smooth value updates), can find an $\widetilde{\mathcal{O}}(T^{-5/6})$-approximate NE in $T$ iterations, and that a similar algorithm with a slightly modified value update rule achieves a faster $\widetilde{\mathcal{O}}(T^{-1})$ convergence rate.
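
A minimal single-state sketch of the OFTRL update (assuming an entropy regularizer and using the most recent loss as the optimistic prediction; not the paper's full algorithm, which couples such updates with smooth value updates across states):

```python
import numpy as np

def oftrl_softmax(cum_loss, last_loss, eta):
    # Optimistic FTRL with entropy regularizer: the most recent loss is added
    # once more as a prediction of the next one.
    logits = -eta * (cum_loss + last_loss)
    logits -= logits.max()  # numerical stability
    p = np.exp(logits)
    return p / p.sum()

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))            # payoff matrix of a zero-sum game
cum_x = np.zeros(3); cum_y = np.zeros(3)   # cumulative losses per player
last_x = np.zeros(3); last_y = np.zeros(3)
avg_x = np.zeros(3); avg_y = np.zeros(3)
T = 1000
for t in range(T):
    x = oftrl_softmax(cum_x, last_x, eta=0.5)
    y = oftrl_softmax(cum_y, last_y, eta=0.5)
    last_x, last_y = A @ y, -A.T @ x       # loss vectors observed by each player
    cum_x += last_x; cum_y += last_y
    avg_x += x; avg_y += y
print(avg_x / T, avg_y / T)                # averaged play approximates a NE
```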

Multi-agent Reinforcement Learning

Sample-Efficient Reinforcement Learning of Partially Observable Markov Games

no code implementations2 Jun 2022 Qinghua Liu, Csaba Szepesvári, Chi Jin

This paper considers the challenging task of Multi-Agent Reinforcement Learning (MARL) under partial observability, where each agent sees only her own observations and actions, which reveal incomplete information about the underlying state of the system.

Multi-agent Reinforcement Learning reinforcement-learning +1

When Is Partially Observable Reinforcement Learning Not Scary?

no code implementations19 Apr 2022 Qinghua Liu, Alan Chung, Csaba Szepesvári, Chi Jin

Applications of Reinforcement Learning (RL) in which agents learn to make a sequence of decisions despite lacking complete information about the latent states of the controlled system (that is, they act under partial observability of the states) are ubiquitous.

Partially Observable Reinforcement Learning reinforcement-learning +1

Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits

no code implementations14 Mar 2022 Qinghua Liu, Yuanhao Wang, Chi Jin

When the policies of the opponents are not revealed, we prove a statistical hardness result even in the most favorable scenario when both above conditions are true.

LiMuSE: Lightweight Multi-modal Speaker Extraction

1 code implementation7 Nov 2021 Qinghua Liu, Yating Huang, Yunzhe Hao, Jiaming Xu, Bo Xu

Multi-modal cues, including spatial information, facial expression, and voiceprint, are introduced into the speech separation and speaker extraction tasks as complementary information to achieve better performance.
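
A schematic sketch of the multi-modal fusion idea (hedged: hypothetical dimensions and module names, not the released LiMuSE architecture), where time-aligned audio and visual streams are fused with an utterance-level voiceprint embedding:

```python
import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    def __init__(self, d_audio=256, d_visual=128, d_voice=64, d_out=256):
        super().__init__()
        self.proj = nn.Conv1d(d_audio + d_visual + d_voice, d_out, kernel_size=1)

    def forward(self, audio, visual, voice):
        # audio, visual: (batch, channels, time); voice: (batch, channels).
        # Broadcast the utterance-level voiceprint across time, then fuse.
        voice = voice.unsqueeze(-1).expand(-1, -1, audio.shape[-1])
        return self.proj(torch.cat([audio, visual, voice], dim=1))

fuse = FusionBlock()
out = fuse(torch.randn(2, 256, 100), torch.randn(2, 128, 100), torch.randn(2, 64))
print(out.shape)  # torch.Size([2, 256, 100])
```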

Model Compression Quantization +1

V-Learning -- A Simple, Efficient, Decentralized Algorithm for Multiagent RL

no code implementations27 Oct 2021 Chi Jin, Qinghua Liu, Yuanhao Wang, Tiancheng Yu

We design a new class of fully decentralized algorithms -- V-learning, which provably learns Nash equilibria (in the two-player zero-sum setting), correlated equilibria and coarse correlated equilibria (in the multiplayer general-sum setting) in a number of samples that only scales with $\max_{i\in[m]} A_i$, where $A_i$ is the number of actions for the $i^{\rm th}$ player.
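
A minimal sketch of the per-state machinery behind such algorithms (hedged: schematic constants and a simplified bandit subroutine, not the paper's exact pseudocode), combining an adversarial bandit over the agent's own actions with an optimistic incremental value update:

```python
import numpy as np

class VLearningState:
    """V-learning-style bookkeeping for one agent at one state."""
    def __init__(self, n_actions: int, H: int = 10, c_bonus: float = 1.0):
        self.theta = np.zeros(n_actions)  # cumulative importance-weighted losses
        self.V = float(H)                 # optimistic initial value
        self.t = 0
        self.H, self.c = H, c_bonus

    def policy(self, eta: float = 0.1):
        p = np.exp(-eta * self.theta)     # exponential-weights bandit policy
        return p / p.sum()

    def update(self, action: int, reward: float, next_V: float):
        self.t += 1
        alpha = (self.H + 1) / (self.H + self.t)        # incremental step size
        p = self.policy()
        loss = (self.H - (reward + next_V)) / self.H    # rescaled to [0, 1]
        self.theta[action] += loss / p[action]          # importance-weighted estimate
        bonus = self.c * np.sqrt(self.H ** 3 / self.t)  # optimism bonus
        self.V = (1 - alpha) * self.V + alpha * min(self.H, reward + next_V + bonus)

agent = VLearningState(n_actions=4)
a = int(np.random.default_rng(1).choice(4, p=agent.policy()))
agent.update(action=a, reward=1.0, next_V=5.0)
print(agent.V, agent.policy())
```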

Medical Visual Question Answering Q-Learning

The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces

no code implementations7 Jun 2021 Chi Jin, Qinghua Liu, Tiancheng Yu

Modern reinforcement learning (RL) commonly engages practical problems with large state spaces, where function approximation must be deployed to approximate either the value function or the policy.

Reinforcement Learning (RL)

Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms

no code implementations NeurIPS 2021 Chi Jin, Qinghua Liu, Sobhan Miryoosefi

Finding the minimal structural assumptions that empower sample-efficient learning is one of the most important research directions in Reinforcement Learning (RL).

Reinforcement Learning (RL)

Provable Rich Observation Reinforcement Learning with Combinatorial Latent States

no code implementations ICLR 2021 Dipendra Misra, Qinghua Liu, Chi Jin, John Langford

We propose a novel setting for reinforcement learning that combines two common real-world difficulties: the presence of rich observations (such as camera images) and factored states (such as the locations of objects).

Contrastive Learning reinforcement-learning +1

A Tight Lower Bound for Uniformly Stable Algorithms

no code implementations24 Dec 2020 Qinghua Liu, Zhou Lu

In this paper, we fill the gap by proving a tight generalization lower bound of order $\Omega(\gamma+\frac{L}{\sqrt{n}})$, which matches the best known upper bound up to logarithmic factors.
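
For context, the rate can be placed next to the best known high-probability upper bound for $\gamma$-uniformly-stable algorithms (my paraphrase of the comparison; $\gamma$ is the stability parameter, $L$ the loss range, $n$ the sample size):

```latex
\underbrace{\Omega\!\left(\gamma + \frac{L}{\sqrt{n}}\right)}_{\text{lower bound (this paper)}}
\qquad \text{vs.} \qquad
\underbrace{O\!\left(\gamma \log n + \frac{L}{\sqrt{n}}\right)}_{\text{best known upper bound}}
```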

Generalization Bounds Learning Theory

A Sharp Analysis of Model-based Reinforcement Learning with Self-Play

no code implementations4 Oct 2020 Qinghua Liu, Tiancheng Yu, Yu Bai, Chi Jin

However, for multi-agent reinforcement learning in Markov games, the current best known sample complexity for model-based algorithms is rather suboptimal and compares unfavorably against recent model-free approaches.

Model-based Reinforcement Learning Multi-agent Reinforcement Learning +2

Tackling the Objective Inconsistency Problem in Heterogeneous Federated Optimization

1 code implementation NeurIPS 2020 Jianyu Wang, Qinghua Liu, Hao Liang, Gauri Joshi, H. Vincent Poor

In federated optimization, heterogeneity in the clients' local datasets and computation speeds results in large variations in the number of local updates performed by each client in each communication round.
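
A minimal sketch of a normalized-averaging fix in the spirit of the paper (hedged: hypothetical helper names, and specialized to vanilla local SGD, where the natural normalizer is simply each client's number of local steps):

```python
import numpy as np

def normalized_aggregate(global_w, client_ws, local_steps, weights=None):
    # Normalize each client's cumulative update by its number of local steps,
    # so clients that ran more updates do not skew the global objective.
    m = len(client_ws)
    weights = np.full(m, 1.0 / m) if weights is None else np.asarray(weights)
    dirs = [(global_w - w_i) / tau_i for w_i, tau_i in zip(client_ws, local_steps)]
    tau_eff = float(weights @ np.asarray(local_steps))  # effective number of steps
    d = sum(p_i * d_i for p_i, d_i in zip(weights, dirs))
    return global_w - tau_eff * d

w = np.zeros(5)
locals_ = [w - 0.1 * tau * np.ones(5) for tau in (2, 5, 20)]  # toy client results
print(normalized_aggregate(w, locals_, local_steps=[2, 5, 20]))
```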

Sample-Efficient Reinforcement Learning of Undercomplete POMDPs

no code implementations NeurIPS 2020 Chi Jin, Sham M. Kakade, Akshay Krishnamurthy, Qinghua Liu

Partial observability is a common challenge in many reinforcement learning applications, which requires an agent to maintain memory, infer latent states, and integrate this past information into exploration.

reinforcement-learning Reinforcement Learning (RL)

Rigorous Restricted Isometry Property of Low-Dimensional Subspaces

no code implementations30 Jan 2018 Gen Li, Qinghua Liu, Yuantao Gu

In analogy to the JL lemma and the RIP for sparse vectors, this work allows the use of random projections to reduce the ambient dimension, with the theoretical guarantee that the distance between subspaces is well preserved after compression.
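
A quick numerical illustration of the guarantee (my construction, illustrative dimensions): compress two random low-dimensional subspaces with a single Gaussian projection and compare the distance between them before and after:

```python
import numpy as np

def orth(A):
    q, _ = np.linalg.qr(A)  # orthonormal basis for the column span of A
    return q

def subspace_dist(U, V):
    # Frobenius distance between the orthogonal projectors onto the subspaces.
    return np.linalg.norm(U @ U.T - V @ V.T)

rng = np.random.default_rng(0)
N, n, d = 500, 40, 5  # ambient dim, compressed dim, subspace dim
U = orth(rng.standard_normal((N, d)))
V = orth(rng.standard_normal((N, d)))
Phi = rng.standard_normal((n, N)) / np.sqrt(n)  # random Gaussian projection
print(subspace_dist(U, V), subspace_dist(orth(Phi @ U), orth(Phi @ V)))
# The two distances come out approximately equal, as the guarantee predicts.
```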

Dimensionality Reduction LEMMA
