Search Results for author: Qiwen Cui

Found 16 papers, 3 papers with code

Learning Optimal Tax Design in Nonatomic Congestion Games

no code implementations • 12 Feb 2024 • Qiwen Cui, Maryam Fazel, Simon S. Du

We study how to learn the optimal tax design to maximize the efficiency in nonatomic congestion games.

Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning

1 code implementation • 30 Oct 2023 • Zhaoyi Zhou, Chuning Zhu, Runlong Zhou, Qiwen Cui, Abhishek Gupta, Simon Shaolei Du

Off-policy dynamic programming (DP) techniques such as $Q$-learning have proven to be important in sequential decision-making problems.

Decision Making Offline RL +1

A Black-box Approach for Non-stationary Multi-agent Reinforcement Learning

no code implementations • 12 Jun 2023 • Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du

Specifically, we focus on games with bandit feedback, where testing an equilibrium can result in substantial regret even when the gap to be tested is small, and the existence of multiple optimal solutions (equilibria) in stationary games poses extra challenges.

Multi-agent Reinforcement Learning reinforcement-learning

Breaking the Curse of Multiagents in a Large State Space: RL in Markov Games with Independent Linear Function Approximation

no code implementations • 7 Feb 2023 • Qiwen Cui, Kaiqing Zhang, Simon S. Du

In contrast, existing works for Markov games with function approximation have sample complexity bounds that scale with the size of the joint action space when specialized to the canonical tabular Markov game setting, which is exponentially large in the number of agents.

Multi-agent Reinforcement Learning

Offline congestion games: How feedback type affects data coverage requirement

no code implementations • 24 Oct 2022 • Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du

Starting from the facility-level (a.k.a. semi-bandit) feedback, we propose a novel one-unit deviation coverage condition and give a pessimism-type algorithm that can recover an approximate NE.

Vocal Bursts Type Prediction

Learning in Congestion Games with Bandit Feedback

no code implementations • 4 Jun 2022 • Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du

We propose a centralized algorithm for Markov congestion games, whose sample complexity again has only polynomial dependence on all relevant problem parameters, but not the size of the action set.

Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus

no code implementations • 1 Jun 2022 • Qiwen Cui, Simon S. Du

Furthermore, for offline multi-agent general-sum Markov games, based on the strategy-wise bonus and a novel surrogate function, we give the first algorithm whose sample complexity scales only with $\sum_{i=1}^m A_i$, where $A_i$ is the action size of the $i$-th player and $m$ is the number of players.

Multi-agent Reinforcement Learning reinforcement-learning +1

On Gap-dependent Bounds for Offline Reinforcement Learning

no code implementations • 1 Jun 2022 • Xinqi Wang, Qiwen Cui, Simon S. Du

This paper presents a systematic study on gap-dependent sample complexity in offline reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

Randomized Exploration for Reinforcement Learning with General Value Function Approximation

1 code implementation • 15 Jun 2021 • Haque Ishfaq, Qiwen Cui, Viet Nguyen, Alex Ayoub, Zhuoran Yang, Zhaoran Wang, Doina Precup, Lin F. Yang

We propose a model-free reinforcement learning algorithm inspired by the popular randomized least squares value iteration (RLSVI) algorithm as well as the optimism principle.

reinforcement-learning Reinforcement Learning (RL)

Near-Optimal Randomized Exploration for Tabular Markov Decision Processes

no code implementations • 19 Feb 2021 • Zhihan Xiong, Ruoqi Shen, Qiwen Cui, Maryam Fazel, Simon S. Du

To achieve the desired result, we develop 1) a new clipping operation to ensure both the probability of being optimistic and the probability of being pessimistic are lower bounded by a constant, and 2) a new recursive formula for the absolute value of estimation errors to analyze the regret.

Minimax Sample Complexity for Turn-based Stochastic Game

no code implementations • 29 Nov 2020 • Qiwen Cui, Lin F. Yang

The empirical success of multi-agent reinforcement learning is encouraging, but few theoretical guarantees have been established.

Multi-agent Reinforcement Learning reinforcement-learning +1
