no code implementations • 30 Sep 2024 • Laixi Shi, Jingchu Gai, Eric Mazumdar, Yuejie Chi, Adam Wierman
A notorious yet open challenge is whether robust Markov games (RMGs) can escape the curse of multiagency, where the sample complexity scales exponentially with the number of agents.
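To see where the exponential dependence originates (an illustrative count, not the paper's analysis): the joint action space of an $n$-player game with $A_i$ actions per player has size
$$ \prod_{i=1}^{n} A_i \;=\; A^n \quad \text{when each agent has } A \text{ actions,} $$
so any method that must estimate a quantity for every joint action pays a sample cost exponential in $n$.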
no code implementations • 15 Jul 2024 • Haohong Lin, Wenhao Ding, Jian Chen, Laixi Shi, Jiacheng Zhu, Bo Li, Ding Zhao
Offline model-based reinforcement learning (MBRL) enhances data efficiency by utilizing pre-collected datasets to learn models and policies, especially in scenarios where exploration is costly or infeasible.
no code implementations • 22 Jun 2024 • Zhengfei Zhang, Kishan Panaganti, Laixi Shi, Yanan Sui, Adam Wierman, Yisong Yue
We study the problem of Distributionally Robust Constrained RL (DRC-RL), where the goal is to maximize the expected reward subject to environmental distribution shifts and constraints.
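A minimal way to formalize this objective (notation ours, not necessarily the paper's) is a robust constrained program over an uncertainty set $\mathcal{U}$ of transition kernels:
$$ \max_{\pi} \ \min_{P \in \mathcal{U}} V_P^{\pi}(r) \quad \text{s.t.} \quad \min_{P \in \mathcal{U}} V_P^{\pi}(c) \ \ge\ b, $$
where $V_P^{\pi}(r)$ and $V_P^{\pi}(c)$ denote the expected cumulative reward and constraint value of policy $\pi$ under kernel $P$, and $b$ is the constraint threshold.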
no code implementations • 20 Jun 2024 • Eric Mazumdar, Kishan Panaganti, Laixi Shi
To overcome this obstacle, we take inspiration from behavioral economics and show that, by imbuing agents with important features of human decision-making such as risk aversion and bounded rationality, a class of risk-averse quantal response equilibria (RQE) becomes tractable to compute in all $n$-player matrix and finite-horizon Markov games.
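For reference, the bounded-rationality ingredient is typically captured by a logit quantal response (the standard form, not necessarily the exact definition used in the paper), under which player $i$ mixes over actions according to
$$ \pi_i(a) \;=\; \frac{\exp\big(u_i(a)/\lambda_i\big)}{\sum_{a'} \exp\big(u_i(a')/\lambda_i\big)}, $$
where $u_i$ is the player's payoff and $\lambda_i > 0$ a rationality temperature; RQE additionally evaluates payoffs through a risk-averse measure rather than the plain expectation.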
no code implementations • 31 May 2024 • Shangding Gu, Laixi Shi, Yuhao Ding, Alois Knoll, Costas Spanos, Adam Wierman, Ming Jin
Safe reinforcement learning (RL) is crucial for deploying RL agents in real-world applications, as it aims to maximize long-term rewards while satisfying safety constraints.
no code implementations • 29 Apr 2024 • Laixi Shi, Eric Mazumdar, Yuejie Chi, Adam Wierman
To overcome the sim-to-real gap in reinforcement learning (RL), learned policies must maintain robustness against environmental uncertainties.
no code implementations • 19 Mar 2024 • He Wang, Laixi Shi, Yuejie Chi
In offline reinforcement learning (RL), the absence of active exploration calls attention to model robustness for tackling the sim-to-real gap, where discrepancies between the simulated and deployed environments can significantly undermine the performance of the learned policy.
no code implementations • 8 Feb 2024 • Jiin Woo, Laixi Shi, Gauri Joshi, Yuejie Chi
Our sample complexity analysis reveals that, with appropriately chosen parameters and synchronization schedules, FedLCB-Q achieves linear speedup in the number of agents without requiring high-quality datasets at individual agents, as long as the local datasets collectively cover the state-action space visited by the optimal policy. This highlights the power of collaboration in the federated setting.
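A schematic of the synchronization pattern (our sketch; the exact update rules are in the paper): each agent $k$ runs pessimistic local Q-learning on its own dataset, and a server periodically averages the iterates,
$$ Q_k(s,a) \leftarrow (1-\eta)\,Q_k(s,a) + \eta\big(r + \gamma \max_{a'} Q_k(s',a')\big) - b(s,a), \qquad Q \leftarrow \frac{1}{K}\sum_{k=1}^{K} Q_k \ \ \text{every $\tau$ local steps}, $$
where $b(s,a)$ is a lower-confidence-bound penalty enforcing pessimism, $K$ is the number of agents, and $\tau$ is the synchronization period.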
no code implementations • 25 Jul 2023 • Laixi Shi, Robert Dadashi, Yuejie Chi, Pablo Samuel Castro, Matthieu Geist
In this work, we propose to regularize towards the Q-function of the behavior policy instead of the behavior policy itself, under the premise that the Q-function can be estimated more reliably and easily by a SARSA-style estimate, and that it handles extrapolation error more directly.
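Concretely, a two-step sketch consistent with this description (notation ours): first estimate the behavior Q-function $\widehat{Q}^{\beta}$ with a SARSA-style update on dataset transitions $(s,a,r,s',a')$, then penalize deviations from it when training the learned Q-function,
$$ \widehat{Q}^{\beta}(s,a) \leftarrow \widehat{Q}^{\beta}(s,a) + \eta\big(r + \gamma\,\widehat{Q}^{\beta}(s',a') - \widehat{Q}^{\beta}(s,a)\big), \qquad \min_{Q}\ \mathcal{L}_{\mathrm{TD}}(Q) + \alpha\,\mathbb{E}\big[(Q - \widehat{Q}^{\beta})^2\big], $$
with $\alpha$ trading off the Bellman error against staying close to the behavior Q-function.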
no code implementations • NeurIPS 2023 • Laixi Shi, Gen Li, Yuting Wei, Yuxin Chen, Matthieu Geist, Yuejie Chi
Assuming access to a generative model that draws samples based on the nominal MDP, we characterize the sample complexity of RMDPs when the uncertainty set is specified via either the total variation (TV) distance or $\chi^2$ divergence.
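In the standard $(s,a)$-rectangular formulation of such sets, each row of the transition kernel may move within radius $\sigma$ of the nominal kernel $P^0$:
$$ \tfrac{1}{2}\big\|P(\cdot\mid s,a) - P^0(\cdot\mid s,a)\big\|_1 \le \sigma \ \ \text{(TV)}, \qquad \sum_{s'} \frac{\big(P(s'\mid s,a) - P^0(s'\mid s,a)\big)^2}{P^0(s'\mid s,a)} \le \sigma \ \ (\chi^2), $$
and the robust value of a policy is its worst-case value over all kernels in the set.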
1 code implementation • 18 Oct 2022 • Peide Huang, Mengdi Xu, Jiacheng Zhu, Laixi Shi, Fei Fang, Ding Zhao
Curriculum Reinforcement Learning (CRL) aims to create a sequence of tasks, starting from easy ones and gradually progressing toward difficult ones.
no code implementations • 11 Aug 2022 • Laixi Shi, Yuejie Chi
This paper concerns the central issues of model robustness and sample efficiency in offline reinforcement learning (RL), which aims to learn to perform decision making from historical data without active exploration.
no code implementations • 11 Apr 2022 • Gen Li, Laixi Shi, Yuxin Chen, Yuejie Chi, Yuting Wei
We demonstrate that the model-based (or "plug-in") approach achieves minimax-optimal sample complexity without burn-in cost for tabular Markov decision processes (MDPs).
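The plug-in approach itself is simple to state: build the empirical transition kernel from counts and plan in the resulting model. With $N(s,a)$ visits to $(s,a)$ and $N(s,a,s')$ observed transitions,
$$ \widehat{P}(s' \mid s, a) \;=\; \frac{N(s,a,s')}{N(s,a)}, $$
and the output policy is an optimal policy of the empirical MDP $(\mathcal{S}, \mathcal{A}, \widehat{P}, r, \gamma)$.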
no code implementations • 28 Feb 2022 • Laixi Shi, Gen Li, Yuting Wei, Yuxin Chen, Yuejie Chi
Offline or batch reinforcement learning seeks to learn a near-optimal policy using historical data without active exploration of the environment.
no code implementations • NeurIPS 2021 • Gen Li, Laixi Shi, Yuxin Chen, Yuejie Chi
Achieving sample efficiency in online episodic reinforcement learning (RL) requires optimally balancing exploration and exploitation.
no code implementations • 25 Nov 2019 • Laixi Shi, Yuejie Chi
Multi-channel sparse blind deconvolution, or convolutional sparse coding, refers to the problem of learning an unknown filter by observing its circulant convolutions with multiple input signals that are sparse.
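In symbols, the observation model just described is
$$ y_i \;=\; a \circledast x_i, \qquad i = 1, \ldots, N, $$
where $a$ is the unknown filter, $\circledast$ denotes circular convolution, and each input $x_i$ is sparse; the goal is to recover $a$ (and the $x_i$) from the observations $\{y_i\}$ alone.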
no code implementations • 1 Dec 2017 • Yu Sang, Laixi Shi, Yimin Liu
In this paper, we propose a system and accompanying methods for micro hand gesture recognition using ultrasonic active sensing.