no code implementations • 19 Mar 2024 • He Wang, Laixi Shi, Yuejie Chi
In offline reinforcement learning (RL), the absence of active exploration calls attention to model robustness in order to tackle the sim-to-real gap, where the discrepancy between the simulated and deployed environments can significantly undermine the performance of the learned policy.
no code implementations • 8 Feb 2024 • Jiin Woo, Laixi Shi, Gauri Joshi, Yuejie Chi
Our sample complexity analysis reveals that, with appropriately chosen parameters and synchronization schedules, FedLCB-Q achieves linear speedup in the number of agents without requiring high-quality datasets at individual agents, as long as the local datasets collectively cover the state-action space visited by the optimal policy. This highlights the power of collaboration in the federated setting.
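As a rough illustration of the federated setting (not the paper's FedLCB-Q algorithm, which also incorporates pessimism via lower confidence bounds), the sketch below shows periodic averaging: each agent runs tabular Q-learning on its own offline dataset, and a server averages the local Q-tables at every synchronization round. The helper name and hyperparameters are hypothetical.

```python
import numpy as np

def federated_offline_q(datasets, n_states, n_actions, gamma=0.9,
                        rounds=20, local_steps=50, lr=0.1):
    """Periodic-averaging sketch: each agent runs Q-learning on its own
    offline dataset of (s, a, r, s') tuples; a server averages the
    local Q-tables once per round."""
    q = np.zeros((n_states, n_actions))
    for _ in range(rounds):
        local_qs = []
        for data in datasets:              # one offline dataset per agent
            lq = q.copy()
            for _ in range(local_steps):
                for s, a, r, s2 in data:
                    target = r + gamma * lq[s2].max()
                    lq[s, a] += lr * (target - lq[s, a])
            local_qs.append(lq)
        q = np.mean(local_qs, axis=0)      # synchronization step
    return q
```

Even when each local dataset is small, the averaged table can converge as long as the datasets jointly cover the relevant state-action pairs.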
no code implementations • 25 Jul 2023 • Laixi Shi, Robert Dadashi, Yuejie Chi, Pablo Samuel Castro, Matthieu Geist
In this work, we propose to regularize towards the Q-function of the behavior policy instead of the behavior policy itself, on the premise that the Q-function can be estimated more reliably and easily via a SARSA-style estimate, and handles the extrapolation error more straightforwardly.
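A SARSA-style estimate of the behavior policy's Q-function bootstraps on the action actually logged at the next step, rather than a greedy action, so it never queries out-of-distribution actions. A minimal tabular sketch (function name and hyperparameters are illustrative, not the paper's implementation):

```python
import numpy as np

def sarsa_q_estimate(transitions, n_states, n_actions,
                     gamma=0.9, lr=0.1, epochs=50):
    """Estimate the behavior policy's Q-function from logged
    (s, a, r, s', a') tuples with a SARSA-style update: the
    bootstrap target uses the *logged* next action a'."""
    q = np.zeros((n_states, n_actions))
    for _ in range(epochs):
        for s, a, r, s2, a2 in transitions:
            td_target = r + gamma * q[s2, a2]
            q[s, a] += lr * (td_target - q[s, a])
    return q
```

Because every (s', a') pair appears in the dataset, the estimate avoids the extrapolation error that a max-over-actions target would incur on unseen actions.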
no code implementations • NeurIPS 2023 • Laixi Shi, Gen Li, Yuting Wei, Yuxin Chen, Matthieu Geist, Yuejie Chi
Assuming access to a generative model that draws samples based on the nominal MDP, we characterize the sample complexity of RMDPs when the uncertainty set is specified via either the total variation (TV) distance or $\chi^2$ divergence.
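For intuition on the TV-distance uncertainty set: the worst-case expectation of a value function over all distributions within total-variation radius $\sigma$ of the nominal distribution is attained by moving up to $\sigma$ probability mass from the highest-value states onto the lowest-value state. A small sketch of that inner problem (illustrative only, restricted to distributions on the same support):

```python
import numpy as np

def worst_case_expectation_tv(p, v, sigma):
    """Worst-case expectation of values v over distributions within
    TV distance sigma of nominal p: the adversary shifts up to sigma
    probability mass from high-value states onto the lowest-value state."""
    v = np.asarray(v, dtype=float)
    p = np.asarray(p, dtype=float).copy()
    budget = sigma
    v_min_idx = np.argmin(v)
    for i in np.argsort(v)[::-1]:      # drain mass from high-value states first
        if i == v_min_idx or budget <= 0:
            continue
        move = min(p[i], budget)
        p[i] -= move
        p[v_min_idx] += move
        budget -= move
    return float(p @ v)
```

Plugging this worst-case backup into each step of value iteration yields a robust Bellman operator for the TV uncertainty set.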
1 code implementation • 18 Oct 2022 • Peide Huang, Mengdi Xu, Jiacheng Zhu, Laixi Shi, Fei Fang, Ding Zhao
Curriculum Reinforcement Learning (CRL) aims to create a sequence of tasks, starting from easy ones and progressing gradually towards difficult ones.
no code implementations • 11 Aug 2022 • Laixi Shi, Yuejie Chi
This paper concerns the central issues of model robustness and sample efficiency in offline reinforcement learning (RL), which aims to learn to perform decision making from historical data without active exploration.
no code implementations • 11 Apr 2022 • Gen Li, Laixi Shi, Yuxin Chen, Yuejie Chi, Yuting Wei
We demonstrate that the model-based (or "plug-in") approach achieves minimax-optimal sample complexity without burn-in cost for tabular Markov decision processes (MDPs).
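The model-based ("plug-in") approach is conceptually simple: draw samples from the generative model for every state-action pair, form the empirical transition kernel, and run value iteration on the resulting empirical MDP. A minimal sketch with hypothetical helper names:

```python
import numpy as np

def plug_in_value_iteration(sample_next_state, rewards, n_states, n_actions,
                            n_samples, gamma=0.9, iters=500, rng=None):
    """Plug-in approach under a generative model: estimate the empirical
    transition kernel from n_samples draws per (s, a), then run value
    iteration on the empirical MDP."""
    if rng is None:
        rng = np.random.default_rng(0)
    rewards = np.asarray(rewards, dtype=float)       # shape (S, A)
    p_hat = np.zeros((n_states, n_actions, n_states))
    for s in range(n_states):
        for a in range(n_actions):
            for _ in range(n_samples):
                p_hat[s, a, sample_next_state(s, a, rng)] += 1
    p_hat /= n_samples                               # empirical kernel
    v = np.zeros(n_states)
    for _ in range(iters):
        v = np.max(rewards + gamma * p_hat @ v, axis=1)
    return v
```

The sample-complexity question is how large `n_samples` must be, as a function of the state-action space size and the effective horizon, for the value of the resulting policy to be near-optimal.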
no code implementations • 28 Feb 2022 • Laixi Shi, Gen Li, Yuting Wei, Yuxin Chen, Yuejie Chi
Offline or batch reinforcement learning seeks to learn a near-optimal policy using historical data without active exploration of the environment.
no code implementations • NeurIPS 2021 • Gen Li, Laixi Shi, Yuxin Chen, Yuejie Chi
Achieving sample efficiency in online episodic reinforcement learning (RL) requires optimally balancing exploration and exploitation.
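A common device for this balance (a generic illustration, not this paper's specific algorithm) is an optimism bonus added to the value estimate: the bonus scales roughly like $c \cdot H \sqrt{1/n}$ for a state-action pair visited $n$ times over horizon $H$, so rarely visited pairs look attractive early on and the bonus vanishes as counts grow.

```python
import math

def ucb_bonus(count, horizon, c=1.0):
    """UCB-style optimism bonus, on the order of c * H * sqrt(1 / n):
    large for rarely visited state-action pairs (exploration),
    shrinking to zero as visit counts grow (exploitation)."""
    return c * horizon * math.sqrt(1.0 / max(count, 1))
```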
no code implementations • 25 Nov 2019 • Laixi Shi, Yuejie Chi
Multi-channel sparse blind deconvolution, or convolutional sparse coding, refers to the problem of learning an unknown filter by observing its circulant convolutions with multiple input signals that are sparse.
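The observation model can be simulated in a few lines: a circulant convolution is a pointwise product in the Fourier domain, and each channel observes the same unknown filter convolved with its own sparse input. A small sketch (variable names are illustrative):

```python
import numpy as np

def circulant_conv(filt, x):
    """Circular (circulant) convolution computed in the Fourier domain."""
    return np.real(np.fft.ifft(np.fft.fft(filt) * np.fft.fft(x)))

# Generate multi-channel observations y_i = filt (circ-conv) x_i,
# where each x_i is a sparse input signal.
rng = np.random.default_rng(0)
n, channels, sparsity = 64, 5, 4
filt = rng.standard_normal(n)              # the unknown filter
xs = []
for _ in range(channels):
    x = np.zeros(n)
    x[rng.choice(n, sparsity, replace=False)] = rng.standard_normal(sparsity)
    xs.append(x)
ys = [circulant_conv(filt, x) for x in xs]  # the observed channels
```

Blind deconvolution asks to recover `filt` (up to a shift and scaling ambiguity) from the observations `ys` alone.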
no code implementations • 1 Dec 2017 • Yu Sang, Laixi Shi, Yimin Liu
In this paper, we propose a micro hand gesture recognition system and methods using ultrasonic active sensing.