no code implementations • 16 Mar 2024 • Ziping Xu, Kelly W. Zhang, Susan A. Murphy
In the realm of Reinforcement Learning (RL), online RL is often conceptualized as an optimization problem, where an algorithm interacts with an unknown environment to minimize cumulative regret.
no code implementations • 7 Mar 2024 • Kevin Tan, Ziping Xu
Hybrid Reinforcement Learning (RL), leveraging both online and offline data, has garnered recent interest, yet research on its provable benefits remains sparse.
no code implementations • 3 Mar 2024 • Ziping Xu, Zifan Xu, Runxuan Jiang, Peter Stone, Ambuj Tewari
Multitask Reinforcement Learning (MTRL) approaches have gained increasing attention for their wide applications in many important Reinforcement Learning (RL) tasks.
no code implementations • 26 Jul 2023 • Yongyi Guo, Ziping Xu, Susan Murphy
When the context error is non-vanishing, classical bandit algorithms fail to achieve sublinear regret.
no code implementations • 30 May 2022 • Ziping Xu, Eunjae Shim, Ambuj Tewari, Paul Zimmerman
Starting with a large unlabeled dataset, algorithms for ASD adaptively label the points with the goal of maximizing the sum of responses.
no code implementations • 13 Nov 2021 • Ziping Xu, Ambuj Tewari
For both settings, we derive the minimax rates for CL with the oracle that provides the optimal curriculum and without the oracle, where the agent has to adaptively learn a good curriculum.
no code implementations • 10 Aug 2021 • Yangyi Lu, Ziping Xu, Ambuj Tewari
However, the modern precision medicine movement has been enabled by a confluence of events: scientific advances in fields such as genetics and pharmacology, technological advances in mobile devices and wearable sensors, and methodological advances in computing and data sciences.
no code implementations • 9 Jul 2021 • Hao Sun, Ziping Xu, Meng Fang, Zhenghao Peng, Jiadong Guo, Bo Dai, Bolei Zhou
Safe exploration is crucial for the real-world application of reinforcement learning (RL).
no code implementations • NeurIPS 2021 • Ziping Xu, Ambuj Tewari
This motivates us to ask whether diversity can be achieved when source tasks and the target task use different prediction function spaces beyond linear functions.
no code implementations • 1 Jan 2021 • Hao Sun, Ziping Xu, Meng Fang, Yuhang Song, Jiechao Xiong, Bo Dai, Zhengyou Zhang, Bolei Zhou
Despite the remarkable progress made by the policy gradient algorithms in reinforcement learning (RL), sub-optimal policies usually result from the local exploration property of the policy gradient update.
no code implementations • 15 Oct 2020 • Ziping Xu, Amirhossein Meisami, Ambuj Tewari
We analyze both the prediction error and the regret of our algorithms.
2 code implementations • NeurIPS 2020 • Tarun Gogineni, Ziping Xu, Exequiel Punzalan, Runxuan Jiang, Joshua Kammeraad, Ambuj Tewari, Paul Zimmerman
Molecular geometry prediction of flexible molecules, or conformer search, is a long-standing challenge in computational chemistry.
no code implementations • 11 Jun 2020 • Hao Sun, Ziping Xu, Yuhang Song, Meng Fang, Jiechao Xiong, Bo Dai, Bolei Zhou
However, PG algorithms rely on exploiting the value function being learned with the first-order update locally, which results in limited sample efficiency.
no code implementations • NeurIPS 2020 • Ziping Xu, Ambuj Tewari
We study reinforcement learning in non-episodic factored Markov decision processes (FMDPs).
no code implementations • Proceedings of the Royal Society A 2017 • Shuyi Zhang, Bin Guo, Anlan Dong, Jing He, Ziping Xu, Song Xi Chen
While this statistic offered some relief for the inhabitants of the capital, we present several analyses based on Beijing's PM2.5 data of the past 4 years at 36 monitoring sites along with meteorological data of the past 7 years.