Search Results for author: Ziping Xu

Found 15 papers, 1 paper with code

The Fallacy of Minimizing Local Regret in the Sequential Task Setting

no code implementations • 16 Mar 2024 • Ziping Xu, Kelly W. Zhang, Susan A. Murphy

In the realm of Reinforcement Learning (RL), online RL is often conceptualized as an optimization problem, where an algorithm interacts with an unknown environment to minimize cumulative regret.

Reinforcement Learning (RL)

A Natural Extension To Online Algorithms For Hybrid RL With Limited Coverage

no code implementations • 7 Mar 2024 • Kevin Tan, Ziping Xu

Hybrid Reinforcement Learning (RL), leveraging both online and offline data, has garnered recent interest, yet research on its provable benefits remains sparse.

Efficient Exploration, Reinforcement Learning (RL)

Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks

no code implementations • 3 Mar 2024 • Ziping Xu, Zifan Xu, Runxuan Jiang, Peter Stone, Ambuj Tewari

Multitask Reinforcement Learning (MTRL) approaches have gained increasing attention for their wide applications in many important Reinforcement Learning (RL) tasks.

Reinforcement Learning (RL)

Online learning in bandits with predicted context

no code implementations • 26 Jul 2023 • Yongyi Guo, Ziping Xu, Susan Murphy

When the context error is non-vanishing, classical bandit algorithms fail to achieve sublinear regret.

Decision Making

Adaptive Sampling for Discovery

no code implementations • 30 May 2022 • Ziping Xu, Eunjae Shim, Ambuj Tewari, Paul Zimmerman

Starting with a large unlabeled dataset, algorithms for Adaptive Sampling for Discovery (ASD) adaptively label the points with the goal of maximizing the sum of responses.

Decision Making, Drug Discovery

On the Statistical Benefits of Curriculum Learning

no code implementations • 13 Nov 2021 • Ziping Xu, Ambuj Tewari

For both settings, we derive the minimax rates for curriculum learning (CL) with the oracle that provides the optimal curriculum and without the oracle, where the agent has to adaptively learn a good curriculum.

Bandit Algorithms for Precision Medicine

no code implementations • 10 Aug 2021 • Yangyi Lu, Ziping Xu, Ambuj Tewari

However, the modern precision medicine movement has been enabled by a confluence of events: scientific advances in fields such as genetics and pharmacology, technological advances in mobile devices and wearable sensors, and methodological advances in computing and data sciences.

Safe Exploration by Solving Early Terminated MDP

no code implementations • 9 Jul 2021 • Hao Sun, Ziping Xu, Meng Fang, Zhenghao Peng, Jiadong Guo, Bo Dai, Bolei Zhou

Safe exploration is crucial for the real-world application of reinforcement learning (RL).

Reinforcement Learning (RL), Safe Exploration

Representation Learning Beyond Linear Prediction Functions

no code implementations • NeurIPS 2021 • Ziping Xu, Ambuj Tewari

This motivates us to ask whether diversity can be achieved when source tasks and the target task use different prediction function spaces beyond linear functions.

Representation Learning

Self-Supervised Continuous Control without Policy Gradient

no code implementations • 1 Jan 2021 • Hao Sun, Ziping Xu, Meng Fang, Yuhang Song, Jiechao Xiong, Bo Dai, Zhengyou Zhang, Bolei Zhou

Despite the remarkable progress made by the policy gradient algorithms in reinforcement learning (RL), sub-optimal policies usually result from the local exploration property of the policy gradient update.

Continuous Control, Policy Gradient Methods, +3

Decision Making Problems with Funnel Structure: A Multi-Task Learning Approach with Application to Email Marketing Campaigns

no code implementations • 15 Oct 2020 • Ziping Xu, Amirhossein Meisami, Ambuj Tewari

We analyze both the prediction error and the regret of our algorithms.

Decision Making, Marketing, +1

TorsionNet: A Reinforcement Learning Approach to Sequential Conformer Search

2 code implementations • NeurIPS 2020 • Tarun Gogineni, Ziping Xu, Exequiel Punzalan, Runxuan Jiang, Joshua Kammeraad, Ambuj Tewari, Paul Zimmerman

Molecular geometry prediction of flexible molecules, or conformer search, is a long-standing challenge in computational chemistry.

Reinforcement Learning (RL)

Zeroth-Order Supervised Policy Improvement

no code implementations • 11 Jun 2020 • Hao Sun, Ziping Xu, Yuhang Song, Meng Fang, Jiechao Xiong, Bo Dai, Bolei Zhou

However, PG algorithms rely on exploiting the value function being learned with the first-order update locally, which results in limited sample efficiency.

Continuous Control, Policy Gradient Methods, +2

Reinforcement Learning in Factored MDPs: Oracle-Efficient Algorithms and Tighter Regret Bounds for the Non-Episodic Setting

no code implementations • NeurIPS 2020 • Ziping Xu, Ambuj Tewari

We study reinforcement learning in non-episodic factored Markov decision processes (FMDPs).

Reinforcement Learning (RL)

Cautionary Tales on Air-Quality Improvement in Beijing

no code implementations • Proceedings of the Royal Society A 2017 • Shuyi Zhang, Bin Guo, Anlan Dong, Jing He, Ziping Xu, Song Xi Chen

While this statistic offered some relief for the inhabitants of the capital, we present several analyses based on Beijing's PM2.5 data of the past 4 years at 36 monitoring sites along with meteorological data of the past 7 years.
