Search Results for author: Yaqi Duan

Found 14 papers, 2 papers with code

Taming "data-hungry" reinforcement learning? Stability in continuous state-action spaces

no code implementations • 10 Jan 2024 • Yaqi Duan, Martin J. Wainwright

We introduce a novel framework for analyzing reinforcement learning (RL) in continuous state-action spaces, and use it to prove fast rates of convergence in both off-line and on-line settings.

reinforcement-learning, Reinforcement Learning (RL), +1

Policy evaluation from a single path: Multi-step methods, mixing and mis-specification

no code implementations • 7 Nov 2022 • Yaqi Duan, Martin J. Wainwright

For a given TD method applied to a well-specified model, its statistical error under trajectory data is similar to that of i.i.d.

Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism

no code implementations • 11 Mar 2022 • Ming Yin, Yaqi Duan, Mengdi Wang, Yu-Xiang Wang

However, a precise understanding of the statistical limits with function representations remains elusive, even when such a representation is linear.

Decision Making, reinforcement-learning, +1

Adaptive and Robust Multi-Task Learning

1 code implementation • 10 Feb 2022 • Yaqi Duan, Kaizheng Wang

We study the multi-task learning problem that aims to simultaneously analyze multiple datasets collected from different sources and learn one model for each of them.

Multi-Task Learning

Optimal policy evaluation using kernel-based temporal difference methods

no code implementations • 24 Sep 2021 • Yaqi Duan, Mengdi Wang, Martin J. Wainwright

Whereas existing worst-case theory predicts cubic scaling ($H^3$) in the effective horizon, our theory reveals that there is in fact a much wider range of scalings, depending on the kernel, the stationary distribution, and the variance of the Bellman residual error.
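The kernel-based evaluation setup above can be made concrete with a standard kernel LSTD(0) estimator — a generic sketch under a Gaussian kernel, not the paper's exact estimator; all names, hyperparameters, and the toy data below are illustrative assumptions:

```python
import numpy as np

def kernel_lstd(S, S_next, r, gamma=0.9, bandwidth=1.0, reg=1e-3):
    """Generic kernel LSTD(0) sketch: solve (K - gamma*K_next + reg*I) a = r,
    then estimate V(s) = sum_j a_j k(s, s_j) with a Gaussian kernel."""
    def gram(X, Y):
        # pairwise squared distances, then Gaussian kernel
        d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * bandwidth ** 2))
    K = gram(S, S)            # kernel matrix over current states
    K_next = gram(S_next, S)  # cross-kernel with next states
    a = np.linalg.solve(K - gamma * K_next + reg * np.eye(len(S)), r)
    return lambda x: gram(np.atleast_2d(x), S) @ a

# toy trajectory data: 50 transitions in a 2-d state space
rng = np.random.default_rng(3)
S = rng.normal(size=(50, 2))
S_next = S + 0.1 * rng.normal(size=(50, 2))
r = np.ones(50)               # constant reward
V = kernel_lstd(S, S_next, r)
```

The choice of kernel and regularization here is what the paper's horizon-dependent scalings would interact with; this sketch fixes both arbitrarily.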

PU-Flow: a Point Cloud Upsampling Network with Normalizing Flows

1 code implementation • 13 Jul 2021 • Aihua Mao, Zihui Du, Junhui Hou, Yaqi Duan, Yong-Jin Liu, Ying He

Point cloud upsampling aims to generate dense point clouds from given sparse ones, which is a challenging task due to the irregular and unordered nature of point sets.

point cloud upsampling

Learning Good State and Action Representations via Tensor Decomposition

no code implementations • 3 May 2021 • Chengzhuo Ni, Yaqi Duan, Munther Dahleh, Anru Zhang, Mengdi Wang

The transition kernel of a continuous-state-action Markov decision process (MDP) admits a natural tensor structure.

Tensor Decomposition
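The low-rank tensor structure alluded to above can be probed with a minimal higher-order SVD: unfold the empirical transition tensor along each mode and keep the top singular vectors. This is an illustrative sketch of the general technique, not the authors' algorithm; the names and toy tensor are hypothetical:

```python
import numpy as np

def hosvd_factors(T, ranks):
    """Minimal HOSVD sketch: for each mode, flatten the tensor with that
    mode first and take the leading left singular vectors as the factor."""
    factors = []
    for mode, rank in enumerate(ranks):
        unfolding = np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)
        U, _, _ = np.linalg.svd(unfolding, full_matrices=False)
        factors.append(U[:, :rank])
    return factors

# toy transition tensor P(s' | s, a) of shape (states, actions, next states)
rng = np.random.default_rng(2)
T = rng.random((5, 3, 5))
T /= T.sum(axis=2, keepdims=True)   # normalize over next states
F = hosvd_factors(T, (2, 2, 2))     # low-dimensional state/action factors
```

The state and action factors returned here play the role of the learned representations; a real method would also estimate the core tensor and choose the ranks from data.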

Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning

no code implementations • 25 Mar 2021 • Yaqi Duan, Chi Jin, Zhiyuan Li

Concretely, we view the Bellman error as a surrogate loss for the optimality gap, and prove the following: (1) in the double-sampling regime, the excess risk of the empirical risk minimizer (ERM) is bounded by the Rademacher complexity of the function class.

Learning Theory, reinforcement-learning, +1

Bootstrapping Fitted Q-Evaluation for Off-Policy Inference

no code implementations • 6 Feb 2021 • Botao Hao, Xiang Ji, Yaqi Duan, Hao Lu, Csaba Szepesvári, Mengdi Wang

Bootstrapping provides a flexible and effective approach for assessing the quality of batch reinforcement learning, yet its theoretical properties are less understood.

Off-policy evaluation

Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient

no code implementations • 8 Nov 2020 • Botao Hao, Yaqi Duan, Tor Lattimore, Csaba Szepesvári, Mengdi Wang

To evaluate a new target policy, we analyze a Lasso fitted Q-evaluation method and establish a finite-sample error bound that has no polynomial dependence on the ambient dimension.

feature selection, Model Selection, +2
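One plausible reading of a Lasso fitted Q-evaluation step — L1-penalized regression onto the Bellman target at each iteration — can be sketched as follows. This is illustrative only: the hyperparameters, feature setup, and toy data are assumptions, not the authors' exact procedure:

```python
import numpy as np
from sklearn.linear_model import Lasso

def lasso_fqe(phi, rewards, phi_next, gamma=0.9, n_iters=50, alpha=0.05):
    """Sketch of Lasso fitted Q-evaluation: repeatedly regress the Bellman
    target r + gamma * Q(s') onto the features with an L1 penalty, so the
    fit concentrates on the few relevant coordinates."""
    w = np.zeros(phi.shape[1])
    for _ in range(n_iters):
        target = rewards + gamma * phi_next @ w   # Bellman backup
        model = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
        model.fit(phi, target)
        w = model.coef_
    return w

# toy data: 200 transitions, 10 ambient features, only 2 carrying signal
rng = np.random.default_rng(0)
phi = rng.normal(size=(200, 10))
w_true = np.zeros(10)
w_true[:2] = [1.0, -0.5]
rewards = 0.1 * phi @ w_true + rng.normal(scale=0.01, size=200)
phi_next = rng.normal(size=(200, 10))
w_hat = lasso_fqe(phi, rewards, phi_next)
```

The L1 penalty is what removes the polynomial dependence on the ambient dimension in the paper's bound; ordinary least squares in the same loop would not have that property.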

Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation

no code implementations • ICML 2020 • Yaqi Duan, Mengdi Wang

We prove that this method is information-theoretically optimal and has nearly minimal estimation error.

Off-policy evaluation

Learning low-dimensional state embeddings and metastable clusters from time series data

no code implementations • NeurIPS 2019 • Yifan Sun, Yaqi Duan, Hao Gong, Mengdi Wang

This paper studies how to find compact state embeddings from high-dimensional Markov state trajectories, where the transition kernel has a small intrinsic rank.

Clustering, Time Series, +1

State Aggregation Learning from Markov Transition Data

no code implementations • NeurIPS 2019 • Yaqi Duan, Zheng Tracy Ke, Mengdi Wang

Our proposed method is a simple two-step algorithm: the first step is a spectral decomposition of the empirical transition matrix, and the second step applies a linear transformation to the singular vectors to find their approximate convex hull.
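The spectral first step of the two-step procedure described above can be sketched as a truncated SVD of the empirical transition matrix; the convex-hull step is omitted. Names and the toy chain are illustrative, not the authors' exact algorithm:

```python
import numpy as np

def spectral_step(P_hat, rank):
    """Step 1 sketch: rank-truncated SVD of the empirical transition
    matrix, returning left and right low-rank factors."""
    U, s, Vt = np.linalg.svd(P_hat)
    return U[:, :rank] * s[:rank], Vt[:rank, :]

# toy 6-state chain whose transition matrix has rank 2 by construction:
# soft memberships A (states -> 2 aggregates) and aggregate dynamics B
rng = np.random.default_rng(1)
A = rng.random((6, 2))
A /= A.sum(axis=1, keepdims=True)
B = rng.random((2, 6))
B /= B.sum(axis=1, keepdims=True)
P = A @ B                       # row-stochastic, rank 2
left, right = spectral_step(P, rank=2)
```

Because the toy matrix is exactly rank 2, the rank-2 factors reconstruct it exactly; with an empirical matrix from finite data, the truncation instead denoises before the convex-hull step.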

Adaptive Low-Nonnegative-Rank Approximation for State Aggregation of Markov Chains

no code implementations • 14 Oct 2018 • Yaqi Duan, Mengdi Wang, Zaiwen Wen, Yaxiang Yuan

The efficiency and statistical properties of our approach are illustrated on synthetic data.
