Search Results for author: Yulian Wu

Better-than-KL PAC-Bayes Bounds

In this paper, we consider the problem of proving concentration inequalities to estimate the mean of the sequence.

Under each framework, we consider both joint differential privacy (JDP) and local differential privacy (LDP) models.

This improvement is key to the significant regret improvement in quantum reinforcement learning.

In this paper, we study multi-armed bandits (MAB) and stochastic linear bandits (SLB) with heavy-tailed rewards and quantum reward oracle.

Finally, we establish the lower bound to show that the instance-dependent regret of our improved algorithm is optimal.
