Search Results for author: Zixin Zhong

Found 9 papers, 4 papers with code

Stochastic Gradient Succeeds for Bandits

no code implementations • 27 Feb 2024 • Jincheng Mei, Zixin Zhong, Bo Dai, Alekh Agarwal, Csaba Szepesvari, Dale Schuurmans

We show that the stochastic gradient bandit algorithm converges to a globally optimal policy at an $O(1/t)$ rate, even with a constant step size.

Probably Anytime-Safe Stochastic Combinatorial Semi-Bandits

1 code implementation • 31 Jan 2023 • Yunlong Hou, Vincent Y. F. Tan, Zixin Zhong

Under this constraint, we design and analyze an algorithm PASCombUCB that minimizes the regret over the horizon of time $T$.

Recommendation Systems

Fast Beam Alignment via Pure Exploration in Multi-armed Bandits

1 code implementation • 23 Oct 2022 • Yi Wei, Zixin Zhong, Vincent Y. F. Tan

The beam alignment (BA) problem consists in accurately aligning the transmitter and receiver beams to establish a reliable communication link in wireless communication systems.

Multi-Armed Bandits

Optimal Clustering with Bandit Feedback

no code implementations • 9 Feb 2022 • Junwen Yang, Zixin Zhong, Vincent Y. F. Tan

This paper considers the problem of online clustering with bandit feedback.

Clustering, Online Clustering

Almost Optimal Variance-Constrained Best Arm Identification

1 code implementation • 25 Jan 2022 • Yunlong Hou, Vincent Y. F. Tan, Zixin Zhong

We design and analyze VA-LUCB, a parameter-free algorithm, for identifying the best arm under the fixed-confidence setup and under a stringent constraint that the variance of the chosen arm is strictly smaller than a given threshold.

Achieving the Pareto Frontier of Regret Minimization and Best Arm Identification in Multi-Armed Bandits

no code implementations • 16 Oct 2021 • Zixin Zhong, Wang Chi Cheung, Vincent Y. F. Tan

We study the Pareto frontier of two archetypal objectives in multi-armed bandits, namely, regret minimization (RM) and best arm identification (BAI) with a fixed horizon.

Multi-Armed Bandits

Probabilistic Sequential Shrinking: A Best Arm Identification Algorithm for Stochastic Bandits with Corruptions

1 code implementation • 15 Oct 2020 • Zixin Zhong, Wang Chi Cheung, Vincent Y. F. Tan

When the amount of corruptions per step (CPS) is below a threshold, PSS($u$) identifies the best arm or item with probability tending to $1$ as $T\rightarrow \infty$.

Best Arm Identification for Cascading Bandits in the Fixed Confidence Setting

no code implementations • ICML 2020 • Zixin Zhong, Wang Chi Cheung, Vincent Y. F. Tan

Finally, extensive numerical simulations corroborate the efficacy of CascadeBAI as well as the tightness of our upper bound on its time complexity.

Thompson Sampling Algorithms for Cascading Bandits

no code implementations • 2 Oct 2018 • Zixin Zhong, Wang Chi Cheung, Vincent Y. F. Tan

While Thompson sampling (TS) algorithms have been shown to be empirically superior to Upper Confidence Bound (UCB) algorithms for cascading bandits, theoretical guarantees are only known for the latter.

Efficient Exploration, Multi-Armed Bandits
