Search Results for author: Jiatai Huang

Found 6 papers, 1 paper with code

Queue Scheduling with Adversarial Bandit Learning

no code implementations • 3 Mar 2023 • Jiatai Huang, Leana Golubchik, Longbo Huang

In this paper, we study scheduling of a queueing system with zero knowledge of instantaneous network conditions.

Multi-Armed Bandits, Scheduling

Banker Online Mirror Descent: A Universal Approach for Delayed Online Bandit Learning

no code implementations • 25 Jan 2023 • Jiatai Huang, Yan Dai, Longbo Huang

\texttt{Banker-OMD} leads to the first delayed scale-free adversarial MAB algorithm achieving $\widetilde{\mathcal O}(\sqrt{K}L(\sqrt T+\sqrt D))$ regret and the first delayed adversarial linear bandit algorithm achieving $\widetilde{\mathcal O}(\text{poly}(n)(\sqrt{T} + \sqrt{D}))$ regret.

Multi-Armed Bandits

Adaptive Best-of-Both-Worlds Algorithm for Heavy-Tailed Multi-Armed Bandits

no code implementations • 28 Jan 2022 • Jiatai Huang, Yan Dai, Longbo Huang

Specifically, we design an algorithm \texttt{HTINF}: when the heavy-tail parameters $\alpha$ and $\sigma$ are known to the agent, \texttt{HTINF} simultaneously achieves the optimal regret in both stochastic and adversarial environments, without knowing the actual environment type a priori.

Multi-Armed Bandits

Scale-Free Adversarial Multi-Armed Bandit with Arbitrary Feedback Delays

no code implementations • 26 Oct 2021 • Jiatai Huang, Yan Dai, Longbo Huang

We consider the Scale-Free Adversarial Multi-Armed Bandit (MAB) problem with unrestricted feedback delays.

Banker Online Mirror Descent

no code implementations • 16 Jun 2021 • Jiatai Huang, Longbo Huang

In particular, it leads to the first delayed adversarial linear bandit algorithm achieving $\tilde{O}(\text{poly}(n)(\sqrt{T} + \sqrt{D}))$ regret.

Multi-Armed Bandits
