no code implementations • 20 Feb 2024 • Junyan Liu, Yunfan Li, Lin Yang
This paper introduces a stronger performance measure, the uniform last-iterate (ULI) guarantee, capturing both cumulative and instantaneous performance of bandit algorithms.
no code implementations • 8 Jun 2021 • Junyan Liu, Shuai Li, Dapeng Li
Our algorithm achieves near-optimal regret in the stochastic setting and, in the corrupted setting, a regret bound with an additive corruption term, all while maintaining efficient communication.