no code implementations • 16 Nov 2022 • Khashayar Gatmiry, Thomas Kesselheim, Sahil Singla, Yifan Wang
The goal is to minimize the regret: the difference, over $T$ rounds, between the total value of the optimal algorithm that knows the distributions and the total value of our algorithm, which learns the distributions from partial feedback.
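As a rough formalization of the regret described here (the symbols $\mathrm{OPT}_t$ and $\mathrm{ALG}_t$ are illustrative names that do not appear in the abstract, denoting the value obtained in round $t$ by the optimal algorithm that knows the distributions and by the learning algorithm, respectively), one might write:

$$\mathrm{Regret}(T) = \mathbb{E}\Big[\sum_{t=1}^{T} \mathrm{OPT}_t\Big] - \mathbb{E}\Big[\sum_{t=1}^{T} \mathrm{ALG}_t\Big].$$

The expectations are an assumption of this sketch; the abstract only states that the regret compares total values over $T$ rounds.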
no code implementations • 14 Oct 2020 • Thomas Kesselheim, Sahil Singla
We study online learning with vector costs (OLVC$_p$) in both stochastic and adversarial arrival settings, and give a general procedure to reduce the problem from $d$ dimensions to a single dimension.