no code implementations • NeurIPS 2020 • Gellert Weisz, András György, Wei-I Lin, Devon Graham, Kevin Leyton-Brown, Csaba Szepesvari, Brendan Lucier
Algorithm configuration procedures optimize parameters of a given algorithm to perform well over a distribution of inputs.
no code implementations • ICML 2020 • Tor Lattimore, Csaba Szepesvari, Gellert Weisz
The construction by Du et al. (2019) implies that even if a learner is given linear features in $\mathbb R^d$ that approximate the rewards in a bandit with a uniform error of $\epsilon$, searching for an action that is optimal up to $O(\epsilon)$ requires examining essentially all actions.
no code implementations • 27 Aug 2019 • Yasin Abbasi-Yadkori, Nevena Lazic, Csaba Szepesvari, Gellert Weisz
POLITEX has sublinear regret guarantees in uniformly mixing MDPs when the value estimation error can be controlled, a condition that is satisfied if all policies sufficiently explore the environment.
no code implementations • ICML 2018 • Gellert Weisz, András György, Csaba Szepesvari
We consider the problem of configuring general-purpose solvers to run efficiently on problem instances drawn from an unknown distribution.