Search Results for author: Gellert Weisz

Found 4 papers, 0 papers with code

Learning with Good Feature Representations in Bandits and in RL with a Generative Model

no code implementations ICML 2020 Tor Lattimore, Csaba Szepesvari, Gellert Weisz

The construction by Du et al. (2019) implies that even if a learner is given linear features in $\mathbb{R}^d$ that approximate the rewards in a bandit with a uniform error of $\epsilon$, searching for an action that is optimal up to $O(\epsilon)$ requires examining essentially all actions.

Exploration-Enhanced POLITEX

no code implementations 27 Aug 2019 Yasin Abbasi-Yadkori, Nevena Lazic, Csaba Szepesvari, Gellert Weisz

POLITEX has sublinear regret guarantees in uniformly-mixing MDPs when the value estimation error can be controlled, which can be satisfied if all policies sufficiently explore the environment.

LeapsAndBounds: A Method for Approximately Optimal Algorithm Configuration

no code implementations ICML 2018 Gellert Weisz, Andras Gyorgy, Csaba Szepesvari

We consider the problem of configuring general-purpose solvers to run efficiently on problem instances drawn from an unknown distribution.
