Bandits Dueling on Partially Ordered Sets

NeurIPS 2017  ·  Julien Audiffren, Liva Ralaivola ·

We address the problem of dueling bandits defined on partially ordered sets, or posets. In this setting, arms may not be comparable, and there may be several (incomparable) optimal arms. We propose an algorithm, UnchainedBandits, that efficiently finds the set of optimal arms, or Pareto front, of any poset even when pairs of comparable arms cannot be a priori distinguished from pairs of incomparable arms, with a set of minimal assumptions. This means that UnchainedBandits does not require information about comparability and can be used with limited knowledge of the poset. To achieve this, the algorithm relies on the concept of decoys, which stems from social psychology. We also provide theoretical guarantees on both the regret incurred and the number of comparison required by UnchainedBandits, and we report compelling empirical results.

PDF Abstract
No code implementations yet. Submit your code now

Tasks


Datasets


Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here