no code implementations • 5 Jul 2016 • Yahel David, Dotan Di Castro, Zohar Karnin
Our optimization problem is formulated as an MDP whose action space is combinatorial, since we recommend multiple items in each round.
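To illustrate why recommending several items per round makes the action space combinatorial, here is a minimal sketch. It assumes an action is an unordered slate of `slate_size` distinct items drawn from a catalog; the abstract does not specify this, and the names `slate_actions` and `catalog` are hypothetical.

```python
from itertools import combinations
from math import comb


def slate_actions(items, slate_size):
    """Enumerate every unordered slate of `slate_size` distinct items."""
    return list(combinations(items, slate_size))


# Hypothetical five-item catalog; each action recommends two items.
catalog = ["a", "b", "c", "d", "e"]
actions = slate_actions(catalog, 2)

# The number of slates is the binomial coefficient C(n, k).
assert len(actions) == comb(len(catalog), 2)

# Enumeration becomes impractical quickly: count the slates of 5 items
# from a 1000-item catalog without listing them.
print(len(actions), comb(1000, 5))
```

Under this assumption the number of actions grows as $\binom{n}{k}$, which is why exhaustive enumeration only works for very small catalogs.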
no code implementations • 23 Dec 2015 • Yahel David, Nahum Shimkin
Under the PAC framework, we provide a lower bound on the sample complexity of any $(\epsilon,\delta)$-correct algorithm, and propose an algorithm that attains this bound up to logarithmic factors.
no code implementations • 23 Aug 2015 • Yahel David, Nahum Shimkin
We consider the Max $K$-Armed Bandit problem, where a learning agent is faced with several sources (arms) of items (rewards) and is interested in finding the best item overall.
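The objective described above differs from the standard bandit setting: the agent is scored by the single best reward it ever observes, not by the cumulative reward. The sketch below shows that objective with a naive round-robin baseline. It is not the authors' algorithm, and the arm distributions, the budget, and the function name `uniform_max_search` are assumptions made for illustration.

```python
import random


def uniform_max_search(arms, budget, rng):
    """Pull arms round-robin and return (best reward seen, arm that gave it)."""
    best_value, best_arm = float("-inf"), None
    for t in range(budget):
        arm_index = t % len(arms)
        reward = arms[arm_index](rng)
        # Only the maximum matters, so keep the largest sample seen so far.
        if reward > best_value:
            best_value, best_arm = reward, arm_index
    return best_value, best_arm


# Hypothetical arms: each draws a reward from a different uniform range.
arms = [
    lambda rng: rng.uniform(0.0, 0.5),
    lambda rng: rng.uniform(0.0, 1.0),
    lambda rng: rng.uniform(0.0, 0.8),
]

best_value, best_arm = uniform_max_search(arms, budget=300, rng=random.Random(0))
print(best_arm, round(best_value, 3))
```

A round-robin baseline spends the same number of pulls on every arm; an adaptive strategy would instead shift pulls toward arms whose upper tail looks most promising.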