no code implementations • 29 Jul 2022 • Taira Tsuchiya, Shinji Ito, Junya Honda

To be more specific, we show that for non-degenerate locally observable games, the regret in the stochastic regime is bounded by $O(k^3 m^2 \log(T) \log(k_{\Pi} T) / \Delta_{\min})$ and in the adversarial regime by $O(k^{2/3} m \sqrt{T \log(T) \log k_{\Pi}})$, where $T$ is the number of rounds, $m$ is the maximum number of distinct observations per action, $\Delta_{\min}$ is the minimum optimality gap, and $k_{\Pi}$ is the number of Pareto optimal actions.

no code implementations • 14 Jun 2022 • Shinji Ito, Taira Tsuchiya, Junya Honda

In fact, they have provided a stochastic MAB algorithm with gap-variance-dependent regret bounds of $O(\sum_{i: \Delta_i>0} (\frac{\sigma_i^2}{\Delta_i} + 1) \log T )$ for loss variance $\sigma_i^2$ of arm $i$.
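
To make the shape of this bound concrete, the following minimal sketch evaluates the quantity inside the $O(\cdot)$ for a given problem instance. It is only an illustration: the function name `gap_variance_bound` and its arguments are hypothetical, constants hidden by the $O(\cdot)$ are ignored, and the best arm is assumed to be the one with the smallest mean loss.

```python
import math

def gap_variance_bound(mean_losses, variances, horizon):
    """Evaluate sum over suboptimal arms of (sigma_i^2 / Delta_i + 1) * log T.

    Hypothetical helper for illustration; constants hidden in O(.) are omitted.
    """
    best = min(mean_losses)  # assumption: losses, so the smallest mean is optimal
    total = 0.0
    for mu, var in zip(mean_losses, variances):
        gap = mu - best  # Delta_i, the suboptimality gap of arm i
        if gap > 0:  # only arms with Delta_i > 0 contribute
            total += (var / gap + 1.0) * math.log(horizon)
    return total

if __name__ == "__main__":
    # Three arms; the first is optimal and contributes nothing.
    print(gap_variance_bound([0.1, 0.3, 0.5], [0.01, 0.04, 0.04], 1000))
```

Arms with small variance $\sigma_i^2$ contribute close to $\log T$ each, regardless of how small their gap is, which is where the bound improves on a purely gap-dependent $\log T / \Delta_i$ term.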

no code implementations • 9 Jun 2022 • Junpei Komiyama, Taira Tsuchiya, Junya Honda

We consider the fixed-budget best arm identification problem where the goal is to find the arm of the largest mean with a fixed number of samples.
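
As a rough illustration of the problem setting (not the method studied in the paper), the sketch below implements a naive uniform-allocation baseline: the budget is split evenly across arms and the arm with the largest empirical mean is returned. The function name `uniform_best_arm`, the Bernoulli reward model, and the seeding are all assumptions made for this example.

```python
import random

def uniform_best_arm(true_means, budget, seed=0):
    """Naive fixed-budget baseline: sample every arm equally often,
    then recommend the arm with the largest empirical mean.

    Assumes Bernoulli rewards with the given (hidden) means.
    """
    k = len(true_means)
    if budget < k:
        raise ValueError("budget must allow at least one pull per arm")
    rng = random.Random(seed)
    pulls = budget // k  # equal share of the budget per arm
    empirical = []
    for mu in true_means:
        rewards = [1.0 if rng.random() < mu else 0.0 for _ in range(pulls)]
        empirical.append(sum(rewards) / pulls)
    # Recommend the arm whose empirical mean is largest.
    return max(range(k), key=lambda i: empirical[i])

if __name__ == "__main__":
    print(uniform_best_arm([0.2, 0.5, 0.8], budget=300))
```

The quantity of interest in this setting is the probability that the recommended arm is not the true best arm once the budget is exhausted.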

no code implementations • 2 Jun 2022 • Shinji Ito, Taira Tsuchiya, Junya Honda

As Alon et al. [2015] have shown, tight regret bounds depend on the structure of the feedback graph: strongly observable graphs yield minimax regret of $\tilde{\Theta}( \alpha^{1/2} T^{1/2} )$, while weakly observable graphs induce minimax regret of $\tilde{\Theta}( \delta^{1/3} T^{2/3} )$, where $\alpha$ and $\delta$, respectively, represent the independence number of the graph and the domination number of a certain portion of the graph.
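
To see how differently the two rates scale with $T$, the following sketch evaluates the leading terms $\alpha^{1/2} T^{1/2}$ and $\delta^{1/3} T^{2/3}$. The function names are hypothetical, and the logarithmic factors and constants hidden by $\tilde{\Theta}$ are ignored, so the numbers indicate growth rates only.

```python
def strongly_observable_rate(alpha, horizon):
    """Leading term alpha^(1/2) * T^(1/2); log factors and constants omitted."""
    return (alpha * horizon) ** 0.5

def weakly_observable_rate(delta, horizon):
    """Leading term delta^(1/3) * T^(2/3); log factors and constants omitted."""
    return delta ** (1.0 / 3.0) * horizon ** (2.0 / 3.0)

if __name__ == "__main__":
    for horizon in (10**3, 10**4, 10**6):
        print(horizon,
              strongly_observable_rate(4, horizon),
              weakly_observable_rate(8, horizon))
```

For fixed graph parameters, the weakly observable rate grows as $T^{2/3}$ and therefore eventually dominates the $T^{1/2}$ rate of the strongly observable case.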

no code implementations • NeurIPS 2020 • Taira Tsuchiya, Junya Honda, Masashi Sugiyama

We investigate finite stochastic partial monitoring, which is a general model for sequential learning with limited feedback.

no code implementations • 31 Jan 2019 • Taira Tsuchiya, Nontawat Charoenphakdee, Issei Sato, Masashi Sugiyama

We further provide an estimation error bound to show that our risk estimator is consistent.

Papers With Code is a free resource with all data licensed under CC-BY-SA.