You need to log in to edit.

You can create a new account if you don't have one.

Or, discuss a change on Slack.

You can create a new account if you don't have one.

Or, discuss a change on Slack.

no code implementations • 29 Jul 2022 • Taira Tsuchiya, Shinji Ito, Junya Honda

To be more specific, we show that for non-degenerate locally observable games, the regret in the stochastic regime is bounded by $O(k^3 m^2 \log(T) \log(k_{\Pi} T) / \Delta_{\mathrm{\min}})$ and in the adversarial regime by $O(k^{2/3} m \sqrt{T \log(T) \log k_{\Pi}})$, where $T$ is the number of rounds, $m$ is the maximum number of distinct observations per action, $\Delta_{\min}$ is the minimum optimality gap, and $k_{\Pi}$ is the number of Pareto optimal actions.

no code implementations • 14 Jun 2022 • Shinji Ito, Taira Tsuchiya, Junya Honda

In fact, they have provided a stochastic MAB algorithm with gap-variance-dependent regret bounds of $O(\sum_{i: \Delta_i>0} (\frac{\sigma_i^2}{\Delta_i} + 1) \log T )$ for loss variance $\sigma_i^2$ of arm $i$.

no code implementations • 9 Jun 2022 • Junpei Komiyama, Taira Tsuchiya, Junya Honda

We consider the fixed-budget best arm identification problem where the goal is to find the arm of the largest mean with a fixed number of samples.

no code implementations • 7 Jun 2022 • Charles Riou, Junya Honda, Masashi Sugiyama

We study the survival bandit problem, a variant of the multi-armed bandit problem introduced in an open problem by Perotto et al. (2019), with a constraint on the cumulative reward; at each time step, the agent receives a (possibly negative) reward and if the cumulative reward becomes lower than a prespecified threshold, the procedure stops, and this phenomenon is called ruin.

no code implementations • 2 Jun 2022 • Shinji Ito, Taira Tsuchiya, Junya Honda

As Alon et al. [2015] have shown, tight regret bounds depend on the structure of the feedback graph: \textit{strongly observable} graphs yield minimax regret of $\tilde{\Theta}( \alpha^{1/2} T^{1/2} )$, while \textit{weakly observable} graphs induce minimax regret of $\tilde{\Theta}( \delta^{1/3} T^{2/3} )$, where $\alpha$ and $\delta$, respectively, represent the independence number of the graph and the domination number of a certain portion of the graph.

1 code implementation • 23 Jul 2021 • Junpei Komiyama, Edouard Fouché, Junya Honda

We demonstrate that ADR-bandit has nearly optimal performance when the abrupt or global changes occur in a coordinated manner that we call global changes.

1 code implementation • 16 Jul 2021 • Ikko Yamane, Junya Honda, Florian Yger, Masashi Sugiyama

In this paper, we consider the task of predicting $Y$ from $X$ when we have no paired data of them, but we have two separate, independent datasets of $X$ and $Y$ each observed with some mediating variable $U$, that is, we have two datasets $S_X = \{(X_i, U_i)\}$ and $S_Y = \{(U'_j, Y'_j)\}$.

no code implementations • 31 Dec 2020 • Yuko Kuroki, Junya Honda, Masashi Sugiyama

Combinatorial optimization is one of the fundamental research fields that has been extensively studied in theoretical computer science and operations research.

no code implementations • ICML 2020 • Yuko Kuroki, Atsushi Miyauchi, Junya Honda, Masashi Sugiyama

Dense subgraph discovery aims to find a dense component in edge-weighted graphs.

no code implementations • NeurIPS 2020 • Taira Tsuchiya, Junya Honda, Masashi Sugiyama

We investigate finite stochastic partial monitoring, which is a general model for sequential learning with limited feedback.

no code implementations • 10 Mar 2020 • Hideaki Imamura, Nontawat Charoenphakdee, Futoshi Futami, Issei Sato, Junya Honda, Masashi Sugiyama

If the black-box function varies with time, then time-varying Bayesian optimization is a promising framework.

no code implementations • 13 Feb 2020 • Masahiro Kato, Takuya Ishihara, Junya Honda, Yusuke Narita

In adaptive experimental design, the experimenter is allowed to change the probability of assigning a treatment using past observations for estimating the ATE efficiently.

1 code implementation • NeurIPS 2019 • Liyuan Xu, Junya Honda, Gang Niu, Masashi Sugiyama

We propose two practical methods for uncoupled regression from pairwise comparison data and show that the learned regression model converges to the optimal model with the optimal parametric convergence rate when the target variable distributes uniformly.

1 code implementation • ICLR 2019 • Masahiro Kato, Takeshi Teshima, Junya Honda

However, this assumption is unrealistic in many instances of PU learning because it fails to capture the existence of a selection bias in the labeling process.

no code implementations • 19 Mar 2019 • Junya Honda

A classic setting of the stochastic K-armed bandit problem is considered in this note.

no code implementations • 27 Feb 2019 • Yuko Kuroki, Liyuan Xu, Atsushi Miyauchi, Junya Honda, Masashi Sugiyama

Based on our approximation algorithm, we propose novel bandit algorithms for the top-k selection problem, and prove that our algorithms run in polynomial time.

no code implementations • 31 Jan 2019 • Koji Tabata, Atsuyoshi Nakamura, Junya Honda, Tamiki Komatsuzaki

We study a bad arm existing checking problem in which a player's task is to judge whether a positive arm exists or not among given K arms by drawing as small number of arms as possible.

1 code implementation • NeurIPS 2019 • Chenri Ni, Nontawat Charoenphakdee, Junya Honda, Masashi Sugiyama

First, we consider an approach based on simultaneous training of a classifier and a rejector, which achieves the state-of-the-art performance in the binary case.

no code implementations • 14 Sep 2018 • Liyuan Xu, Junya Honda, Masashi Sugiyama

We formulate and study a novel multi-armed bandit problem called the qualitative dueling bandit (QDB) problem, where an agent observes not numeric but qualitative feedback by pulling each arm.

no code implementations • 11 Sep 2018 • Seiichi Kuroki, Nontawat Charoenphakdee, Han Bao, Junya Honda, Issei Sato, Masashi Sugiyama

A previously proposed discrepancy that does not use the source domain labels requires high computational cost to estimate and may lead to a loose generalization error bound in the target domain.

1 code implementation • ICML 2018 • Junpei Komiyama, Akiko Takeda, Junya Honda, Hajime Shimao

However, a fairness level as a constraint induces a nonconvexity of the feasible region, which disables the use of an off-the-shelf convex optimizer.

no code implementations • NeurIPS 2017 • Junpei Komiyama, Junya Honda, Akiko Takeda

Motivated by online advertising, we study a multiple-play multi-armed bandit problem with position bias that involves several slots and the latter slots yield fewer rewards.

no code implementations • 17 Oct 2017 • Hideaki Kano, Junya Honda, Kentaro Sakamaki, Kentaro Matsuura, Atsuyoshi Nakamura, Masashi Sugiyama

We consider a novel stochastic multi-armed bandit problem called {\em good arm identification} (GAI), where a good arm is defined as an arm with expected reward greater than or equal to a given threshold.

no code implementations • 16 Oct 2017 • Liyuan Xu, Junya Honda, Masashi Sugiyama

We propose the first fully-adaptive algorithm for pure exploration in linear bandits---the task to find the arm with the largest expected reward, which depends on an unknown parameter linearly.

no code implementations • 5 May 2016 • Junpei Komiyama, Junya Honda, Hiroshi Nakagawa

We study the K-armed dueling bandit problem, a variation of the standard stochastic bandit problem where the feedback is limited to relative comparisons of a pair of arms.

no code implementations • NeurIPS 2015 • Junpei Komiyama, Junya Honda, Hiroshi Nakagawa

To show the optimality of PM-DMED with respect to the regret bound, we slightly modify the algorithm by introducing a hinge function (PM-DMED-Hinge).

1 code implementation • 8 Jun 2015 • Junpei Komiyama, Junya Honda, Hisashi Kashima, Hiroshi Nakagawa

We study the $K$-armed dueling bandit problem, a variation of the standard stochastic bandit problem where the feedback is limited to relative comparisons of a pair of arms.

1 code implementation • 2 Jun 2015 • Junpei Komiyama, Junya Honda, Hiroshi Nakagawa

Recently, Thompson sampling (TS), a randomized algorithm with a Bayesian spirit, has attracted much attention for its empirically excellent performance, and it is revealed to have an optimal regret bound in the standard single-play MAB problem.

no code implementations • 22 Apr 2015 • Wesley Cowan, Junya Honda, Michael N. Katehakis

Consider the problem of sampling sequentially from a finite number of $N \geq 2$ populations, specified by random variables $X^i_k$, $ i = 1,\ldots , N,$ and $k = 1, 2, \ldots$; where $X^i_k$ denotes the outcome from population $i$ the $k^{th}$ time it is sampled.

Cannot find the paper you are looking for? You can
Submit a new open access paper.

Contact us on:
hello@paperswithcode.com
.
Papers With Code is a free resource with all data licensed under CC-BY-SA.