Search Results for author: Shinji Ito

Found 31 papers, 0 papers with code

Online $\mathrm{L}^{\natural}$-Convex Minimization

no code implementations • 26 Apr 2024 • Ken Yokoyama, Shinji Ito, Tatsuya Matsuoka, Kei Kimura, Makoto Yokoo

An existing general framework for dealing with such objective functions is the online submodular minimization.

Paper
Add Code

Follow-the-Perturbed-Leader with Fréchet-type Tail Distributions: Optimality in Adversarial Bandits and Best-of-Both-Worlds

no code implementations • 8 Mar 2024 • Jongyeong Lee, Junya Honda, Shinji Ito, Min-hwan Oh

In this paper, we establish a sufficient condition for perturbations to achieve $\mathcal{O}(\sqrt{KT})$ regrets in the adversarial setting, which covers, e. g., Fr\'{e}chet, Pareto, and Student-$t$ distributions.

Paper
Add Code

LC-Tsallis-INF: Generalized Best-of-Both-Worlds Linear Contextual Bandits

no code implementations • 5 Mar 2024 • Masahiro Kato, Shinji Ito

For this issue, this study proposes an algorithm whose regret satisfies $O(\log(T))$ in the setting when the suboptimality gap is lower-bounded.

Multi-Armed Bandits

Paper
Add Code

Adaptive Learning Rate for Follow-the-Regularized-Leader: Competitive Analysis and Best-of-Both-Worlds

no code implementations • 1 Mar 2024 • Shinji Ito, Taira Tsuchiya, Junya Honda

Follow-The-Regularized-Leader (FTRL) is known as an effective and versatile approach in online learning, where appropriate choice of the learning rate is crucial for smaller regret.

Decision Making Multi-Armed Bandits

Paper
Add Code

Fast Rates in Online Convex Optimization by Exploiting the Curvature of Feasible Sets

no code implementations • 20 Feb 2024 • Taira Tsuchiya, Shinji Ito

We first prove that if an optimal decision is on the boundary of a feasible set and the gradient of an underlying loss function is non-zero, then the algorithm achieves a regret upper bound of $O(\rho \log T)$ in stochastic environments.

Paper
Add Code

Exploration by Optimization with Hybrid Regularizers: Logarithmic Regret with Adversarial Robustness in Partial Monitoring

no code implementations • 13 Feb 2024 • Taira Tsuchiya, Shinji Ito, Junya Honda

This development allows us to significantly improve the existing regret bounds of best-of-both-worlds (BOBW) algorithms, which achieves nearly optimal bounds both in stochastic and adversarial environments.

Adversarial Robustness Decision Making

Paper
Add Code

Replicability is Asymptotically Free in Multi-armed Bandits

no code implementations • 12 Feb 2024 • Junpei Komiyama, Shinji Ito, Yuichi Yoshida, Souta Koshino

For the analysis of these algorithms, we propose a principled approach to limiting the probability of nonreplication.

Decision Making Multi-Armed Bandits

Paper
Add Code

Best-of-Both-Worlds Linear Contextual Bandits

no code implementations • 27 Dec 2023 • Masahiro Kato, Shinji Ito

The goal of this study is to develop a strategy that is effective in both stochastic and adversarial environments, with theoretical guarantees.

Multi-Armed Bandits

Paper
Add Code

New Classes of the Greedy-Applicable Arm Feature Distributions in the Sparse Linear Bandit Problem

no code implementations • 19 Dec 2023 • Koji Ichikawa, Shinji Ito, Daisuke Hatano, Hanna Sumita, Takuro Fukunaga, Naonori Kakimura, Ken-ichi Kawarabayashi

Firstly, we show that a mixture distribution that has a greedy-applicable component is also greedy-applicable.

Paper
Add Code

Stability-penalty-adaptive follow-the-regularized-leader: Sparsity, game-dependency, and best-of-both-worlds

no code implementations • NeurIPS 2023 • Taira Tsuchiya, Shinji Ito, Junya Honda

With this result, we establish several algorithms with three types of adaptivity: sparsity, game-dependency, and best-of-both-worlds (BOBW).

Decision Making Single Particle Analysis

Paper
Add Code

Best-of-Three-Worlds Linear Bandit Algorithm with Variance-Adaptive Regret Bounds

no code implementations • 24 Feb 2023 • Shinji Ito, Kei Takemura

At the higher level, the proposed algorithm adapts to a variety of types of environments.

Paper
Add Code

Best-of-Both-Worlds Algorithms for Partial Monitoring

no code implementations • 29 Jul 2022 • Taira Tsuchiya, Shinji Ito, Junya Honda

This study considers the partial monitoring problem with $k$-actions and $d$-outcomes and provides the first best-of-both-worlds algorithms, whose regrets are favorably bounded both in the stochastic and adversarial regimes.

Paper
Add Code

Adversarially Robust Multi-Armed Bandit Algorithm with Variance-Dependent Regret Bounds

no code implementations • 14 Jun 2022 • Shinji Ito, Taira Tsuchiya, Junya Honda

In fact, they have provided a stochastic MAB algorithm with gap-variance-dependent regret bounds of $O(\sum_{i: \Delta_i>0} (\frac{\sigma_i^2}{\Delta_i} + 1) \log T )$ for loss variance $\sigma_i^2$ of arm $i$.

Paper
Add Code

Nearly Optimal Best-of-Both-Worlds Algorithms for Online Learning with Feedback Graphs

no code implementations • 2 Jun 2022 • Shinji Ito, Taira Tsuchiya, Junya Honda

As Alon et al. [2015] have shown, tight regret bounds depend on the structure of the feedback graph: strongly observable graphs yield minimax regret of $\tilde{\Theta}( \alpha^{1/2} T^{1/2} )$, while weakly observable graphs induce minimax regret of $\tilde{\Theta}( \delta^{1/3} T^{2/3} )$, where $\alpha$ and $\delta$, respectively, represent the independence number of the graph and the domination number of a certain portion of the graph.

Open-Ended Question Answering

Paper
Add Code

Online Task Assignment Problems with Reusable Resources

no code implementations • 15 Mar 2022 • Hanna Sumita, Shinji Ito, Kei Takemura, Daisuke Hatano, Takuro Fukunaga, Naonori Kakimura, Ken-ichi Kawarabayashi

The key features of our problem are (1) an agent is reusable, i. e., an agent comes back to the market after completing the assigned task, (2) an agent may reject the assigned task to stay the market, and (3) a task may accommodate multiple agents.

Task 2

Paper
Add Code

Hybrid Regret Bounds for Combinatorial Semi-Bandits and Adversarial Linear Bandits

no code implementations • NeurIPS 2021 • Shinji Ito

This study aims to develop bandit algorithms that automatically exploit tendencies of certain environments to improve performance, without any prior knowledge regarding the environments.

Paper
Add Code

On Optimal Robustness to Adversarial Corruption in Online Decision Problems

no code implementations • NeurIPS 2021 • Shinji Ito

The main contribution of this paper is to show that optimal robustness can be expressed by a square-root dependency on the amount of corruption.

Decision Making

Paper
Add Code

Near-Optimal Regret Bounds for Contextual Combinatorial Semi-Bandits with Linear Payoff Functions

no code implementations • 20 Jan 2021 • Kei Takemura, Shinji Ito, Daisuke Hatano, Hanna Sumita, Takuro Fukunaga, Naonori Kakimura, Ken-ichi Kawarabayashi

However, there is a gap of $\tilde{O}(\max(\sqrt{d}, \sqrt{k}))$ between the current best upper and lower bounds, where $d$ is the dimension of the feature vectors, $k$ is the number of the chosen arms in a round, and $\tilde{O}(\cdot)$ ignores the logarithmic factors.

Decision Making Recommendation Systems

Paper
Add Code

Delay and Cooperation in Nonstochastic Linear Bandits

no code implementations • NeurIPS 2020 • Shinji Ito, Daisuke Hatano, Hanna Sumita, Kei Takemura, Takuro Fukunaga, Naonori Kakimura, Ken-ichi Kawarabayashi

This paper offers a nearly optimal algorithm for online linear optimization with delayed bandit feedback.

Decision Making

Paper
Add Code

A Tight Lower Bound and Efficient Reduction for Swap Regret

no code implementations • NeurIPS 2020 • Shinji Ito

Swap regret, a generic performance measure of online decision-making algorithms, plays an important role in the theory of repeated games, along with a close connection to correlated equilibria in strategic games.

Decision Making

Paper
Add Code

Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits

no code implementations • NeurIPS 2020 • Shinji Ito, Shuichi Hirahara, Tasuku Soma, Yuichi Yoshida

We propose novel algorithms with first- and second-order regret bounds for adversarial linear bandits.

Paper
Add Code

Submodular Function Minimization with Noisy Evaluation Oracle

no code implementations • NeurIPS 2019 • Shinji Ito

This paper considers submodular function minimization with \textit{noisy evaluation oracles} that return the function value of a submodular objective with zero-mean additive noise.

Paper
Add Code

Oracle-Efficient Algorithms for Online Linear Optimization with Bandit Feedback

no code implementations • NeurIPS 2019 • Shinji Ito, Daisuke Hatano, Hanna Sumita, Kei Takemura, Takuro Fukunaga, Naonori Kakimura, Ken-ichi Kawarabayashi

Our algorithm for non-stochastic settings has an oracle complexity of $\tilde{O}( T )$ and is the first algorithm that achieves both a regret bound of $\tilde{O}( \sqrt{T} )$ and an oracle complexity of $\tilde{O} ( \mathrm{poly} ( T ) )$, given only linear optimization oracles.

Computational Efficiency

Paper
Add Code

Improved Regret Bounds for Bandit Combinatorial Optimization

no code implementations • NeurIPS 2019 • Shinji Ito, Daisuke Hatano, Hanna Sumita, Kei Takemura, Takuro Fukunaga, Naonori Kakimura, Ken-ichi Kawarabayashi

\textit{Bandit combinatorial optimization} is a bandit framework in which a player chooses an action within a given finite set $\mathcal{A} \subseteq \{ 0, 1 \}^d$ and incurs a loss that is the inner product of the chosen action and an unobservable loss vector in $\mathbb{R} ^ d$ in each round.

Combinatorial Optimization

Paper
Add Code

An Arm-Wise Randomization Approach to Combinatorial Linear Semi-Bandits

no code implementations • 5 Sep 2019 • Kei Takemura, Shinji Ito

Our empirical evaluation with artificial and real-world datasets demonstrates that the proposed algorithms with the arm-wise randomization technique outperform the existing algorithms without this technique, especially for the clustered case.

Decision Making Recommendation Systems +1

Paper
Add Code

Regret Bounds for Online Portfolio Selection with a Cardinality Constraint

no code implementations • NeurIPS 2018 • Shinji Ito, Daisuke Hatano, Sumita Hanna, Akihiro Yabe, Takuro Fukunaga, Naonori Kakimura, Ken-ichi Kawarabayashi

Online portfolio selection is a sequential decision-making problem in which a learner repetitively selects a portfolio over a set of assets, aiming to maximize long-term return.

Computational Efficiency Decision Making

Paper
Add Code

Unbiased Objective Estimation in Predictive Optimization

no code implementations • ICML 2018 • Shinji Ito, Akihiro Yabe, Ryohei Fujimaki

Predictive optimization, however, suffers from the problem of a calculated optimal solution’s being evaluated too optimistically, i. e., the value of the objective function is overestimated.

Decision Making

Paper
Add Code

Causal Bandits with Propagating Inference

no code implementations • ICML 2018 • Akihiro Yabe, Daisuke Hatano, Hanna Sumita, Shinji Ito, Naonori Kakimura, Takuro Fukunaga, Ken-ichi Kawarabayashi

In this setting, the arms are identified with interventions on a given causal graph, and the effect of an intervention propagates throughout all over the causal graph.

Paper
Add Code

Efficient Sublinear-Regret Algorithms for Online Sparse Linear Regression with Limited Observation

no code implementations • NeurIPS 2017 • Shinji Ito, Daisuke Hatano, Hanna Sumita, Akihiro Yabe, Takuro Fukunaga, Naonori Kakimura, Ken-ichi Kawarabayashi

Under these assumptions, we present polynomial-time sublinear-regret algorithms for the online sparse linear regression.

regression

Paper
Add Code

Large-Scale Price Optimization via Network Flow

no code implementations • NeurIPS 2016 • Shinji Ito, Ryohei Fujimaki

On the basis of this connection, we propose an efficient algorithm that employs network flow algorithms.

Paper
Add Code

Optimization Beyond Prediction: Prescriptive Price Optimization

no code implementations • 18 May 2016 • Shinji Ito, Ryohei Fujimaki

This paper addresses a novel data science problem, prescriptive price optimization, which derives the optimal price strategy to maximize future profit/revenue on the basis of massive predictive formulas produced by machine learning.

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.