no code implementations • 10 Mar 2022 • Danielle Cabel, Shonosuke Sugasawa, Masahiro Kato, Kosaku Takanashi, Kenichiro McAlinn
Spatial data are characterized by their spatial dependence, which is often complex, non-linear, and difficult to capture with a single model.
no code implementations • 10 Feb 2022 • Masahiro Kato, Masaaki Imaizumi
We study the benign overfitting theory in the prediction of the conditional average treatment effect (CATE), with linear regression models.
no code implementations • 31 Jan 2022 • Masahiro Kato, Masaaki Imaizumi, Kentaro Minami
This paper provides a unified view of the Kullback-Leibler (KL) divergence and the integral probability metrics (IPMs) from the perspective of maximum likelihood density-ratio estimation (DRE).
no code implementations • 12 Jan 2022 • Masahiro Kato, Kaito Ariu, Masaaki Imaizumi, Masahiro Nomura, Chao Qin
A longstanding open question is to establish a tight lower bound on the probability of misidentifying the best arm, together with a strategy whose upper bound matches that lower bound when the optimal target allocation ratio of arm draws is unknown.
no code implementations • NeurIPS 2021 • Masahiro Kato, Kenichiro McAlinn, Shota Yasui
This paper proposes a DR estimator for dependent samples obtained from adaptive experiments.
1 code implementation • 18 Nov 2021 • Junpei Komiyama, Kaito Ariu, Masahiro Kato, Chao Qin
We consider Bayesian best arm identification in the multi-armed bandit problem.
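As background for this entry, Bayesian best arm identification can be sketched with Bernoulli arms and Beta posteriors. The sketch below uses Thompson sampling for exploration and recommends the arm with the highest posterior mean; it is a generic illustration, not the strategy analysed in the paper, and all names and parameters are made up for the example.

```python
import random

def bayes_best_arm(pull, n_arms, budget, seed=0):
    """Toy Bayesian best-arm identification for Bernoulli arms.

    Keeps a Beta(1, 1) posterior per arm, explores by Thompson
    sampling, and recommends the arm with the highest posterior mean.
    """
    rng = random.Random(seed)
    alpha = [1] * n_arms  # posterior successes + 1
    beta = [1] * n_arms   # posterior failures + 1
    for _ in range(budget):
        # Draw one sample from each posterior; pull the argmax arm.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        arm = max(range(n_arms), key=samples.__getitem__)
        r = pull(arm)  # observed 0/1 reward
        alpha[arm] += r
        beta[arm] += 1 - r
    means = [alpha[a] / (alpha[a] + beta[a]) for a in range(n_arms)]
    return max(range(n_arms), key=means.__getitem__)

# Usage: arm 2 has the highest success probability (0.8).
probs = [0.3, 0.5, 0.8]
env = random.Random(1)
best = bayes_best_arm(lambda a: 1 if env.random() < probs[a] else 0,
                      n_arms=3, budget=500)
```

With a budget of 500 pulls and a 0.3 gap between the top two arms, the recommendation concentrates on the truly best arm.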
no code implementations • ICLR 2022 • Masahiro Kato, Masaaki Imaizumi, Kenichiro McAlinn, Shota Yasui, Haruo Kakehi
We consider learning causal relationships under conditional moment restrictions.
no code implementations • 16 Sep 2021 • Kaito Ariu, Masahiro Kato, Junpei Komiyama, Kenichiro McAlinn, Chao Qin
We consider the "policy choice" problem -- otherwise known as best arm identification in the bandit literature -- proposed by Kasy and Sautmann (2021) for adaptive experimental design.
no code implementations • 3 Aug 2021 • Masahiro Kato, Haruo Kakehi, Kenichiro McAlinn, Shota Yasui
We consider learning causal relationships under conditional moment conditions.
1 code implementation • 26 Jun 2021 • Masahiro Kato, Kaito Ariu
We demonstrate that contextual information can be used to identify the arm with the best marginalized mean reward more efficiently than in Garivier & Kaufmann (2016).
no code implementations • 11 May 2021 • Riku Togashi, Masahiro Kato, Mayu Otani, Tetsuya Sakai, Shin'ichi Satoh
However, such methods have two main drawbacks, particularly in large-scale applications: (1) the pairwise approach is severely inefficient due to its quadratic computational cost; and (2) even recent model-based samplers (e.g., IRGAN) cannot achieve practical efficiency because they require training an extra model.
no code implementations • 17 Feb 2021 • Masahiro Kato
To mitigate this limitation, we propose another assumption that the average logging policy converges to a time-invariant function and show the doubly robust (DR) estimator's asymptotic normality.
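The doubly robust (DR) estimator mentioned here combines an outcome model with importance weighting. Below is the generic DR form for logged bandit data, not the paper's adaptive-logging version (whose contribution is asymptotic normality when the logging policy changes over time); all variable names are illustrative.

```python
def dr_value(rewards, actions, logging_probs, target_policy, q_hat):
    """Doubly robust (DR) off-policy value estimate.

    rewards[i]        : reward observed for logged action actions[i]
    logging_probs[i]  : probability the logging policy chose actions[i]
    target_policy[i]  : action distribution of the evaluation policy
    q_hat[i]          : estimated mean reward per action (outcome model)

    The estimator stays consistent if either the outcome model or the
    logging probabilities are correct, hence "doubly robust".
    """
    total = 0.0
    for r, a, p_log, pi, q in zip(rewards, actions, logging_probs,
                                  target_policy, q_hat):
        baseline = sum(pi_a * q_a for pi_a, q_a in zip(pi, q))  # direct method
        correction = pi[a] / p_log * (r - q[a])                 # IPW residual
        total += baseline + correction
    return total / len(rewards)

# Usage: the outcome model is exact, so DR recovers the true value 0.7
# of a policy that always plays action 1.
est = dr_value(rewards=[0.2, 0.7, 0.2, 0.7],
               actions=[0, 1, 0, 1],
               logging_probs=[0.5] * 4,
               target_policy=[[0.0, 1.0]] * 4,
               q_hat=[[0.2, 0.7]] * 4)
```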
no code implementations • 19 Jan 2021 • Riku Togashi, Masahiro Kato, Mayu Otani, Shin'ichi Satoh
Learning from implicit user feedback is challenging as we can only observe positive samples but never access negative ones.
no code implementations • 24 Oct 2020 • Masahiro Kato, Zhenghang Cui, Yoshihiro Fukuhara
In this paper, to obtain a classifier that is more reliable against adversarial attacks, we propose Adversarial Training with a Rejection Option (ATRO).
no code implementations • 23 Oct 2020 • Masahiro Kato, Yusuke Kaneko
The goal of off-policy evaluation (OPE) is to evaluate a new policy using historical data obtained via a behavior policy.
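The textbook baseline for this OPE setting is inverse probability weighting (IPW): reweight each logged reward by how much more (or less) likely the evaluation policy was to take the logged action than the behavior policy. This sketch is only the standard baseline the OPE literature builds on, not the estimator proposed in the paper.

```python
def ipw_value(rewards, behavior_probs, target_probs):
    """Inverse probability weighting (IPW) off-policy value estimate.

    rewards[i]        : reward observed for the logged action
    behavior_probs[i] : probability the behavior policy chose it
    target_probs[i]   : probability the evaluation policy would choose it
    """
    n = len(rewards)
    return sum(pi / b * r
               for r, b, pi in zip(rewards, behavior_probs, target_probs)) / n

# Usage: three logged rounds, uniform behavior policy over two actions.
est = ipw_value(rewards=[1.0, 0.0, 1.0],
                behavior_probs=[0.5, 0.5, 0.5],
                target_probs=[1.0, 0.5, 0.0])
```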
no code implementations • 23 Oct 2020 • Masahiro Kato, Kenshi Abe, Kaito Ariu, Shota Yasui
Based on the properties of the evaluation policy, we categorize OPE situations.
no code implementations • 8 Oct 2020 • Masahiro Kato, Shota Yasui, Kenichiro McAlinn
This paper proposes a DR estimator for dependent samples obtained from adaptive experiments.
no code implementations • 3 Oct 2020 • Masahiro Kato, Kei Nakagawa, Kenshi Abe, Tetsuro Morimura
To achieve this purpose, we train an agent to maximize the expected quadratic utility function, a common objective of risk management in finance and economics.
no code implementations • 28 Sep 2020 • Masahiro Kato, Kei Nakagawa
In this paper, we suggest expected quadratic utility maximization (EQUM) as a new framework for policy gradient style reinforcement learning (RL) algorithms with mean-variance control.
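The quadratic utility objective behind EQUM can be written out directly: with U(w) = w - (k/2)w², we have E[U(W)] = E[W] - (k/2)(Var(W) + E[W]²), so maximizing it penalizes variance. A minimal sample-based sketch (illustrative only; the paper embeds this objective in policy-gradient RL):

```python
import statistics

def expected_quadratic_utility(returns, risk_aversion):
    """Sample estimate of E[U(W)] with U(w) = w - (k/2) * w**2.

    Since E[U(W)] = E[W] - (k/2) * (Var(W) + E[W]**2), maximizing it
    trades expected return against variance, giving the mean-variance
    flavour of control mentioned in the abstract.
    """
    mean = statistics.fmean(returns)
    second_moment = statistics.fmean(r * r for r in returns)
    return mean - 0.5 * risk_aversion * second_moment

# Usage: symmetric +/-1 returns have mean 0 and second moment 1,
# so the objective equals -k/2.
val = expected_quadratic_utility([1.0, -1.0], risk_aversion=1.0)
```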
no code implementations • 28 Sep 2020 • Masahiro Kato, Shota Yasui
We consider training a binary classifier under delayed feedback (DF learning).
1 code implementation • 12 Jun 2020 • Masahiro Kato, Takeshi Teshima
Density ratio estimation (DRE) is at the core of various machine learning tasks such as anomaly detection and domain adaptation.
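A common entry point to DRE is the density-ratio trick: a probabilistic classifier that separates numerator samples from denominator samples yields the ratio through its odds. The sketch below fits a one-dimensional logistic classifier by plain gradient descent; it is a generic baseline for illustration, not the estimator developed in the paper.

```python
import math, random

def fit_density_ratio(x_num, x_den, lr=0.5, epochs=500):
    """Classifier-based density-ratio estimation in one dimension.

    Fit a logistic classifier to separate numerator samples (label 1)
    from denominator samples (label 0); its odds, corrected by the
    sample-size ratio, estimate r(x) = p_num(x) / p_den(x).
    """
    data = [(x, 1.0) for x in x_num] + [(x, 0.0) for x in x_den]
    w = b = 0.0
    for _ in range(epochs):  # full-batch gradient descent on logistic loss
        gw = gb = 0.0
        for x, y in data:
            c = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (c - y) * x
            gb += c - y
        w -= lr * gw / len(data)
        b -= lr * gb / len(data)
    scale = len(x_den) / len(x_num)  # class-prior correction
    return lambda x: scale * math.exp(w * x + b)

# Usage: numerator N(1, 1) vs denominator N(0, 1);
# the true ratio is exp(x - 0.5).
rng = random.Random(0)
ratio = fit_density_ratio([rng.gauss(1.0, 1.0) for _ in range(2000)],
                          [rng.gauss(0.0, 1.0) for _ in range(2000)])
```

For Gaussians with equal variance the log-ratio is linear in x, so the logistic model is well specified here; with 2000 samples per class the estimate at x = 0.5 lands close to the true value of 1.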
no code implementations • 12 Jun 2020 • Masahiro Kato
The goal of OPE is to evaluate a new policy using historical data obtained from behavior policies generated by the bandit algorithm.
1 code implementation • NeurIPS 2020 • Masahiro Kato, Masatoshi Uehara, Shota Yasui
Then, we propose doubly robust and efficient estimators for OPE and OPL under a covariate shift by using a nonparametric estimator of the density ratio between the historical and evaluation data distributions.
no code implementations • 13 Feb 2020 • Masahiro Kato, Takuya Ishihara, Junya Honda, Yusuke Narita
In adaptive experimental design, the experimenter is allowed to change the probability of assigning a treatment using past observations in order to estimate the average treatment effect (ATE) efficiently.
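The classical reference point for such designs is Neyman allocation: assigning treatment with probability proportional to the outcome standard deviations minimizes the asymptotic variance of the ATE estimator. The sketch below illustrates only this classical rule with running estimates, not the paper's actual strategy.

```python
import statistics

def neyman_assignment_prob(outcomes_treated, outcomes_control):
    """Neyman allocation from running estimates.

    Assigning treatment with probability sd_1 / (sd_1 + sd_0)
    minimizes the asymptotic variance of the ATE estimator; in an
    adaptive experiment the standard deviations are re-estimated
    from past observations as data accumulate.
    """
    sd1 = statistics.pstdev(outcomes_treated)
    sd0 = statistics.pstdev(outcomes_control)
    if sd1 + sd0 == 0.0:
        return 0.5  # no signal yet: assign uniformly
    return sd1 / (sd1 + sd0)

# Usage: treated outcomes are twice as noisy (sd 2 vs 1),
# so treat with probability 2/3.
p = neyman_assignment_prob([0.0, 4.0], [0.0, 2.0])
```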
no code implementations • 2 Nov 2019 • Masahiro Kato, Hikaru Kawarazaki
By applying the proposed method, we can obtain a model that predicts labels for the unlabeled test data well without losing interpretability.
no code implementations • 25 Sep 2019 • Masahiro Kato, Yoshihiro Fukuhara, Hirokatsu Kataoka, Shigeo Morishima
Our main idea is to apply a framework of learning with rejection and adversarial examples to assist in the decision making for such suspicious samples.
1 code implementation • ICLR 2019 • Masahiro Kato, Takeshi Teshima, Junya Honda
However, this assumption is unrealistic in many instances of PU learning because it fails to capture the existence of a selection bias in the labeling process.
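For context, the standard unbiased PU risk estimator rests on exactly the "selected completely at random" (SCAR) assumption this entry criticizes: labeled positives are drawn uniformly from the positive class. A minimal sketch of that baseline estimator (illustrative names; under selection bias it becomes biased, which motivates the paper's alternative):

```python
def pu_risk(loss, f, x_pos, x_unl, class_prior):
    """Unbiased PU risk estimator under the SCAR assumption.

        R(f) = pi * (R_p(f, +1) - R_p(f, -1)) + R_u(f, -1)

    where R_p / R_u are empirical risks over the labeled-positive and
    unlabeled samples, and pi is the positive class prior. The
    negative-class risk is recovered from unlabeled data by
    subtracting the positives' contribution.
    """
    rp_pos = sum(loss(f(x), +1) for x in x_pos) / len(x_pos)
    rp_neg = sum(loss(f(x), -1) for x in x_pos) / len(x_pos)
    ru_neg = sum(loss(f(x), -1) for x in x_unl) / len(x_unl)
    return class_prior * (rp_pos - rp_neg) + ru_neg

# Usage: identity scorer with squared loss; the unlabeled set is half
# positive, and the scorer is perfect, so the estimated risk is 0.
est = pu_risk(lambda z, y: (z - y) ** 2, lambda x: x,
              x_pos=[1.0, 1.0], x_unl=[1.0, -1.0], class_prior=0.5)
```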
no code implementations • 15 Sep 2018 • Masahiro Kato, Liyuan Xu, Gang Niu, Masashi Sugiyama
In this paper, we propose a novel unified approach to estimating the class-prior and training a classifier alternately.