Search Results for author: Masahiro Kato

Found 42 papers, 5 papers with code

Non-Negative Bregman Divergence Minimization for Deep Direct Density Ratio Estimation

1 code implementation 12 Jun 2020 Masahiro Kato, Takeshi Teshima

Density ratio estimation (DRE) is at the core of various machine learning tasks such as anomaly detection and domain adaptation.

Anomaly Detection, Density Ratio Estimation, +2

Off-Policy Evaluation and Learning for External Validity under a Covariate Shift

1 code implementation NeurIPS 2020 Masahiro Kato, Masatoshi Uehara, Shota Yasui

Then, we propose doubly robust and efficient estimators for OPE and OPL under a covariate shift by using a nonparametric estimator of the density ratio between the historical and evaluation data distributions.

Off-policy evaluation

Rate-optimal Bayesian Simple Regret in Best Arm Identification

1 code implementation 18 Nov 2021 Junpei Komiyama, Kaito Ariu, Masahiro Kato, Chao Qin

We consider best arm identification in the multi-armed bandit problem.

A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees

1 code implementation 31 Jan 2024 Toshinori Kitamura, Tadashi Kozuno, Masahiro Kato, Yuki Ichihara, Soichiro Nishimori, Akiyoshi Sannai, Sho Sonoda, Wataru Kumagai, Yutaka Matsuo

We study a primal-dual reinforcement learning (RL) algorithm for the online constrained Markov decision processes (CMDP) problem, wherein the agent explores an optimal policy that maximizes return while satisfying constraints.

Reinforcement Learning (RL)

Alternate Estimation of a Classifier and the Class-Prior from Positive and Unlabeled Data

no code implementations 15 Sep 2018 Masahiro Kato, Liyuan Xu, Gang Niu, Masashi Sugiyama

In this paper, we propose a novel unified approach to estimating the class-prior and training a classifier alternately.

Learning from Positive and Unlabeled Data with a Selection Bias

no code implementations ICLR 2019 Masahiro Kato, Takeshi Teshima, Junya Honda

However, this assumption is unrealistic in many instances of PU learning because it fails to capture the existence of a selection bias in the labeling process.

Selection bias

Model Specification Test with Unlabeled Data: Approach from Covariate Shift

no code implementations 2 Nov 2019 Masahiro Kato, Hikaru Kawarazaki

By applying the proposed method, we can obtain a model that predicts the label for the unlabeled test data well without losing the interpretability of the model.

regression

Efficient Adaptive Experimental Design for Average Treatment Effect Estimation

no code implementations 13 Feb 2020 Masahiro Kato, Takuya Ishihara, Junya Honda, Yusuke Narita

In adaptive experimental design, the experimenter is allowed to change the probability of assigning a treatment using past observations for estimating the ATE efficiently.

Experimental Design

Confidence Interval for Off-Policy Evaluation from Dependent Samples via Bandit Algorithm: Approach from Standardized Martingales

no code implementations 12 Jun 2020 Masahiro Kato

The goal of OPE is to evaluate a new policy using historical data obtained from behavior policies generated by the bandit algorithm.

Off-policy evaluation

Learning Classifiers under Delayed Feedback with a Time Window Assumption

no code implementations 28 Sep 2020 Masahiro Kato, Shota Yasui

We consider training a binary classifier under delayed feedback (DF learning).

Selection bias

Mean-Variance Efficient Reinforcement Learning by Expected Quadratic Utility Maximization

no code implementations 3 Oct 2020 Masahiro Kato, Kei Nakagawa, Kenshi Abe, Tetsuro Morimura

To achieve this purpose, we train an agent to maximize the expected quadratic utility function, a common objective of risk management in finance and economics.

Decision Making, Decision Making Under Uncertainty, +3

ATRO: Adversarial Training with a Rejection Option

no code implementations 24 Oct 2020 Masahiro Kato, Zhenghang Cui, Yoshihiro Fukuhara

In this paper, in order to acquire a more reliable classifier against adversarial attacks, we propose the method of Adversarial Training with a Rejection Option (ATRO).

Off-Policy Evaluation of Bandit Algorithm from Dependent Samples under Batch Update Policy

no code implementations 23 Oct 2020 Masahiro Kato, Yusuke Kaneko

The goal of off-policy evaluation (OPE) is to evaluate a new policy using historical data obtained via a behavior policy.

Off-policy evaluation

Density-Ratio Based Personalised Ranking from Implicit Feedback

no code implementations 19 Jan 2021 Riku Togashi, Masahiro Kato, Mayu Otani, Shin'ichi Satoh

Learning from implicit user feedback is challenging as we can only observe positive samples but never access negative ones.

Density Ratio Estimation

Adaptive Doubly Robust Estimator from Non-stationary Logging Policy under a Convergence of Average Probability

no code implementations 17 Feb 2021 Masahiro Kato

To mitigate this limitation, we propose another assumption that the average logging policy converges to a time-invariant function and show the doubly robust (DR) estimator's asymptotic normality.

Counterfactual Inference

Scalable Personalised Item Ranking through Parametric Density Estimation

no code implementations 11 May 2021 Riku Togashi, Masahiro Kato, Mayu Otani, Tetsuya Sakai, Shin'ichi Satoh

However, such methods have two main drawbacks, particularly in large-scale applications: (1) the pairwise approach is severely inefficient due to its quadratic computational cost; and (2) even recent model-based samplers (e.g., IRGAN) cannot achieve practical efficiency because they require training an extra model.

Density Estimation, Learning-To-Rank

The Role of Contextual Information in Best Arm Identification

1 code implementation 26 Jun 2021 Masahiro Kato, Kaito Ariu

We demonstrate that contextual information can be used to improve the efficiency of the identification of the best marginalized mean reward compared with the results of Garivier & Kaufmann (2016).

Learning Causal Models from Conditional Moment Restrictions by Importance Weighting

no code implementations 3 Aug 2021 Masahiro Kato, Masaaki Imaizumi, Kenichiro McAlinn, Haruo Kakehi, Shota Yasui

To address this issue, we propose a method that transforms conditional moment restrictions to unconditional moment restrictions through importance weighting, using a conditional density ratio estimator.

Causal Inference

Policy Choice and Best Arm Identification: Asymptotic Analysis of Exploration Sampling

no code implementations 16 Sep 2021 Kaito Ariu, Masahiro Kato, Junpei Komiyama, Kenichiro McAlinn, Chao Qin

We consider the "policy choice" problem -- otherwise known as best arm identification in the bandit literature -- proposed by Kasy and Sautmann (2021) for adaptive experimental design.

Decision Making, Experimental Design

Learning with Protection: Rejection of Suspicious Samples under Adversarial Environment

no code implementations 25 Sep 2019 Masahiro Kato, Yoshihiro Fukuhara, Hirokatsu Kataoka, Shigeo Morishima

Our main idea is to apply a framework of learning with rejection and adversarial examples to assist in the decision making for such suspicious samples.

BIG-bench Machine Learning, Binary Classification, +3

Policy Gradient with Expected Quadratic Utility Maximization: A New Mean-Variance Approach in Reinforcement Learning

no code implementations 28 Sep 2020 Masahiro Kato, Kei Nakagawa

In this paper, we suggest expected quadratic utility maximization (EQUM) as a new framework for policy gradient style reinforcement learning (RL) algorithms with mean-variance control.

Decision Making, Management, +1

Optimal Best Arm Identification in Two-Armed Bandits with a Fixed Budget under a Small Gap

no code implementations 12 Jan 2022 Masahiro Kato, Kaito Ariu, Masaaki Imaizumi, Masahiro Nomura, Chao Qin

We show that a strategy following the Neyman allocation rule (Neyman, 1934) is asymptotically optimal when the gap between the expected rewards is small.

Causal Inference

Unified Perspective on Probability Divergence via Maximum Likelihood Density Ratio Estimation: Bridging KL-Divergence and Integral Probability Metrics

no code implementations 31 Jan 2022 Masahiro Kato, Masaaki Imaizumi, Kentaro Minami

This paper provides a unified perspective for the Kullback-Leibler (KL)-divergence and the integral probability metrics (IPMs) from the perspective of maximum likelihood density-ratio estimation (DRE).

Density Ratio Estimation

Benign-Overfitting in Conditional Average Treatment Effect Prediction with Linear Regression

no code implementations 10 Feb 2022 Masahiro Kato, Masaaki Imaizumi

We study the benign overfitting theory in the prediction of the conditional average treatment effect (CATE), with linear regression models.

Causal Inference, regression

Bayesian Spatial Predictive Synthesis

no code implementations 10 Mar 2022 Danielle Cabel, Shonosuke Sugasawa, Masahiro Kato, Kosaku Takanashi, Kenichiro McAlinn

Spatial data are characterized by their spatial dependence, which is often complex, non-linear, and difficult to capture with a single model.

Model Selection, Uncertainty Quantification, +1

Best Arm Identification with Contextual Information under a Small Gap

no code implementations 15 Sep 2022 Masahiro Kato, Masaaki Imaizumi, Takuya Ishihara, Toru Kitagawa

We then develop the "Random Sampling (RS)-Augmented Inverse Probability Weighting (AIPW) strategy," which is asymptotically optimal in the sense that the probability of misidentification under the strategy matches the lower bound when the budget goes to infinity in the small-gap regime.

Asymptotically Optimal Fixed-Budget Best Arm Identification with Variance-Dependent Bounds

no code implementations 6 Feb 2023 Masahiro Kato, Masaaki Imaizumi, Takuya Ishihara, Toru Kitagawa

We evaluate the decision based on the expected simple regret, which is the difference between the expected outcomes of the best arm and the recommended arm.

Automatic Debiased Learning from Positive, Unlabeled, and Exposure Data

no code implementations 8 Mar 2023 Masahiro Kato, Shuting Wu, Kodai Kureishi, Shota Yasui

Therefore, the positive labels that we observe are a combination of both the exposure and the labeling, which creates a selection bias problem for the observed positive samples.

Binary Classification, Recommendation Systems, +1

Synthetic Control Methods by Density Matching under Implicit Endogeneity

no code implementations 20 Jul 2023 Masahiro Kato, Akari Ohda, Masaaki Imaizumi, Kenichiro McAlinn

In this paper, we first point out that existing SCMs suffer from an implicit endogeneity problem, which is the correlation between the outcomes of untreated units and the error term in the model of a counterfactual outcome.

Causal Inference, counterfactual

CATE Lasso: Conditional Average Treatment Effect Estimation with High-Dimensional Linear Regression

no code implementations 25 Oct 2023 Masahiro Kato, Masaaki Imaizumi

This study assumes two linear regression models between a potential outcome and covariates of the two treatments and defines CATEs as a difference between the linear regression models.

Causal Inference, regression

Robust Covariate Shift Adaptation for Density-Ratio Estimation

no code implementations 25 Oct 2023 Masahiro Kato

For this problem, existing studies have proposed covariate shift adaptation via importance weighting using the density ratio.

Density Ratio Estimation, regression

Worst-Case Optimal Multi-Armed Gaussian Best Arm Identification with a Fixed Budget

no code implementations 30 Oct 2023 Masahiro Kato

Because available information is limited in actual experiments, we develop a lower bound that is valid under unknown means and an unknown choice of the best arm, which is referred to as the worst-case lower bound.

Decision Making, Experimental Design

Locally Optimal Fixed-Budget Best Arm Identification in Two-Armed Gaussian Bandits with Unknown Variances

no code implementations 20 Dec 2023 Masahiro Kato

They also propose a strategy, assuming that the variances of rewards are known, and show that it is asymptotically optimal in the sense that its probability of misidentification matches the lower bound as the budget approaches infinity.

Best-of-Both-Worlds Linear Contextual Bandits

no code implementations 27 Dec 2023 Masahiro Kato, Shinji Ito

The goal of this study is to develop a strategy that is effective in both stochastic and adversarial environments, with theoretical guarantees.

Multi-Armed Bandits

Adaptive Experimental Design for Policy Learning

no code implementations 8 Jan 2024 Masahiro Kato, Kyohei Okumura, Takuya Ishihara, Toru Kitagawa

Setting the worst-case expected regret as the performance criterion of adaptive sampling and recommended policies, we derive its asymptotic lower bounds, and propose a strategy, Adaptive Sampling-Policy Learning strategy (PLAS), whose leading factor of the regret upper bound aligns with the lower bound as the size of experimental units increases.

counterfactual, Experimental Design

LC-Tsallis-INF: Generalized Best-of-Both-Worlds Linear Contextual Bandits

no code implementations 5 Mar 2024 Masahiro Kato, Shinji Ito

For this issue, this study proposes an algorithm whose regret satisfies $O(\log(T))$ in the setting when the suboptimality gap is lower-bounded.

Multi-Armed Bandits

Active Adaptive Experimental Design for Treatment Effect Estimation with Covariate Choices

no code implementations 6 Mar 2024 Masahiro Kato, Akihiro Oga, Wataru Komatsubara, Ryo Inokuchi

To design an adaptive experiment, we first derive the efficient covariate density and propensity score that minimizes the semiparametric efficiency bound, a lower bound for the asymptotic variance given a fixed covariate density and a fixed propensity score.

Experimental Design

Triple/Debiased Lasso for Statistical Inference of Conditional Average Treatment Effects

no code implementations 5 Mar 2024 Masahiro Kato

In high-dimensional linear regression, one typical approach is to assume sparsity.

regression
