Search Results for author: Yuta Saito

Found 25 papers, 17 papers with code

Scalable and Provably Fair Exposure Control for Large-Scale Recommender Systems

1 code implementation • 22 Feb 2024 • Riku Togashi, Kenshi Abe, Yuta Saito

Typical recommendation and ranking methods aim to optimize the satisfaction of users, but they are often oblivious to their impact on the items (e.g., products, jobs, news, video) and their providers.

Collaborative Filtering · Exposure Fairness +1

POTEC: Off-Policy Learning for Large Action Spaces via Two-Stage Policy Decomposition

no code implementations • 9 Feb 2024 • Yuta Saito, Jihan Yao, Thorsten Joachims

We also show that POTEC provides a strict generalization of policy- and regression-based approaches and their associated assumptions.

regression
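The two-stage decomposition above can be pictured concretely: a first-stage policy picks a cluster of actions, and a second-stage regression model picks the best action within that cluster. Below is a minimal, hypothetical sketch of this selection rule (all function and variable names are illustrative, not the paper's API):

```python
import numpy as np

def potec_style_action(x, cluster_policy, f_hat, clusters):
    """Two-stage action selection in the spirit of POTEC.

    cluster_policy(x) -> index of the chosen cluster (1st stage, policy-based)
    f_hat(x, a)       -> estimated relative reward of action a (2nd stage)
    clusters          -> list of arrays, clusters[c] = actions in cluster c
    """
    c = cluster_policy(x)                      # pick a cluster by the learned policy
    candidates = clusters[c]                   # restrict to actions in that cluster
    scores = np.array([f_hat(x, a) for a in candidates])
    return candidates[int(np.argmax(scores))]  # pick the best action by regression
```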

Off-Policy Evaluation of Slate Bandit Policies via Optimizing Abstraction

1 code implementation • 3 Feb 2024 • Haruka Kiyohara, Masahiro Nomura, Yuta Saito

The PseudoInverse (PI) estimator has been introduced to mitigate the variance issue by assuming linearity in the reward function, but this can result in significant bias, as the linearity assumption is hard to verify from observed data and is often substantially violated.

Marketing · Multi-Armed Bandits +2
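For reference, the PI estimator discussed above weights each logged slate reward by the sum of per-slot importance ratios minus (K − 1), which is unbiased under the linearity assumption. A minimal sketch, with all names my own:

```python
import numpy as np

def pseudo_inverse_estimate(rewards, pi_e_slot, pi_0_slot):
    """PseudoInverse (PI) estimator for slate OPE, which assumes the
    slate-level reward is linear in per-slot effects.

    rewards   : (n,)   observed slate-level rewards
    pi_e_slot : (n, K) evaluation policy's marginal probability of the
                       logged action at each slot
    pi_0_slot : (n, K) logging policy's marginal probability of the same
    """
    n, K = pi_e_slot.shape
    slot_ratios = pi_e_slot / pi_0_slot          # per-slot importance ratios
    pi_weight = slot_ratios.sum(axis=1) - K + 1  # PI weight: sum of ratios - (K - 1)
    return float(np.mean(rewards * pi_weight))
```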

SCOPE-RL: A Python Library for Offline Reinforcement Learning and Off-Policy Evaluation

1 code implementation • 30 Nov 2023 • Haruka Kiyohara, Ren Kishimoto, Kosuke Kawakami, Ken Kobayashi, Kazuhide Nakata, Yuta Saito

This paper introduces SCOPE-RL, a comprehensive open-source Python software designed for offline reinforcement learning (offline RL), off-policy evaluation (OPE), and selection (OPS).

Offline RL · Off-policy evaluation

Towards Assessing and Benchmarking Risk-Return Tradeoff of Off-Policy Evaluation

1 code implementation • 30 Nov 2023 • Haruka Kiyohara, Ren Kishimoto, Kosuke Kawakami, Ken Kobayashi, Kazuhide Nakata, Yuta Saito

Existing evaluation metrics for OPE estimators primarily focus on the "accuracy" of OPE or that of downstream policy selection, neglecting the risk-return tradeoff in the subsequent online policy deployment.

Benchmarking · counterfactual +1

Off-Policy Evaluation of Ranking Policies under Diverse User Behavior

1 code implementation • 26 Jun 2023 • Haruka Kiyohara, Masatoshi Uehara, Yusuke Narita, Nobuyuki Shimizu, Yasuo Yamamoto, Yuta Saito

We show that the resulting estimator, which we call Adaptive IPS (AIPS), can be unbiased under any complex user behavior.

Off-policy evaluation
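AIPS adapts the importance weight to an estimated user behavior model. As a rough illustration, these are the three stylized weight families it interpolates between, for a reward observed at a given position of the ranking (a simplified sketch; names are illustrative):

```python
import numpy as np

def ranking_ips_weight(pi_e, pi_0, position, behavior):
    """Importance weight for the reward observed at `position` of a ranking,
    under three stylized user-behavior models (a simplified view of the
    weight families AIPS chooses among adaptively).

    pi_e, pi_0 : (K,) per-slot probabilities of the logged ranking under
                 the evaluation and logging policies
    """
    ratios = pi_e / pi_0
    if behavior == "independent":   # user examines each slot independently (IIPS-style)
        return ratios[position]
    elif behavior == "cascade":     # user scans top-down to this slot (RIPS-style)
        return np.prod(ratios[: position + 1])
    else:                           # arbitrary interactions: full-slate weight (SIPS-style)
        return np.prod(ratios)
```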

Off-Policy Evaluation for Large Action Spaces via Conjunct Effect Modeling

no code implementations • 14 May 2023 • Yuta Saito, Qingyang Ren, Thorsten Joachims

To circumvent this variance issue, we propose a new estimator, called OffCEM, that is based on the conjunct effect model (CEM), a novel decomposition of the causal effect into a cluster effect and a residual effect.

Off-policy evaluation
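The conjunct effect model suggests a concrete estimator form: cluster-level importance weighting corrects the cluster effect, while a regression model handles the residual effect. A hedged sketch (inputs and names are illustrative, assumed precomputed):

```python
import numpy as np

def offcem_style_estimate(r, f_hat_logged, f_hat_pi_e, cluster_weight):
    """Off-policy estimate in the spirit of OffCEM's decomposition.

    r              : (n,) observed rewards
    f_hat_logged   : (n,) regression estimate f_hat(x_i, a_i) for logged actions
    f_hat_pi_e     : (n,) E_{a ~ pi_e(.|x_i)}[f_hat(x_i, a)]
    cluster_weight : (n,) pi_e(c(a_i)|x_i) / pi_0(c(a_i)|x_i), an importance
                     weight on the *cluster* of the logged action
    """
    # cluster-level IPS on the residual (r - f_hat) corrects the cluster effect;
    # the model-based term f_hat_pi_e accounts for the rest
    residual_term = cluster_weight * (r - f_hat_logged)
    return float(np.mean(residual_term + f_hat_pi_e))
```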

Policy-Adaptive Estimator Selection for Off-Policy Evaluation

1 code implementation • 25 Nov 2022 • Takuma Udagawa, Haruka Kiyohara, Yusuke Narita, Yuta Saito, Kei Tateno

Although many estimators have been developed, no single estimator dominates the others, because an estimator's accuracy can vary greatly with the given OPE task, e.g., the evaluation policy, the number of actions, and the noise level.

counterfactual · Off-policy evaluation

Fair Ranking as Fair Division: Impact-Based Individual Fairness in Ranking

1 code implementation • 15 Jun 2022 • Yuta Saito, Thorsten Joachims

Our axioms of envy-freeness and dominance over uniform ranking postulate that, under a fair ranking policy, every item should prefer its own rank allocation over that of any other item, and that no item should be actively disadvantaged by the rankings.

Fairness
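The envy-freeness axiom has a simple operational reading: given a matrix of impacts, where entry (i, j) is the impact item i would receive under item j's rank allocation, no item should envy another item's allocation. A toy check (the matrix representation is my simplification of the paper's exposure-based impact):

```python
import numpy as np

def is_envy_free(impact):
    """Check the envy-freeness axiom on an impact matrix, where
    impact[i, j] = impact item i would get under item j's rank allocation.

    Envy-freeness: every item weakly prefers its own allocation.
    """
    own = np.diag(impact)                               # impact under own allocation
    return bool(np.all(own[:, None] >= impact - 1e-12)) # tolerance for float error
```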

A Real-World Implementation of Unbiased Lift-based Bidding System

no code implementations • 23 Feb 2022 • Daisuke Moriwaki, Yuta Hayakawa, Akira Matsui, Yuta Saito, Isshu Munemasa, Masashi Shibata

Second, the practical usefulness of lift-based bidding is not widely understood in the online advertising industry due to the lack of a comprehensive investigation of its impact. Here we propose a practically implementable lift-based bidding system that perfectly fits the current billing rules.

Off-Policy Evaluation for Large Action Spaces via Embeddings

3 code implementations • 13 Feb 2022 • Yuta Saito, Thorsten Joachims

Unfortunately, when the number of actions is large, existing OPE estimators -- most of which are based on inverse propensity score weighting -- degrade severely and can suffer from extreme bias and variance.

Multi-Armed Bandits · Off-policy evaluation +1
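The paper's remedy is to define importance weights on action embeddings rather than on atomic actions, marginalizing many actions into a much smaller effective space. A minimal sketch of such a marginalized-IPS-style estimate (marginal probabilities assumed precomputed; names are mine):

```python
import numpy as np

def mips_style_estimate(rewards, p_e_emb, p_0_emb):
    """Marginalized-IPS-style estimate: importance weights are defined on
    the *embedding* of the logged action, i.e., p(e|x, pi) marginalized
    over actions, rather than on the action itself.

    rewards : (n,) observed rewards
    p_e_emb : (n,) marginal prob. of the logged action's embedding under pi_e
    p_0_emb : (n,) same under the logging policy
    """
    w = p_e_emb / p_0_emb  # marginal weight over embeddings: far fewer
                           # "effective actions" => much lower variance than IPS
    return float(np.mean(w * rewards))
```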

Data-Driven Off-Policy Estimator Selection: An Application in User Marketing on An Online Content Delivery Service

no code implementations • 17 Sep 2021 • Yuta Saito, Takuma Udagawa, Kei Tateno

As proof of concept, we use our procedure to select the best estimator to evaluate coupon treatment policies on a real-world online content delivery service.

Decision Making · Marketing +2

Accelerating Offline Reinforcement Learning Application in Real-Time Bidding and Recommendation: Potential Use of Simulation

no code implementations • 17 Sep 2021 • Haruka Kiyohara, Kosuke Kawakami, Yuta Saito

In this position paper, we explore the potential of using simulation to accelerate practical research of offline RL and OPE, particularly in RecSys and RTB.

Decision Making · Offline RL +4

Evaluating the Robustness of Off-Policy Evaluation

2 code implementations • 31 Aug 2021 • Yuta Saito, Takuma Udagawa, Haruka Kiyohara, Kazuki Mogi, Yusuke Narita, Kei Tateno

Unfortunately, identifying a reliable estimator from results reported in research papers is often difficult because the current experimental procedure evaluates and compares the estimators' performance on a narrow set of hyperparameters and evaluation policies.

Off-policy evaluation · Recommendation Systems

Optimal Off-Policy Evaluation from Multiple Logging Policies

1 code implementation • 21 Oct 2020 • Nathan Kallus, Yuta Saito, Masatoshi Uehara

We study off-policy evaluation (OPE) from multiple logging policies, each generating a dataset of fixed size, i.e., stratified sampling.

Off-policy evaluation
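A natural baseline in this stratified setting is to combine the per-logger estimates with inverse-variance weights; the paper characterizes the efficiency-optimal combination, of which the following is only a simplistic sketch (names are mine):

```python
import numpy as np

def combine_stratified_estimates(values, variances):
    """Combine per-logger OPE estimates by inverse-variance weighting,
    one simple instance of weighting across multiple logging policies.

    values    : (K,) OPE estimate computed from each logger's stratum
    variances : (K,) estimated variance of each stratum's estimate
    """
    w = 1.0 / np.asarray(variances)  # low-variance strata get more weight
    w = w / w.sum()                  # normalize to a convex combination
    return float(np.dot(w, values))
```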

Multi-Source Unsupervised Hyperparameter Optimization

no code implementations • 28 Sep 2020 • Masahiro Nomura, Yuta Saito

How can we conduct efficient hyperparameter optimization for a completely new task?

Hyperparameter Optimization

Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation

3 code implementations • 17 Aug 2020 • Yuta Saito, Shunsuke Aihara, Megumi Matsutani, Yusuke Narita

Our dataset is unique in that it contains a set of multiple logged bandit datasets collected by running different policies on the same platform.

Off-policy evaluation
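The accompanying Open Bandit Pipeline makes the evaluation loop short. The sketch below is adapted from the library's quick-start (pip install obp); exact signatures may differ across obp versions:

```python
# Adapted from the Open Bandit Pipeline quick-start; signatures may vary by version.
from obp.dataset import OpenBanditDataset
from obp.policy import BernoulliTS
from obp.ope import OffPolicyEvaluation, InverseProbabilityWeighting as IPW

# (1) load logged bandit feedback collected by the Random policy
dataset = OpenBanditDataset(behavior_policy="random", campaign="all")
bandit_feedback = dataset.obtain_batch_bandit_feedback()

# (2) define the evaluation policy (Bernoulli Thompson Sampling)
evaluation_policy = BernoulliTS(
    n_actions=dataset.n_actions,
    len_list=dataset.len_list,
    is_zozotown_prior=True,
    campaign="all",
    random_state=12345,
)
action_dist = evaluation_policy.compute_batch_action_dist(
    n_sim=100000, n_rounds=bandit_feedback["n_rounds"]
)

# (3) estimate the policy value of BernoulliTS via IPW
ope = OffPolicyEvaluation(
    bandit_feedback=bandit_feedback, ope_estimators=[IPW()]
)
print(ope.estimate_policy_values(action_dist=action_dist))
```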

Unbiased Lift-based Bidding System

no code implementations • 8 Jul 2020 • Daisuke Moriwaki, Yuta Hayakawa, Isshu Munemasa, Yuta Saito, Akira Matsui

Rather, the bidding strategy that maximizes revenue is one that pursues the performance lift of showing ads to a specific user.

Efficient Hyperparameter Optimization under Multi-Source Covariate Shift

2 code implementations • 18 Jun 2020 • Masahiro Nomura, Yuta Saito

This assumption is, however, often violated in uncertain real-world applications, which motivates the study of learning under covariate shift.

Bayesian Optimization · Hyperparameter Optimization
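The basic building block for hyperparameter optimization under covariate shift is an importance-weighted estimate of the target-distribution validation loss; the paper's contribution is a variance-reduced version of this kind of estimator. A bare-bones sketch (names are illustrative):

```python
import numpy as np

def iw_validation_score(losses, density_ratio):
    """Importance-weighted estimate of the target-distribution validation
    loss from source-distribution samples, usable as the objective inside
    a hyperparameter-optimization loop under covariate shift.

    losses        : (n,) per-example validation losses on source data
    density_ratio : (n,) p_target(x) / p_source(x) for each example
    """
    return float(np.mean(density_ratio * losses))
```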

Towards Resolving Propensity Contradiction in Offline Recommender Learning

1 code implementation • 16 Oct 2019 • Yuta Saito, Masahiro Nomura

We study offline recommender learning from explicit rating feedback in the presence of selection bias.

Selection bias · Unsupervised Domain Adaptation

Dual Learning Algorithm for Delayed Conversions

no code implementations • 4 Oct 2019 • Yuta Saito, Gota Morishita, Shota Yasui

To overcome these challenges, we propose two unbiased estimators: one for CVR prediction and the other for bias estimation.

Unbiased Recommender Learning from Missing-Not-At-Random Implicit Feedback

2 code implementations • 9 Sep 2019 • Yuta Saito, Suguru Yaginuma, Yuta Nishino, Hayato Sakata, Kazuhide Nakata

Subsequently, we analyze the variance of the proposed unbiased estimator and further propose a clipped estimator that includes the unbiased estimator as a special case.

Causal Inference · Recommendation Systems
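The estimators analyzed above can be written as a propensity-weighted pointwise loss, where clicks are re-weighted by exposure propensities; clipping the propensities trades a little bias for much lower variance. A hedged sketch of this loss family (argument names are mine):

```python
import numpy as np

def unbiased_implicit_loss(y_click, r_hat, theta, clip=0.0, eps=1e-8):
    """Propensity-weighted pointwise loss for implicit (click) feedback,
    where a click = relevance x exposure and theta is the exposure
    propensity. Setting clip > 0 gives the clipped variant, which
    includes the unbiased estimator (clip = 0) as a special case.

    y_click : (n,) binary click indicators
    r_hat   : (n,) predicted relevance probabilities in (0, 1)
    theta   : (n,) exposure (observation) propensities
    """
    w = y_click / np.maximum(theta, clip if clip > 0 else eps)  # propensity weight
    pos = -np.log(r_hat + eps)        # loss incurred when truly relevant
    neg = -np.log(1.0 - r_hat + eps)  # loss incurred when truly irrelevant
    return float(np.mean(w * pos + (1.0 - w) * neg))
```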
