Search Results for author: Ren Kishimoto

Found 3 papers, 2 papers with code

Effective Off-Policy Evaluation and Learning in Contextual Combinatorial Bandits

no code implementations · 20 Aug 2024 · Tatsuhiro Shimizu, Koichi Tanaka, Ren Kishimoto, Haruka Kiyohara, Masahiro Nomura, Yuta Saito

We explore off-policy evaluation and learning (OPE/L) in contextual combinatorial bandits (CCB), where a policy selects a subset of the action space.

Tasks: Off-policy evaluation · Recommendation Systems · +1
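To make the OPE setting concrete, here is a minimal sketch of the standard inverse propensity scoring (IPS) estimator on synthetic logged bandit data. This illustrates off-policy evaluation in general, not the CCB-specific estimator proposed in the paper; all data and policy choices below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_actions = 10_000, 5

# Synthetic log: a uniform logging policy pi0 chooses actions, rewards observed.
pi0 = np.full(n_actions, 1.0 / n_actions)
actions = rng.integers(0, n_actions, size=n)
rewards = rng.binomial(1, 0.2 + 0.1 * actions)  # higher-index actions pay more

# Target (evaluation) policy pi_e: prefers higher-index actions.
pi_e = np.arange(1, n_actions + 1, dtype=float)
pi_e /= pi_e.sum()

# IPS estimate of pi_e's value: reweight logged rewards by pi_e / pi0.
weights = pi_e[actions] / pi0[actions]
v_hat = weights.mean() * 0 + np.mean(weights * rewards)
print(v_hat)
```

The reweighting corrects for the mismatch between the logging and target action distributions; in combinatorial settings the challenge is that these importance weights blow up with the size of the subset space, which is what CCB-specific estimators address.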

Towards Assessing and Benchmarking Risk-Return Tradeoff of Off-Policy Evaluation

1 code implementation · 30 Nov 2023 · Haruka Kiyohara, Ren Kishimoto, Kosuke Kawakami, Ken Kobayashi, Kazuhide Nakata, Yuta Saito

Existing evaluation metrics for OPE estimators primarily focus on the "accuracy" of OPE or of downstream policy selection, neglecting the risk-return tradeoff in the subsequent online policy deployment.

Tasks: Benchmarking · Counterfactual · +1
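One way such a risk-return tradeoff can be quantified is a Sharpe-ratio-style score over the top-k policies an OPE estimator would select: reward is the best true value among them, risk is their spread. This is a simplified sketch in that spirit, not the paper's exact metric; the function name and numbers are illustrative.

```python
import numpy as np

def sharpe_like_at_k(true_values, estimated_values, v_baseline, k):
    """Risk-return score for the top-k policies ranked by an OPE estimator.

    Reward: best true value among the top-k (relative to a baseline policy).
    Risk: standard deviation of true values among the top-k.
    """
    top_k = np.argsort(estimated_values)[::-1][:k]
    selected = true_values[top_k]
    return (selected.max() - v_baseline) / selected.std()

true_v = np.array([0.50, 0.45, 0.30, 0.20, 0.10])   # ground-truth policy values
est_v = np.array([0.48, 0.52, 0.25, 0.40, 0.05])    # noisy OPE estimates
score = sharpe_like_at_k(true_v, est_v, v_baseline=0.25, k=3)
print(score)
```

An estimator that ranks a low-value policy into the top-k inflates the denominator, so an accurate-on-average estimator can still score poorly if its selections are risky to deploy.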

SCOPE-RL: A Python Library for Offline Reinforcement Learning and Off-Policy Evaluation

1 code implementation · 30 Nov 2023 · Haruka Kiyohara, Ren Kishimoto, Kosuke Kawakami, Ken Kobayashi, Kazuhide Nakata, Yuta Saito

This paper introduces SCOPE-RL, a comprehensive open-source Python software designed for offline reinforcement learning (offline RL), off-policy evaluation (OPE), and selection (OPS).

Tasks: Offline RL · Off-policy evaluation
