no code implementations • 20 Aug 2024 • Tatsuhiro Shimizu, Koichi Tanaka, Ren Kishimoto, Haruka Kiyohara, Masahiro Nomura, Yuta Saito
We explore off-policy evaluation and learning (OPE/L) in contextual combinatorial bandits (CCB), where a policy selects a subset of the action space.
2 code implementations • 7 Aug 2023 • Tatsuhiro Shimizu
We study how to extend diffusion models to answer causal questions from observational data in the presence of unmeasured confounders.
1 code implementation • 7 Aug 2023 • Tatsuhiro Shimizu, Laura Forastiere
To overcome these limitations, Marginalized Inverse Propensity Scoring (MIPS) was proposed to mitigate the estimator's variance via embeddings of an action.
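As a rough illustration of the idea in the last snippet, the sketch below contrasts a plain inverse propensity scoring (IPS) estimate with a MIPS-style estimate whose importance weights are computed on embedding marginals rather than on individual actions. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the list-based data layout, and the assumption that the embedding distribution `p_e_given_a` depends only on the action (not on the context) are all hypothetical choices made for the example.

```python
def ips_estimate(pi_e, pi_0, actions, rewards):
    """Plain IPS: weight each reward by pi_e(a|x) / pi_0(a|x).

    pi_e, pi_0: one list of action probabilities per logged sample.
    actions:    index of the logged action per sample.
    rewards:    observed reward per sample.
    """
    total = 0.0
    for p_e, p_0, a, r in zip(pi_e, pi_0, actions, rewards):
        total += p_e[a] / p_0[a] * r
    return total / len(rewards)


def marginal_embedding_prob(policy_probs, p_e_given_a, e):
    """p(e|x, pi) = sum_a p(e|a) * pi(a|x), for a discrete embedding e.

    Assumption for this sketch: p(e|a) does not depend on the context x.
    """
    return sum(p_a * p_e_given_a[a][e] for a, p_a in enumerate(policy_probs))


def mips_estimate(pi_e, pi_0, p_e_given_a, embeddings, rewards):
    """MIPS-style estimate: weight each reward by p(e|x, pi_e) / p(e|x, pi_0).

    embeddings: index of the observed embedding category per sample.
    """
    total = 0.0
    for p_e, p_0, e, r in zip(pi_e, pi_0, embeddings, rewards):
        num = marginal_embedding_prob(p_e, p_e_given_a, e)
        den = marginal_embedding_prob(p_0, p_e_given_a, e)
        total += num / den * r
    return total / len(rewards)
```

For instance, with four actions that map deterministically onto two embedding categories, a uniform logging policy, and a target policy that concentrates on one action, the embedding-level weight for a logged sample is smaller than the action-level weight, because probability mass is pooled over all actions sharing that embedding. This pooling is the mechanism by which the weights, and hence the variance, can shrink.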