no code implementations • 6 Feb 2024 • Mengfan Xu, Diego Klabjan
We study a robust multi-agent multi-armed bandit problem where multiple clients or participants are distributed on a fully decentralized blockchain, with the possibility of some being malicious.
no code implementations • 15 Aug 2023 • Mengfan Xu, Diego Klabjan
The multi-armed bandit problem motivates methods with provable upper bounds on regret, and the corresponding lower bounds have also been extensively studied in this context.
no code implementations • 1 Dec 2022 • Mengfan Xu, Diego Klabjan
We study Pareto optimality in multi-objective multi-armed bandit by providing a formulation of adversarial multi-objective multi-armed bandit and defining its Pareto regrets that can be applied to both stochastic and adversarial settings.
no code implementations • 21 Mar 2022 • Shu Wan, Chen Zheng, Zhonggen Sun, Mengfan Xu, Xiaoqing Yang, Hongtu Zhu, Jiecheng Guo
We show the effectiveness of GCF by deriving the asymptotic property of the estimator and comparing it to popular uplift modeling methods on both synthetic and real-world datasets.
no code implementations • 29 Sep 2021 • Shu Wan, Chen Zheng, Zhonggen Sun, Mengfan Xu, Xiaoqing Yang, Jiecheng Guo, Hongtu Zhu
Heterogeneous treatment effect (HTE) estimation with continuous treatment is essential in multiple disciplines, such as online marketplaces and the pharmaceutical industry.
no code implementations • 20 Sep 2020 • Mengfan Xu, Diego Klabjan
We propose a new algorithm, EXP4.P, by modifying EXP4, and establish its upper bound on regret in both bounded and unbounded sub-Gaussian contextual bandit settings.