Search Results for author: Mengfan Xu

Found 6 papers, 0 papers with code

Decentralized Blockchain-based Robust Multi-agent Multi-armed Bandit

no code implementations · 6 Feb 2024 · Mengfan Xu, Diego Klabjan

We study a robust multi-agent multi-armed bandit problem where multiple clients or participants are distributed on a fully decentralized blockchain, with the possibility of some being malicious.

Federated Learning

Regret Lower Bounds in Multi-agent Multi-armed Bandit

no code implementations · 15 Aug 2023 · Mengfan Xu, Diego Klabjan

Multi-armed bandit problems motivate methods with provable upper bounds on regret, and the counterpart lower bounds have also been studied extensively in this context.

Pareto Regret Analyses in Multi-objective Multi-armed Bandit

no code implementations · 1 Dec 2022 · Mengfan Xu, Diego Klabjan

We study Pareto optimality in multi-objective multi-armed bandits by formulating an adversarial multi-objective multi-armed bandit problem and defining Pareto regrets that apply to both stochastic and adversarial settings.
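For context, a hedged sketch of the standard stochastic Pareto-regret notion from the multi-objective bandit literature (the paper extends this to the adversarial setting; the symbols below are the usual ones, not necessarily the paper's notation): an arm's suboptimality gap is the smallest uniform boost that makes its mean vector non-dominated.

```latex
% Pareto suboptimality gap of arm a with mean vector \mu_a \in \mathbb{R}^d:
% the smallest \epsilon \ge 0 such that \mu_a + \epsilon \mathbf{1} is not
% dominated by any other arm's mean vector.
\Delta_a^{*} = \inf \{\, \epsilon \ge 0 : \nexists\, a' \text{ with } \mu_{a'} \succ \mu_a + \epsilon \mathbf{1} \,\}

% Pareto regret over a horizon of T rounds, where a_t is the arm pulled at round t:
R(T) = \sum_{t=1}^{T} \Delta_{a_t}^{*}
```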

Adversarial Attack

GCF: Generalized Causal Forest for Heterogeneous Treatment Effect Estimation in Online Marketplace

no code implementations · 21 Mar 2022 · Shu Wan, Chen Zheng, Zhonggen Sun, Mengfan Xu, Xiaoqing Yang, Hongtu Zhu, Jiecheng Guo

We show the effectiveness of GCF by deriving the asymptotic property of the estimator and comparing it to popular uplift modeling methods on both synthetic and real-world datasets.

Causal Inference Decision Making

GCF: Generalized Causal Forest for Heterogeneous Treatment Effect Estimation Using Nonparametric Methods

no code implementations · 29 Sep 2021 · Shu Wan, Chen Zheng, Zhonggen Sun, Mengfan Xu, Xiaoqing Yang, Jiecheng Guo, Hongtu Zhu

Heterogeneous treatment effect (HTE) estimation with continuous treatment is essential in multiple disciplines, such as the online marketplace and pharmaceutical industry.

Regret Bounds and Reinforcement Learning Exploration of EXP-based Algorithms

no code implementations · 20 Sep 2020 · Mengfan Xu, Diego Klabjan

We propose a new algorithm, EXP4.P, by modifying EXP4, and establish upper bounds on its regret in both bounded and unbounded sub-Gaussian contextual bandit settings.
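For context, the base algorithm EXP4 that the paper modifies mixes expert advice with exponential weights. Below is a minimal sketch of standard EXP4 only (not the paper's EXP4.P variant, whose modified update is not reproduced here); the inputs `advice` and `rewards` are hypothetical names for illustration.

```python
import math
import random

def exp4(advice, rewards, T, gamma=0.1):
    """Minimal standard EXP4 sketch (exponential weights over expert advice).

    advice[t][e] is expert e's probability distribution over K arms at round t;
    rewards[t][k] is the reward of arm k at round t, assumed in [0, 1];
    only the pulled arm's reward is observed (bandit feedback).
    """
    N = len(advice[0])   # number of experts
    K = len(rewards[0])  # number of arms
    w = [1.0] * N        # exponential weights over experts
    total_reward = 0.0
    for t in range(T):
        W = sum(w)
        # Mix expert advice into an arm distribution, plus uniform exploration.
        p = [(1 - gamma) * sum(w[e] * advice[t][e][k] for e in range(N)) / W
             + gamma / K
             for k in range(K)]
        arm = random.choices(range(K), weights=p)[0]
        total_reward += rewards[t][arm]
        # Importance-weighted reward estimate: nonzero only for the pulled arm.
        xhat = [rewards[t][k] / p[k] if k == arm else 0.0 for k in range(K)]
        # Credit each expert with the estimated reward of its own advice.
        for e in range(N):
            yhat = sum(advice[t][e][k] * xhat[k] for k in range(K))
            w[e] *= math.exp(gamma * yhat / K)
    return total_reward
```

With two arms, an expert that always recommends the better arm quickly dominates the weight vector, so play concentrates on that arm.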

Reinforcement Learning (RL)
