Search Results for author: Yangyi Lu

Found 7 papers, 0 papers with code

Offline Policy Evaluation and Optimization under Confounding

no code implementations • 29 Nov 2022 • Chinmaya Kausik, Yangyi Lu, Kevin Tan, Maggie Makar, Yixin Wang, Ambuj Tewari

Evaluating and optimizing policies in the presence of unobserved confounders is a problem of growing interest in offline reinforcement learning.

Offline RL, Off-policy evaluation

Bandit Algorithms for Precision Medicine

no code implementations • 10 Aug 2021 • Yangyi Lu, Ziping Xu, Ambuj Tewari

The modern precision medicine movement has been enabled by a confluence of events: scientific advances in fields such as genetics and pharmacology, technological advances in mobile devices and wearable sensors, and methodological advances in computing and data sciences.

Causal Bandits with Unknown Graph Structure

no code implementations • NeurIPS 2021 • Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

In causal bandit problems, the action set consists of interventions on variables of a causal graph.

Causal Markov Decision Processes: Learning Good Interventions Efficiently

no code implementations • 15 Feb 2021 • Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

We introduce causal Markov Decision Processes (C-MDPs), a new formalism for sequential decision making which combines the standard MDP formulation with causal structures over state transition and reward functions.

Decision Making, Marketing

Low-Rank Generalized Linear Bandit Problems

no code implementations • 4 Jun 2020 • Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

To get around the computational intractability of covering-based approaches, we propose an efficient algorithm by extending the "Explore-Subspace-Then-Refine" algorithm of Jun et al. (2019).

Regret Analysis of Bandit Problems with Causal Background Knowledge

no code implementations • 11 Oct 2019 • Yangyi Lu, Amirhossein Meisami, Ambuj Tewari, Zhenyu Yan

For example, we observe that even with a few hundred iterations, the regret of causal algorithms is less than that of standard algorithms by a factor of three.

Thompson Sampling

An Actor-Critic Contextual Bandit Algorithm for Personalized Mobile Health Interventions

no code implementations • 28 Jun 2017 • Huitian Lei, Yangyi Lu, Ambuj Tewari, Susan A. Murphy

Increasing technological sophistication and widespread use of smartphones and wearable devices provide opportunities for innovative and highly personalized health interventions.
