Search Results for author: Yangyi Lu

Found 7 papers, 0 papers with code

Offline Policy Evaluation and Optimization under Confounding

no code implementations • 29 Nov 2022 • Chinmaya Kausik, Yangyi Lu, Kevin Tan, Maggie Makar, Yixin Wang, Ambuj Tewari

Evaluating and optimizing policies in the presence of unobserved confounders is a problem of growing interest in offline reinforcement learning.

Offline RL, Off-policy evaluation

Bandit Algorithms for Precision Medicine

no code implementations • 10 Aug 2021 • Yangyi Lu, Ziping Xu, Ambuj Tewari

However, the modern precision medicine movement has been enabled by a confluence of events: scientific advances in fields such as genetics and pharmacology, technological advances in mobile devices and wearable sensors, and methodological advances in computing and data sciences.

Causal Bandits with Unknown Graph Structure

no code implementations • NeurIPS 2021 • Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

In causal bandit problems, the action set consists of interventions on variables of a causal graph.

Causal Markov Decision Processes: Learning Good Interventions Efficiently

no code implementations • 15 Feb 2021 • Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

We introduce causal Markov Decision Processes (C-MDPs), a new formalism for sequential decision making which combines the standard MDP formulation with causal structures over state transition and reward functions.

Decision Making, Marketing

Low-Rank Generalized Linear Bandit Problems

no code implementations • 4 Jun 2020 • Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

To get around the computational intractability of covering-based approaches, we propose an efficient algorithm by extending the "Explore-Subspace-Then-Refine" algorithm of Jun et al. (2019).

Regret Analysis of Bandit Problems with Causal Background Knowledge

no code implementations • 11 Oct 2019 • Yangyi Lu, Amirhossein Meisami, Ambuj Tewari, Zhenyu Yan

For example, we observe that even with a few hundred iterations, the regret of causal algorithms is less than that of standard algorithms by a factor of three.

Thompson Sampling

An Actor-Critic Contextual Bandit Algorithm for Personalized Mobile Health Interventions

no code implementations • 28 Jun 2017 • Huitian Lei, Yangyi Lu, Ambuj Tewari, Susan A. Murphy

Increasing technological sophistication and widespread use of smartphones and wearable devices provide opportunities for innovative and highly personalized health interventions.
