Search Results for author: Nathan Kallus

Found 68 papers, 26 papers with code

Double Reinforcement Learning for Efficient and Robust Off-Policy Evaluation

no code implementations ICML 2020 Nathan Kallus, Masatoshi Uehara

Off-policy evaluation (OPE) in reinforcement learning allows one to evaluate novel decision policies without needing to conduct exploration, which is often costly or otherwise infeasible.

reinforcement-learning

Robust and Agnostic Learning of Conditional Distributional Treatment Effects

1 code implementation 23 May 2022 Nathan Kallus, Miruna Oprescu

The conditional average treatment effect (CATE) is the best point prediction of individual causal effects given individual baseline covariates and can help personalize treatments.
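
For reference, the CATE at covariate value $x$ is the expected difference in potential outcomes, $\tau(x) = \mathbb{E}[Y(1) - Y(0) \mid X = x]$, in the standard potential-outcome notation assumed here rather than quoted from the abstract.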

What's the Harm? Sharp Bounds on the Fraction Negatively Affected by Treatment

no code implementations 20 May 2022 Nathan Kallus

If, in an A/B test, half of users click (or buy, watch, renew, etc.) whether exposed to the standard experience A or a new one B, hypothetically it could be because the change affects no one, because the change positively affects half the user population to go from no-click to click while negatively affecting the other half, or something in between.

Causal Inference Fairness

Estimating Structural Disparities for Face Models

no code implementations 13 Apr 2022 Shervin Ardeshir, Cristina Segalin, Nathan Kallus

Performance of the model for each group is calculated by comparing $\hat{y}$ and $y$ for the data points within a specific group, and disparity of performance across the different groups can then be computed.
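
A minimal sketch of this per-group evaluation, assuming hypothetical arrays y, y_hat, and group labels and an accuracy-style metric (names and metric are illustrative, not taken from the paper):

    import numpy as np

    def per_group_performance(y, y_hat, groups, metric=lambda a, b: np.mean(a == b)):
        # Evaluate the metric separately on the data points belonging to each group.
        return {g: metric(y[groups == g], y_hat[groups == g]) for g in np.unique(groups)}

    def performance_disparity(perf_by_group):
        # One simple disparity summary: the gap between the best and worst group.
        vals = list(perf_by_group.values())
        return max(vals) - min(vals)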

Face Recognition

Doubly Robust Distributionally Robust Off-Policy Evaluation and Learning

no code implementations 19 Feb 2022 Nathan Kallus, Xiaojie Mao, Kaiwen Wang, Zhengyuan Zhou

Notably, thanks to a localization technique, LDR$^2$OPE only requires fitting a small number of regressions, just like DR methods for vanilla OPE.

Long-term Causal Inference Under Persistent Confounding via Data Combination

no code implementations 15 Feb 2022 Guido Imbens, Nathan Kallus, Xiaojie Mao, Yuhao Wang

In this paper, we uniquely tackle the challenge of persistent unmeasured confounders, i.e., some unmeasured confounders that can simultaneously affect the treatment, short-term outcomes and the long-term outcome, noting that they invalidate identification strategies in previous literature.

Causal Inference

Treatment Effect Risk: Bounds and Inference

no code implementations 15 Jan 2022 Nathan Kallus

This is challenging even in randomized experiments as it requires understanding the distribution of the unknown CATE function, which can be very complex if we use rich covariates so as to best control for heterogeneity.

Doubly-Valid/Doubly-Sharp Sensitivity Analysis for Causal Inference with Unmeasured Confounding

no code implementations 21 Dec 2021 Jacob Dorn, Kevin Guo, Nathan Kallus

We study the problem of constructing bounds on the average treatment effect in the presence of unobserved confounding under the marginal sensitivity model of Tan (2006).

Causal Inference

Proximal Reinforcement Learning: Efficient Off-Policy Evaluation in Partially Observed Markov Decision Processes

no code implementations 28 Oct 2021 Andrew Bennett, Nathan Kallus

To answer these, we extend the framework of proximal causal inference to our POMDP setting, providing a variety of settings where identification is made possible by the existence of so-called bridge functions.

Causal Inference reinforcement-learning

Stateful Offline Contextual Policy Evaluation and Learning

no code implementations 19 Oct 2021 Nathan Kallus, Angela Zhou

We study off-policy evaluation and learning from sequential data in a structured class of Markov decision processes that arise from repeated interactions with an exogenous sequence of arrivals with contexts, which generate unknown individual-level responses to agent actions.

Multi-Armed Bandits

Residual Overfit Method of Exploration

no code implementations 6 Oct 2021 James McInerney, Nathan Kallus

The approach, which we term the residual overfit method of exploration (ROME), drives exploration towards actions where the overfit model exhibits the most overfitting compared to the tuned model.

Control Variates for Slate Off-Policy Evaluation

1 code implementation NeurIPS 2021 Nikos Vlassis, Ashok Chandrashekar, Fernando Amat Gil, Nathan Kallus

We study the problem of off-policy evaluation from batched contextual bandit data with multidimensional actions, often termed slates.

Recommendation Systems

Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning

no code implementations NeurIPS 2021 Aurélien Bibaut, Antoine Chambaz, Maria Dimakopoulou, Nathan Kallus, Mark van der Laan

Empirical risk minimization (ERM) is the workhorse of machine learning, whether for classification and regression or for off-policy policy learning, but its model-agnostic guarantees can fail when we use adaptively collected data, such as the result of running a contextual bandit algorithm.
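
For reference, the i.i.d. ERM objective these guarantees usually attach to is $\hat{f} \in \arg\min_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} \ell(f(X_i), Y_i)$ (standard notation, assumed here); with adaptively collected data the pairs $(X_i, Y_i)$ are no longer i.i.d., which is exactly what undermines the model-agnostic guarantees.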

Post-Contextual-Bandit Inference

no code implementations NeurIPS 2021 Aurélien Bibaut, Antoine Chambaz, Maria Dimakopoulou, Nathan Kallus, Mark van der Laan

The adaptive nature of the data collected by contextual bandit algorithms, however, makes this difficult: standard estimators are no longer asymptotically normally distributed and classic confidence intervals fail to provide correct coverage.

Causal Inference Under Unmeasured Confounding With Negative Controls: A Minimax Learning Approach

no code implementations 25 Mar 2021 Nathan Kallus, Xiaojie Mao, Masatoshi Uehara

In this paper, we tackle the primary challenge to causal inference using negative controls: the identification and estimation of these bridge functions.

Causal Inference

Finite Sample Analysis of Minimax Offline Reinforcement Learning: Completeness, Fast Rates and First-Order Efficiency

no code implementations 5 Feb 2021 Masatoshi Uehara, Masaaki Imaizumi, Nan Jiang, Nathan Kallus, Wen Sun, Tengyang Xie

We offer a theoretical characterization of off-policy evaluation (OPE) in reinforcement learning using function approximation for marginal importance weights and $q$-functions when these are estimated using recent minimax methods.

reinforcement-learning

Fast Rates for the Regret of Offline Reinforcement Learning

no code implementations 31 Jan 2021 Yichun Hu, Nathan Kallus, Masatoshi Uehara

We study the regret of reinforcement learning from offline data generated by a fixed behavior policy in an infinite-horizon discounted Markov decision process (MDP).
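
In this setting the value of a policy $\pi$ is $V^{\pi} = \mathbb{E}_{\pi}[\sum_{t=0}^{\infty} \gamma^{t} r_{t}]$ and the regret of a learned policy $\hat{\pi}$ is $V^{\pi^{*}} - V^{\hat{\pi}}$ for an optimal policy $\pi^{*}$ (standard definitions, stated here for orientation rather than quoted from the paper).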

Decision Making reinforcement-learning

Fairness, Welfare, and Equity in Personalized Pricing

no code implementations 21 Dec 2020 Nathan Kallus, Angela Zhou

These different application areas may lead to different concerns around fairness, welfare, and equity on different objectives: price burdens on consumers, price envy, firm revenue, access to a good, equal access, and distributional consequences when the good in question further impacts downstream outcomes of interest.

Fairness

The Variational Method of Moments

1 code implementation 17 Dec 2020 Andrew Bennett, Nathan Kallus

The conditional moment problem is a powerful formulation for describing structural causal parameters in terms of observables, a prominent example being instrumental variable regression.
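
In the instrumental variable example, the conditional moment restriction takes the standard form $\mathbb{E}[Y - g(X) \mid Z] = 0$, where $Y$ is the outcome, $X$ the endogenous regressor, $Z$ the instrument, and $g$ the structural function to be estimated (notation assumed here for illustration).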

Rejoinder: New Objectives for Policy Learning

no code implementations 5 Dec 2020 Nathan Kallus

I provide a rejoinder for discussion of "More Efficient Policy Learning via Optimal Retargeting" to appear in the Journal of the American Statistical Association with discussion by Oliver Dukes and Stijn Vansteelandt; Sijia Li, Xiudi Li, and Alex Luedtke; and Muxuan Liang and Yingqi Zhao.

Fast Rates for Contextual Linear Optimization

no code implementations 5 Nov 2020 Yichun Hu, Nathan Kallus, Xiaojie Mao

While one may use off-the-shelf machine learning methods to separately learn a predictive model and plug it in, a variety of recent methods instead integrate estimation and optimization by fitting the model to directly optimize downstream decision performance.

Decision Making

Optimal Off-Policy Evaluation from Multiple Logging Policies

1 code implementation 21 Oct 2020 Nathan Kallus, Yuta Saito, Masatoshi Uehara

We study off-policy evaluation (OPE) from multiple logging policies, each generating a dataset of fixed size, i.e., stratified sampling.

Stochastic Optimization Forests

1 code implementation 17 Aug 2020 Nathan Kallus, Xiaojie Mao

We study contextual stochastic optimization problems, where we leverage rich auxiliary observations (e.g., product characteristics) to improve decision making with uncertain variables (e.g., demand).

Decision Making Stochastic Optimization

Off-policy Evaluation in Infinite-Horizon Reinforcement Learning with Latent Confounders

no code implementations 27 Jul 2020 Andrew Bennett, Nathan Kallus, Lihong Li, Ali Mousavi

We study an OPE problem in an infinite-horizon, ergodic Markov decision process with unobserved confounders, where states and actions can act as proxies for the unobserved confounders.

reinforcement-learning

Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies

1 code implementation NeurIPS 2020 Nathan Kallus, Masatoshi Uehara

Targeting deterministic policies, for which action is a deterministic function of state, is crucial since optimal policies are always deterministic (up to ties).

Efficient Evaluation of Natural Stochastic Policies in Offline Reinforcement Learning

no code implementations 6 Jun 2020 Nathan Kallus, Masatoshi Uehara

Compared with the classic case of a pre-specified evaluation policy, when evaluating natural stochastic policies, the efficiency bound, which measures the best-achievable estimation error, is inflated since the evaluation policy itself is unknown.

reinforcement-learning

DTR Bandit: Learning to Make Response-Adaptive Decisions With Low Regret

no code implementations 6 May 2020 Yichun Hu, Nathan Kallus

While existing literature mostly focuses on estimating the optimal DTR from offline data such as from sequentially randomized trials, we study the problem of developing the optimal DTR in an online manner, where the interaction with each individual affects both our cumulative reward and our data collection for future learning.

On the Optimality of Randomization in Experimental Design: How to Randomize for Minimax Variance and Design-Based Inference

no code implementations 6 May 2020 Nathan Kallus

When this set is permutation symmetric, the optimal design is complete randomization, and using a single partition (i.e., the design that only randomizes the treatment labels for each side of the partition) has minimax risk larger by a factor of $n-1$.

Experimental Design

On the role of surrogates in the efficient estimation of treatment effects with limited outcome data

no code implementations 27 Mar 2020 Nathan Kallus, Xiaojie Mao

We study the problem of estimating treatment effects when the outcome of primary interest (e.g., long-term health status) is only seldom observed but abundant surrogate observations (e.g., short-term health outcomes) are available.

Efficient Policy Learning from Surrogate-Loss Classification Reductions

1 code implementation ICML 2020 Andrew Bennett, Nathan Kallus

We show that, under a correct specification assumption, the weighted classification formulation need not be efficient for policy parameters.

Classification General Classification

Confounding-Robust Policy Evaluation in Infinite-Horizon Reinforcement Learning

no code implementations NeurIPS 2020 Nathan Kallus, Angela Zhou

We develop a robust approach that estimates sharp bounds on the (unidentifiable) value of a given policy in an infinite-horizon problem given data from another policy with unobserved confounding, subject to a sensitivity model.

reinforcement-learning

Statistically Efficient Off-Policy Policy Gradients

no code implementations ICML 2020 Nathan Kallus, Masatoshi Uehara

Policy gradient methods in reinforcement learning update policy parameters by taking steps in the direction of an estimated gradient of policy value.
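
Schematically, the update is $\theta \leftarrow \theta + \alpha\,\widehat{\nabla_{\theta} J(\pi_{\theta})}$, where by the policy gradient theorem $\nabla_{\theta} J(\pi_{\theta}) = \mathbb{E}_{s \sim d^{\pi_{\theta}},\, a \sim \pi_{\theta}}[\nabla_{\theta} \log \pi_{\theta}(a \mid s)\, Q^{\pi_{\theta}}(s, a)]$; this standard on-policy form is shown only for orientation, the paper being about statistically efficient estimation of this gradient from off-policy data.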

Policy Gradient Methods reinforcement-learning

Generalization Bounds and Representation Learning for Estimation of Potential Outcomes and Causal Effects

1 code implementation 21 Jan 2020 Fredrik D. Johansson, Uri Shalit, Nathan Kallus, David Sontag

Practitioners in diverse fields such as healthcare, economics and education are eager to apply machine learning to improve decision making.

Decision Making Generalization Bounds +2

Localized Debiased Machine Learning: Efficient Inference on Quantile Treatment Effects and Beyond

1 code implementation 30 Dec 2019 Nathan Kallus, Xiaojie Mao, Masatoshi Uehara

We instead propose localized debiased machine learning (LDML), a new data-splitting approach that avoids this burdensome step and needs only estimate the nuisances at a single initial rough guess for the parameter.

Causal Inference

Assessing Disparate Impact of Personalized Interventions: Identifiability and Bounds

1 code implementation NeurIPS 2019 Nathan Kallus, Angela Zhou

Personalized interventions in social services, education, and healthcare leverage individual-level causal effect predictions in order to give the best treatment to each individual or to prioritize program interventions for the individuals most likely to benefit.

Fairness

Kernel Optimal Orthogonality Weighting: A Balancing Approach to Estimating Effects of Continuous Treatments

no code implementations 26 Oct 2019 Nathan Kallus, Michele Santacatterina

In this paper, we propose Kernel Optimal Orthogonality Weighting (KOOW), a convex optimization-based method, for estimating the effects of continuous treatments.

Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning

no code implementations 12 Sep 2019 Nathan Kallus, Masatoshi Uehara

This precisely characterizes the curse of horizon: in time-variant processes, OPE is only feasible in the near-on-policy setting, where behavior and target policies are sufficiently similar.

reinforcement-learning

Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes

no code implementations 5 Sep 2019 Yichun Hu, Nathan Kallus, Xiaojie Mao

We study a nonparametric contextual bandit problem where the expected reward functions belong to a Hölder class with smoothness parameter $\beta$.
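
For smoothness $\beta \le 1$ this is the familiar condition $|f(x) - f(x')| \le L \|x - x'\|^{\beta}$ for some constant $L$; for $\beta > 1$ the Hölder condition instead constrains higher-order derivatives (standard definition, stated here for orientation).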

Multi-Armed Bandits

Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes

1 code implementation 22 Aug 2019 Nathan Kallus, Masatoshi Uehara

Off-policy evaluation (OPE) in reinforcement learning allows one to evaluate novel decision policies without needing to conduct exploration, which is often costly or otherwise infeasible.

reinforcement-learning

Optimal Estimation of Generalized Average Treatment Effects using Kernel Optimal Matching

1 code implementation 13 Aug 2019 Nathan Kallus, Michele Santacatterina

In causal inference, a variety of causal effect estimands have been studied, including the sample, uncensored, target, conditional, optimal subpopulation, and optimal weighted average treatment effects.

Causal Inference

Policy Evaluation with Latent Confounders via Optimal Balance

1 code implementation NeurIPS 2019 Andrew Bennett, Nathan Kallus

We study the question of policy evaluation when we instead have proxies for the latent confounders and develop an importance weighting method that avoids fitting a latent outcome regression model.

More Efficient Policy Learning via Optimal Retargeting

1 code implementation 20 Jun 2019 Nathan Kallus

Policy learning can be used to extract individualized treatment regimes from observational data in healthcare, civics, e-commerce, and beyond.

Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning

1 code implementation NeurIPS 2019 Nathan Kallus, Masatoshi Uehara

We propose new estimators for OPE based on empirical likelihood that are always more efficient than IS, SNIS, and DR and satisfy the same stability and boundedness properties as SNIS.
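
For orientation, the baseline IS and SNIS estimators can be sketched as below for logged one-step (contextual bandit) data; the DR variant additionally combines these weights with an outcome model. The one-step setting and array names are illustrative assumptions, not the paper's sequential RL setup.

    import numpy as np

    def is_estimate(rewards, behavior_probs, target_probs):
        # Plain importance sampling (IS): reweight logged rewards by the policy ratio.
        w = target_probs / behavior_probs
        return np.mean(w * rewards)

    def snis_estimate(rewards, behavior_probs, target_probs):
        # Self-normalized IS (SNIS): normalizing the weights keeps the estimate within
        # the observed reward range, trading a little bias for stability.
        w = target_probs / behavior_probs
        return np.sum(w * rewards) / np.sum(w)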

Multi-Armed Bandits reinforcement-learning

Assessing Disparate Impacts of Personalized Interventions: Identifiability and Bounds

1 code implementation 4 Jun 2019 Nathan Kallus, Angela Zhou

Personalized interventions in social services, education, and healthcare leverage individual-level causal effect predictions in order to give the best treatment to each individual or to prioritize program interventions for the individuals most likely to benefit.

Fairness

Assessing Algorithmic Fairness with Unobserved Protected Class Using Data Combination

1 code implementation 1 Jun 2019 Nathan Kallus, Xiaojie Mao, Angela Zhou

In this paper we study a fundamental challenge to assessing disparate impacts in practice: protected class membership is often not observed in the data.

Fairness

Data-Pooling in Stochastic Optimization

1 code implementation 1 Jun 2019 Vishal Gupta, Nathan Kallus

This intuition further suggests that data-pooling offers the most benefits when there are many problems, each of which has a small amount of relevant data.

Stochastic Optimization

Deep Generalized Method of Moments for Instrumental Variable Analysis

1 code implementation NeurIPS 2019 Andrew Bennett, Nathan Kallus, Tobias Schnabel

Instrumental variable analysis is a powerful tool for estimating causal effects when randomization or full control of confounders is not possible.

Model Selection

The Fairness of Risk Scores Beyond Classification: Bipartite Ranking and the xAUC Metric

1 code implementation NeurIPS 2019 Nathan Kallus, Angela Zhou

To better account for this, in this paper, we investigate the fairness of predictive risk scores from the point of view of a bipartite ranking task, where one seeks to rank positive examples higher than negative ones.

Fairness General Classification

Classifying Treatment Responders Under Causal Effect Monotonicity

no code implementations 14 Feb 2019 Nathan Kallus

In the context of individual-level causal inference, we study the problem of predicting whether someone will respond or not to a treatment based on their features and past examples of features, treatment indicator (e.g., drug/no drug), and a binary outcome (e.g., recovery from disease).
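
Under the monotonicity assumption (treatment never hurts, i.e., $Y(1) \ge Y(0)$), a responder can be characterized as an individual with $Y(0) = 0$ and $Y(1) = 1$; this standard characterization is assumed here to fix ideas.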

Causal Inference Classification +1

Fairness Under Unawareness: Assessing Disparity When Protected Class Is Unobserved

1 code implementation 27 Nov 2018 Jiahao Chen, Nathan Kallus, Xiaojie Mao, Geoffry Svacha, Madeleine Udell

We also propose an alternative weighted estimator that uses soft classification, and show that its bias arises simply from the conditional covariance of the outcome with the true class membership.

Decision Making Fairness +1

More robust estimation of sample average treatment effects using Kernel Optimal Matching in an observational study of spine surgical interventions

1 code implementation 10 Nov 2018 Nathan Kallus, Brenton Pennicooke, Michele Santacatterina

Inverse probability of treatment weighting (IPTW), which has been used to estimate sample average treatment effects (SATE) using observational data, tenuously relies on the positivity assumption and the correct specification of the treatment assignment model, both of which are problematic assumptions in many observational studies.
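
A minimal IPTW sketch for the sample average treatment effect under a binary treatment, using Horvitz-Thompson-style weighting; the variable names and the propensity input are illustrative assumptions, not the paper's kernel-optimal-matching estimator:

    import numpy as np

    def iptw_sate(y, t, e):
        # y: outcomes, t: binary treatment indicator, e: estimated propensity P(T=1 | X).
        # Each observation is reweighted by the inverse probability of the treatment it received.
        return np.mean(t * y / e - (1 - t) * y / (1 - e))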

Methodology stat.ML, stat.ME, stat.AP

Removing Hidden Confounding by Experimental Grounding

no code implementations NeurIPS 2018 Nathan Kallus, Aahlad Manas Puli, Uri Shalit

We introduce a novel method of using limited experimental data to correct the hidden confounding in causal effect models trained on larger observational data, even if the observational data does not fully overlap with the experimental data.

Causal Inference

Interval Estimation of Individual-Level Causal Effects Under Unobserved Confounding

no code implementations 5 Oct 2018 Nathan Kallus, Xiaojie Mao, Angela Zhou

We study the problem of learning conditional average treatment effects (CATE) from observational data with unobserved confounders.

Residual Unfairness in Fair Machine Learning from Prejudiced Data

no code implementations ICML 2018 Nathan Kallus, Angela Zhou

We connect these lines of work and study the residual unfairness that arises when a fairness-adjusted predictor is not actually fair on the target population due to systematic censoring of training data by existing biased policies.

Fairness

Optimal Balancing of Time-Dependent Confounders for Marginal Structural Models

1 code implementation 4 Jun 2018 Nathan Kallus, Michele Santacatterina

Marginal structural models (MSMs) estimate the causal effect of a time-varying treatment in the presence of time-dependent confounding via weighted regression.

Confounding-Robust Policy Improvement

no code implementations NeurIPS 2018 Nathan Kallus, Angela Zhou

We study the problem of learning personalized decision policies from observational data while accounting for possible unobserved confounding.

Causal Inference

Policy Evaluation and Optimization with Continuous Treatments

no code implementations 16 Feb 2018 Nathan Kallus, Angela Zhou

We study the problem of policy evaluation and learning from batched contextual bandit data when treatments are continuous, going beyond previous work on discrete treatments.

DeepMatch: Balancing Deep Covariate Representations for Causal Inference Using Adversarial Training

no code implementations ICML 2020 Nathan Kallus

We study optimal covariate balance for causal inferences from observational data when rich covariates and complex relationships necessitate flexible modeling with neural networks.

Causal Inference

Instrument-Armed Bandits

no code implementations 21 May 2017 Nathan Kallus

We argue for a particular kind of regret that captures the causal effect of treatments but show that standard MAB algorithms cannot achieve sublinear control on this regret.

Balanced Policy Evaluation and Learning

no code implementations NeurIPS 2018 Nathan Kallus

We propose a new, balance-based approach that also makes the data look like the new policy, but does so directly by finding weights that optimize for balance between the weighted data and the target policy in the given, finite sample, which is equivalent to minimizing worst-case or posterior conditional mean square error.

Generalized Optimal Matching Methods for Causal Inference

no code implementations 26 Dec 2016 Nathan Kallus

We develop an encompassing framework for matching, covariate balancing, and doubly-robust methods for causal inference from observational data called generalized optimal matching (GOM).

Causal Inference

Dynamic Assortment Personalization in High Dimensions

no code implementations 18 Oct 2016 Nathan Kallus, Madeleine Udell

In the dynamic setting, we show that structure-aware dynamic assortment personalization can have regret that is an order of magnitude smaller than structure-ignorant approaches.

Recursive Partitioning for Personalization using Observational Data

no code implementations ICML 2017 Nathan Kallus

We study the problem of learning to choose from m discrete treatment options (e.g., news item or medical drug) the one with best causal effect for a particular instance (e.g., user or patient) where the training data consists of passive observations of covariates, treatment, and the outcome of the treatment.

Revealed Preference at Scale: Learning Personalized Preferences from Assortment Choices

no code implementations 17 Sep 2015 Nathan Kallus, Madeleine Udell

In our model, the preferences of each customer or segment follow a separate parametric choice model, but the underlying structure of these parameters over all the models has low dimension.

From Predictive to Prescriptive Analytics

1 code implementation 22 Feb 2014 Dimitris Bertsimas, Nathan Kallus

To demonstrate the power of our approach in a real-world setting we study an inventory management problem faced by the distribution arm of an international media conglomerate, which ships an average of 1 billion units per year.

Stochastic Optimization
