no code implementations • ICML 2020 • Nathan Kallus, Masatoshi Uehara
Off-policy evaluation (OPE) in reinforcement learning allows one to evaluate novel decision policies without needing to conduct exploration, which is often costly or otherwise infeasible.
no code implementations • 29 Mar 2024 • Andrew Bennett, Nathan Kallus, Miruna Oprescu, Wen Sun, Kaiwen Wang
We characterize the sharp bounds on policy value under this model, that is, the tightest possible bounds given by the transition observations from the original MDP, and we study the estimation of these bounds from such transition observations.
no code implementations • 15 Mar 2024 • James McInerney, Nathan Kallus
The Laplace approximation (LA) of the Bayesian posterior is a Gaussian distribution centered at the maximum a posteriori estimate.
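Since the entry compresses the method into one sentence, a minimal sketch may help; it builds the LA for a toy one-dimensional Gaussian model with a Gaussian prior (an illustrative assumption, not the paper's setup) from the MAP estimate and the curvature of the negative log posterior:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, size=50)  # toy data: x_i ~ N(theta, 1)

def neg_log_posterior(theta):
    theta = float(np.atleast_1d(theta)[0])
    # N(0, 10^2) prior on theta plus Gaussian likelihood, up to constants.
    return 0.5 * theta**2 / 100.0 + 0.5 * np.sum((x - theta) ** 2)

theta_map = float(minimize(neg_log_posterior, x0=[0.0]).x[0])  # LA mean

# Curvature of the negative log posterior at the MAP (central finite
# difference); its inverse gives the LA variance.
h = 1e-4
hess = (neg_log_posterior(theta_map + h) - 2 * neg_log_posterior(theta_map)
        + neg_log_posterior(theta_map - h)) / h**2
print(f"Laplace approximation: N({theta_map:.3f}, {1 / hess:.4f})")
```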
no code implementations • 10 Mar 2024 • Kaiwen Wang, Dawen Liang, Nathan Kallus, Wen Sun
We study Risk-Sensitive Reinforcement Learning (RSRL) with the Optimized Certainty Equivalent (OCE) risk, which generalizes Conditional Value-at-Risk (CVaR), entropic risk, and Markowitz's mean-variance.
no code implementations • 8 Mar 2024 • Alex Ayoub, Kaiwen Wang, Vincent Liu, Samuel Robertson, James McInerney, Dawen Liang, Nathan Kallus, Csaba Szepesvári
We propose training fitted Q-iteration with log-loss (FQI-LOG) for batch reinforcement learning (RL).
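For concreteness, here is a hedged sketch of the regression step that distinguishes FQI-LOG from standard FQI; the [0, 1]-normalized Bellman targets are an assumption made so the log-loss is well defined, and none of this mirrors the paper's experimental setup:

```python
import numpy as np

def bellman_targets(rewards, q_next_max, gamma=0.99):
    # One FQI backup; assumes rewards are scaled so targets stay in [0, 1].
    return rewards + gamma * q_next_max

def squared_loss(pred, target):
    # The loss minimized by standard FQI.
    return np.mean((pred - target) ** 2)

def log_loss(pred, target):
    # Binary cross-entropy with the bounded target as a soft label;
    # FQI-LOG fits Q by minimizing this instead of the squared loss.
    eps = 1e-7
    pred = np.clip(pred, eps, 1.0 - eps)
    return -np.mean(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
```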
no code implementations • 8 Mar 2024 • Harald Steck, Chaitanya Ekanadham, Nathan Kallus
Cosine-similarity is the cosine of the angle between two vectors, or equivalently the dot product between their normalizations.
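The definition translates directly into code; a two-line sketch:

```python
import numpy as np

def cosine_similarity(a, b):
    # cos(angle) = dot product of the L2-normalized vectors.
    return float((a / np.linalg.norm(a)) @ (b / np.linalg.norm(b)))

print(cosine_similarity(np.array([1.0, 0.0]), np.array([1.0, 1.0])))  # ~0.707
```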
1 code implementation • 4 Mar 2024 • Victor Chernozhukov, Christian Hansen, Nathan Kallus, Martin Spindler, Vasilis Syrgkanis
An introduction to the emerging fusion of machine learning and causal inference.
no code implementations • 11 Feb 2024 • Kaiwen Wang, Owen Oertell, Alekh Agarwal, Nathan Kallus, Wen Sun
Second-order bounds are instance-dependent bounds that scale with the variance of the return, which we prove are tighter than the previously known small-loss bounds of distributional RL.
no code implementations • 9 Feb 2024 • Brian Cho, Kyra Gan, Nathan Kallus
We propose a novel nonparametric sequential test for composite hypotheses for means of multiple data streams.
no code implementations • 2 Feb 2024 • Su Jia, Peter Frazier, Nathan Kallus
Prior research on experimentation with interference has concentrated on the final output of a policy.
no code implementations • 25 Dec 2023 • Su Jia, Nathan Kallus, Christina Lee Yu
We consider experimentation in the presence of non-stationarity, inter-unit (spatial) interference, and carry-over effects (temporal interference), where we wish to estimate the global average treatment effect (GATE), the difference between average outcomes having exposed all units at all times to treatment or to control.
no code implementations • 6 Nov 2023 • Andrew Bennett, Nathan Kallus, Miruna Oprescu
Low-Rank Markov Decision Processes (MDPs) have recently emerged as a promising framework within the domain of reinforcement learning (RL), as they allow for provably approximately correct (PAC) learning guarantees while also incorporating ML algorithms for representation learning.
no code implementations • 24 Oct 2023 • Noveen Sachdeva, Lequn Wang, Dawen Liang, Nathan Kallus, Julian McAuley
To address these challenges, we introduce the Policy Convolution (PC) family of estimators.
1 code implementation • 19 Aug 2023 • Zhankui He, Zhouhang Xie, Rahul Jha, Harald Steck, Dawen Liang, Yesu Feng, Bodhisattwa Prasad Majumder, Nathan Kallus, Julian McAuley
In this paper, we present empirical studies on conversational recommendation tasks using representative large language models in a zero-shot setting with three primary contributions.
no code implementations • 25 Jul 2023 • Andrew Bennett, Nathan Kallus, Xiaojie Mao, Whitney Newey, Vasilis Syrgkanis, Masatoshi Uehara
We consider estimation of parameters defined as linear functionals of solutions to linear inverse problems.
1 code implementation • 21 Jul 2023 • Kaiwen Wang, Junxiong Wang, Yueying Li, Nathan Kallus, Immanuel Trummer, Wen Sun
Join order selection (JOS) is the problem of ordering join operations to minimize total query execution cost, and it is the core NP-hard combinatorial optimization problem of query optimization.
no code implementations • NeurIPS 2023 • Kaiwen Wang, Kevin Zhou, Runzhe Wu, Nathan Kallus, Wen Sun
In online RL, we propose a DistRL algorithm that constructs confidence sets using maximum likelihood estimation.
no code implementations • 24 May 2023 • Wenhao Zhan, Masatoshi Uehara, Nathan Kallus, Jason D. Lee, Wen Sun
Our proposed algorithm consists of two main steps: (1) estimate the implicit reward using Maximum Likelihood Estimation (MLE) with general function approximation from offline data and (2) solve a distributionally robust planning problem over a confidence set around the MLE.
2 code implementations • 20 Apr 2023 • Miruna Oprescu, Jacob Dorn, Marah Ghoummaid, Andrew Jesson, Nathan Kallus, Uri Shalit
There has been recent progress on robust and efficient methods for estimating the conditional average treatment effect (CATE) function, but these methods often do not take into account the risk of hidden confounding, which could arbitrarily and unknowingly bias any causal estimate based on observational data.
no code implementations • 10 Feb 2023 • Andrew Bennett, Nathan Kallus, Xiaojie Mao, Whitney Newey, Vasilis Syrgkanis, Masatoshi Uehara
In this paper, we study nonparametric estimation of instrumental variable (IV) regressions.
no code implementations • 7 Feb 2023 • Kaiwen Wang, Nathan Kallus, Wen Sun
In this paper, we study risk-sensitive Reinforcement Learning (RL), focusing on the objective of Conditional Value at Risk (CVaR) with risk tolerance $\tau$.
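As a reference for the objective, here is a minimal empirical CVaR computation; this plain tail-mean estimator only pins down the quantity being optimized and is not the paper's algorithm:

```python
import numpy as np

def empirical_cvar(returns, tau):
    # CVaR_tau = mean of the worst tau-fraction of returns (lower tail).
    r = np.sort(np.asarray(returns, dtype=float))
    k = max(1, int(np.ceil(tau * len(r))))
    return float(r[:k].mean())

samples = np.random.default_rng(0).normal(size=100_000)
print(empirical_cvar(samples, tau=0.05))  # about -2.06 for a standard normal
```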
no code implementations • 29 Jan 2023 • Su Jia, Qian Xie, Nathan Kallus, Peter I. Frazier
In many applications of online decision making, the environment is non-stationary, and it is therefore crucial to use bandit algorithms that handle changes.
1 code implementation • 29 Dec 2022 • Aurelien Bibaut, Nathan Kallus, Michael Lindon
The type-I-error results primarily leverage a martingale strong invariance principle and establish that these tests (and their implied confidence sequences) have type-I error rates asymptotically equivalent to the desired (possibly varying) $\alpha$-level.
no code implementations • 13 Dec 2022 • Masatoshi Uehara, Chengchun Shi, Nathan Kallus
Reinforcement learning (RL) is one of the most vibrant research frontiers in machine learning and has been recently applied to solve a number of challenging problems.
1 code implementation • 11 Nov 2022 • Nathan Kallus, James McInerney
When the predictive model is simple and its evaluation differentiable, this task is solved by the delta method, where we propagate the asymptotically-normal uncertainty in the predictive model through the evaluation to compute standard errors and Wald confidence intervals.
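A minimal sketch of the classical delta method described here, using a numerical gradient in place of an analytic one; the toy parameter, covariance, and evaluation function are assumptions for illustration:

```python
import numpy as np

def delta_method(theta_hat, cov, f, h=1e-6):
    # Gradient of the evaluation f at theta_hat via central differences.
    grad = np.array([(f(theta_hat + h * e) - f(theta_hat - h * e)) / (2 * h)
                     for e in np.eye(len(theta_hat))])
    se = float(np.sqrt(grad @ cov @ grad))  # standard error of f(theta_hat)
    est = f(theta_hat)
    z = 1.96                                # ~95% Wald interval
    return est, (est - z * se, est + z * se)

theta_hat = np.array([0.3, 1.2])
cov = np.array([[0.010, 0.002],             # estimated covariance of theta_hat
                [0.002, 0.040]])
est, ci = delta_method(theta_hat, cov, lambda t: t[0] * np.exp(t[1]))
print(est, ci)
```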
1 code implementation • 26 Oct 2022 • Andrew Bennett, Dipendra Misra, Nathan Kallus
Many existing approaches to safe RL rely on receiving numeric safety feedback, but in many cases this feedback can only take binary values; that is, whether an action in a given state is safe or unsafe.
no code implementations • 17 Aug 2022 • Andrew Bennett, Nathan Kallus, Xiaojie Mao, Whitney Newey, Vasilis Syrgkanis, Masatoshi Uehara
In a variety of applications, including nonparametric instrumental variable (NPIV) analysis, proximal causal inference under unmeasured confounding, and missing-not-at-random data with shadow variables, we are interested in inference on a continuous linear functional (e.g., average causal effects) of a nuisance function (e.g., an NPIV regression) defined by conditional moment restrictions.
1 code implementation • NeurIPS 2023 • Masatoshi Uehara, Haruka Kiyohara, Andrew Bennett, Victor Chernozhukov, Nan Jiang, Nathan Kallus, Chengchun Shi, Wen Sun
Finally, we extend our methods to learning of dynamics and establish the connection between our approach and the well-known spectral learning methods in POMDPs.
1 code implementation • 12 Jul 2022 • Jonathan D. Chang, Kaiwen Wang, Nathan Kallus, Wen Sun
We study representation learning for Offline Reinforcement Learning (RL), focusing on the important task of Offline Policy Evaluation (OPE).
no code implementations • 24 Jun 2022 • Masatoshi Uehara, Ayush Sekhari, Jason D. Lee, Nathan Kallus, Wen Sun
We show our algorithm's computational and statistical complexities scale polynomially with respect to the horizon and the intrinsic dimension of the feature on the observation space.
no code implementations • 24 Jun 2022 • Masatoshi Uehara, Ayush Sekhari, Jason D. Lee, Nathan Kallus, Wen Sun
We study Reinforcement Learning for partially observable dynamical systems using function approximation.
1 code implementation • 23 May 2022 • Nathan Kallus, Miruna Oprescu
Our method is model-agnostic in that it can provide the best projection of CDTE onto the regression model class.
1 code implementation • 20 May 2022 • Nathan Kallus
If, in an A/B test, half of users click (or buy, or watch, or renew, etc.), whether exposed to the standard experience A or a new one B, hypothetically it could be because the change affects no one, because the change positively affects half the user population to go from no-click to click while negatively affecting the other half, or something in between.
no code implementations • CVPR 2022 • Shervin Ardeshir, Cristina Segalin, Nathan Kallus
Performance of the model for each group is calculated by comparing $\hat{y}$ and $y$ for the datapoints within a specific group, and, as a result, the disparity in performance across the different groups can be computed.
1 code implementation • 19 Feb 2022 • Nathan Kallus, Xiaojie Mao, Kaiwen Wang, Zhengyuan Zhou
Thanks to a localization technique, LDR$^2$OPE only requires fitting a small number of regressions, just like DR methods for standard OPE.
no code implementations • 15 Feb 2022 • Guido Imbens, Nathan Kallus, Xiaojie Mao, Yuhao Wang
In this paper, we uniquely tackle the challenge of persistent unmeasured confounders, i.e., some unmeasured confounders that can simultaneously affect the treatment, short-term outcomes, and the long-term outcome, noting that they invalidate identification strategies in previous literature.
1 code implementation • 15 Jan 2022 • Nathan Kallus
Even if the average treatment effect (ATE), which measures the change in social welfare, is positive, there is a risk of a negative effect on, say, some 10% of the population.
1 code implementation • 21 Dec 2021 • Jacob Dorn, Kevin Guo, Nathan Kallus
We consider the problem of constructing bounds on the average treatment effect (ATE) when unmeasured confounders exist but have bounded influence.
1 code implementation • 28 Oct 2021 • Andrew Bennett, Nathan Kallus
To answer these, we extend the framework of proximal causal inference to our POMDP setting, providing a variety of settings where identification is made possible by the existence of so-called bridge functions.
no code implementations • 19 Oct 2021 • Nathan Kallus, Angela Zhou
We study off-policy evaluation and learning from sequential data in a structured class of Markov decision processes that arise from repeated interactions with an exogenous sequence of arrivals with contexts, which generate unknown individual-level responses to agent actions.
no code implementations • 6 Oct 2021 • James McInerney, Nathan Kallus
The approach, which we term the residual overfit method of exploration (ROME), drives exploration towards actions where the overfit model exhibits the most overfitting compared to the tuned model.
1 code implementation • NeurIPS 2021 • Nikos Vlassis, Ashok Chandrashekar, Fernando Amat Gil, Nathan Kallus
We study the problem of off-policy evaluation from batched contextual bandit data with multidimensional actions, often termed slates.
no code implementations • NeurIPS 2021 • Aurélien Bibaut, Antoine Chambaz, Maria Dimakopoulou, Nathan Kallus, Mark van der Laan
Empirical risk minimization (ERM) is the workhorse of machine learning, whether for classification and regression or for off-policy policy learning, but its model-agnostic guarantees can fail when we use adaptively collected data, such as the result of running a contextual bandit algorithm.
no code implementations • NeurIPS 2021 • Aurélien Bibaut, Antoine Chambaz, Maria Dimakopoulou, Nathan Kallus, Mark van der Laan
The adaptive nature of the data collected by contextual bandit algorithms, however, makes this difficult: standard estimators are no longer asymptotically normally distributed and classic confidence intervals fail to provide correct coverage.
no code implementations • 25 Mar 2021 • Nathan Kallus, Xiaojie Mao, Masatoshi Uehara
Previous work has relied on completeness conditions on these functions to identify the causal parameters and required uniqueness assumptions in estimation, and they also focused on parametric estimation of bridge functions.
no code implementations • 5 Feb 2021 • Masatoshi Uehara, Masaaki Imaizumi, Nan Jiang, Nathan Kallus, Wen Sun, Tengyang Xie
We offer a theoretical characterization of off-policy evaluation (OPE) in reinforcement learning using function approximation for marginal importance weights and $q$-functions when these are estimated using recent minimax methods.
no code implementations • 31 Jan 2021 • Yichun Hu, Nathan Kallus, Masatoshi Uehara
Second, we provide new analyses of FQI and Bellman residual minimization to establish the correct pointwise convergence guarantees.
no code implementations • 21 Dec 2020 • Nathan Kallus, Angela Zhou
These different application areas may lead to different concerns around fairness, welfare, and equity on different objectives: price burdens on consumers, price envy, firm revenue, access to a good, equal access, and distributional consequences when the good in question further impacts downstream outcomes of interest.
2 code implementations • 17 Dec 2020 • Andrew Bennett, Nathan Kallus
The conditional moment problem is a powerful formulation for describing structural causal parameters in terms of observables, a prominent example being instrumental variable regression.
no code implementations • 5 Dec 2020 • Nathan Kallus
I provide a rejoinder for discussion of "More Efficient Policy Learning via Optimal Retargeting" to appear in the Journal of the American Statistical Association with discussion by Oliver Dukes and Stijn Vansteelandt; Sijia Li, Xiudi Li, and Alex Luedtke; and Muxuan Liang and Yingqi Zhao.
no code implementations • 5 Nov 2020 • Yichun Hu, Nathan Kallus, Xiaojie Mao
While one may use off-the-shelf machine learning methods to separately learn a predictive model and plug it in, a variety of recent methods instead integrate estimation and optimization by fitting the model to directly optimize downstream decision performance.
1 code implementation • 21 Oct 2020 • Nathan Kallus, Yuta Saito, Masatoshi Uehara
We study off-policy evaluation (OPE) from multiple logging policies, each generating a dataset of fixed size, i.e., stratified sampling.
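As a hedged point of reference, a naive baseline for this setting applies IPW within each logger's stratum and averages strata by sample size; the data layout and the `pi_e` callable below are assumptions for illustration, and the paper studies estimators that improve on this:

```python
import numpy as np

def stratified_ipw(strata, pi_e):
    # strata: list of (actions, rewards, pi_b) triples, one per logging
    # policy, where pi_b[i] is that logger's propensity for the logged
    # action; pi_e(actions) returns the target policy's propensities.
    n_total = sum(len(r) for _, r, _ in strata)
    value = 0.0
    for a, r, pi_b in strata:
        w = pi_e(a) / pi_b                      # per-sample importance weights
        value += (len(r) / n_total) * np.mean(w * r)
    return value
```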
1 code implementation • 17 Aug 2020 • Nathan Kallus, Xiaojie Mao
We study contextual stochastic optimization problems, where we leverage rich auxiliary observations (e.g., product characteristics) to improve decision making with uncertain variables (e.g., demand).
no code implementations • 27 Jul 2020 • Andrew Bennett, Nathan Kallus, Lihong Li, Ali Mousavi
We study an OPE problem in an infinite-horizon, ergodic Markov decision process with unobserved confounders, where states and actions can act as proxies for the unobserved confounders.
1 code implementation • NeurIPS 2020 • Nathan Kallus, Masatoshi Uehara
Targeting deterministic policies, for which action is a deterministic function of state, is crucial since optimal policies are always deterministic (up to ties).
no code implementations • 6 Jun 2020 • Nathan Kallus, Masatoshi Uehara
Compared with the classic case of a pre-specified evaluation policy, when evaluating natural stochastic policies, the efficiency bound, which measures the best-achievable estimation error, is inflated since the evaluation policy itself is unknown.
no code implementations • 6 May 2020 • Nathan Kallus
When this set is permutation symmetric, the optimal design is complete randomization, and using a single partition (i.e., the design that only randomizes the treatment labels for each side of the partition) has minimax risk larger by a factor of $n-1$.
1 code implementation • 6 May 2020 • Yichun Hu, Nathan Kallus
While existing literature mostly focuses on estimating the optimal DTR from offline data such as from sequentially randomized trials, we study the problem of developing the optimal DTR in an online manner, where the interaction with each individual affects both our cumulative reward and our data collection for future learning.
no code implementations • 6 Apr 2020 • Nathan Kallus
I congratulate Profs.
no code implementations • 27 Mar 2020 • Nathan Kallus, Xiaojie Mao
However, there is often an abundance of observations on surrogate outcomes not of primary interest, such as short-term health effects or online-ad click-through.
1 code implementation • ICML 2020 • Andrew Bennett, Nathan Kallus
We show that, under a correct specification assumption, the weighted classification formulation need not be efficient for policy parameters.
no code implementations • NeurIPS 2020 • Nathan Kallus, Angela Zhou
We develop a robust approach that estimates sharp bounds on the (unidentifiable) value of a given policy in an infinite-horizon problem given data from another policy with unobserved confounding, subject to a sensitivity model.
no code implementations • ICML 2020 • Nathan Kallus, Masatoshi Uehara
Policy gradient methods in reinforcement learning update policy parameters by taking steps in the direction of an estimated gradient of policy value.
no code implementations • 21 Jan 2020 • Fredrik D. Johansson, Uri Shalit, Nathan Kallus, David Sontag
Practitioners in diverse fields such as healthcare, economics and education are eager to apply machine learning to improve decision making.
1 code implementation • 30 Dec 2019 • Nathan Kallus, Xiaojie Mao, Masatoshi Uehara
A central example is the efficient estimating equation for the (local) quantile treatment effect ((L)QTE) in causal inference, which involves as a nuisance the covariate-conditional cumulative distribution function evaluated at the quantile to be estimated.
1 code implementation • NeurIPS 2019 • Nathan Kallus, Angela Zhou
Personalized interventions in social services, education, and healthcare leverage individual-level causal effect predictions in order to give the best treatment to each individual or to prioritize program interventions for the individuals most likely to benefit.
no code implementations • 26 Oct 2019 • Nathan Kallus, Michele Santacatterina
In this paper, we propose Kernel Optimal Orthogonality Weighting (KOOW), a convex optimization-based method, for estimating the effects of continuous treatments.
no code implementations • 12 Sep 2019 • Nathan Kallus, Masatoshi Uehara
This precisely characterizes the curse of horizon: in time-variant processes, OPE is only feasible in the near-on-policy setting, where behavior and target policies are sufficiently similar.
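The curse is easiest to see in the classical cumulative-ratio importance-sampling estimator, whose weight is a product of one likelihood ratio per time step, so its variance can grow exponentially in the horizon; a sketch (the trajectory format and policy callables are assumptions, not the paper's estimator):

```python
import numpy as np

def trajectory_is(trajs, pi_e, pi_b, gamma=1.0):
    # trajs: list of trajectories [(s, a, r), ...]; pi_e/pi_b: callables
    # giving action probabilities under the target and behavior policies.
    vals = []
    for traj in trajs:
        w, ret = 1.0, 0.0
        for t, (s, a, r) in enumerate(traj):
            w *= pi_e(s, a) / pi_b(s, a)   # product of H likelihood ratios
            ret += (gamma ** t) * r
        vals.append(w * ret)
    return float(np.mean(vals))
```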
1 code implementation • 5 Sep 2019 • Yichun Hu, Nathan Kallus, Xiaojie Mao
We study a nonparametric contextual bandit problem where the expected reward functions belong to a H\"older class with smoothness parameter $\beta$.
1 code implementation • 22 Aug 2019 • Nathan Kallus, Masatoshi Uehara
Off-policy evaluation (OPE) in reinforcement learning allows one to evaluate novel decision policies without needing to conduct exploration, which is often costly or otherwise infeasible.
1 code implementation • 13 Aug 2019 • Nathan Kallus, Michele Santacatterina
In causal inference, a variety of causal effect estimands have been studied, including the sample, uncensored, target, conditional, optimal subpopulation, and optimal weighted average treatment effects.
1 code implementation • NeurIPS 2019 • Andrew Bennett, Nathan Kallus
We study the question of policy evaluation when we instead have proxies for the latent confounders and develop an importance weighting method that avoids fitting a latent outcome regression model.
1 code implementation • 20 Jun 2019 • Nathan Kallus
Policy learning can be used to extract individualized treatment regimes from observational data in healthcare, civics, e-commerce, and beyond.
1 code implementation • NeurIPS 2019 • Nathan Kallus, Masatoshi Uehara
We propose new estimators for OPE based on empirical likelihood that are always more efficient than IS, SNIS, and DR and satisfy the same stability and boundedness properties as SNIS.
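For reference, the IS and SNIS baselines named here, written in single-step (bandit) form for brevity with known propensities; this is only a sketch of the baselines, as the paper's empirical-likelihood estimator is more involved:

```python
import numpy as np

def is_value(rewards, pi_e, pi_b):
    # Plain importance sampling: unbiased but can be unstable and unbounded.
    w = pi_e / pi_b
    return float(np.mean(w * rewards))

def snis_value(rewards, pi_e, pi_b):
    # Self-normalized IS: biased but bounded by max |reward| and more stable.
    w = pi_e / pi_b
    return float(np.sum(w * rewards) / np.sum(w))
```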
1 code implementation • 4 Jun 2019 • Nathan Kallus, Angela Zhou
Personalized interventions in social services, education, and healthcare leverage individual-level causal effect predictions in order to give the best treatment to each individual or to prioritize program interventions for the individuals most likely to benefit.
1 code implementation • 1 Jun 2019 • Nathan Kallus, Xiaojie Mao, Angela Zhou
In this paper we study a fundamental challenge to assessing disparate impacts in practice: protected class membership is often not observed in the data.
1 code implementation • 1 Jun 2019 • Vishal Gupta, Nathan Kallus
This intuition further suggests that data-pooling offers the most benefits when there are many problems, each of which has a small amount of relevant data.
2 code implementations • NeurIPS 2019 • Andrew Bennett, Nathan Kallus, Tobias Schnabel
Instrumental variable analysis is a powerful tool for estimating causal effects when randomization or full control of confounders is not possible.
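For contrast with the paper's approach, here is the classical linear baseline: a minimal two-stage least squares (2SLS) sketch; it is only the textbook point of comparison, not the moment-based method this entry refers to:

```python
import numpy as np

def two_stage_least_squares(y, x, z):
    # Stage 1: project the endogenous regressors x onto the instruments z.
    x_hat = z @ np.linalg.lstsq(z, x, rcond=None)[0]
    # Stage 2: regress the outcome on the projected regressors.
    beta, *_ = np.linalg.lstsq(x_hat, y, rcond=None)
    return beta
```

The projection keeps only the variation in x explained by the instruments, which is exogenous under the IV assumptions; that is what makes the second-stage coefficients causal rather than merely predictive.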
1 code implementation • NeurIPS 2019 • Nathan Kallus, Angela Zhou
To better account for this, in this paper, we investigate the fairness of predictive risk scores from the point of view of a bipartite ranking task, where one seeks to rank positive examples higher than negative ones.
no code implementations • 14 Feb 2019 • Nathan Kallus
In the context of individual-level causal inference, we study the problem of predicting whether someone will respond or not to a treatment based on their features and past examples of features, treatment indicator (e.g., drug/no drug), and a binary outcome (e.g., recovery from disease).
1 code implementation • 27 Nov 2018 • Jiahao Chen, Nathan Kallus, Xiaojie Mao, Geoffry Svacha, Madeleine Udell
We also propose an alternative weighted estimator that uses soft classification, and show that its bias arises simply from the conditional covariance of the outcome with the true class membership.
1 code implementation • 10 Nov 2018 • Nathan Kallus, Brenton Pennicooke, Michele Santacatterina
Inverse probability of treatment weighting (IPTW), which has been used to estimate sample average treatment effects (SATE) using observational data, tenuously relies on the positivity assumption and the correct specification of the treatment assignment model, both of which are problematic assumptions in many observational studies.
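A minimal IPTW sketch for the average treatment effect with logistic-regression propensities; it illustrates the baseline this entry critiques, and its instability when propensities approach 0 or 1 is exactly the positivity concern raised above (illustrative only, not the paper's proposed method):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def iptw_ate(X, t, y):
    # Propensity scores e(x) = P(T=1 | X=x) from a working logistic model.
    e = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]
    # Horvitz-Thompson weighting; degrades as e approaches 0 or 1.
    return float(np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e)))
```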
no code implementations • NeurIPS 2018 • Nathan Kallus, Aahlad Manas Puli, Uri Shalit
We introduce a novel method of using limited experimental data to correct the hidden confounding in causal effect models trained on larger observational data, even if the observational data does not fully overlap with the experimental data.
no code implementations • 5 Oct 2018 • Nathan Kallus, Xiaojie Mao, Angela Zhou
We study the problem of learning conditional average treatment effects (CATE) from observational data with unobserved confounders.
no code implementations • ICML 2018 • Nathan Kallus, Angela Zhou
We connect these lines of work and study the residual unfairness that arises when a fairness-adjusted predictor is not actually fair on the target population due to systematic censoring of training data by existing biased policies.
1 code implementation • 4 Jun 2018 • Nathan Kallus, Michele Santacatterina
Marginal structural models (MSMs) estimate the causal effect of a time-varying treatment in the presence of time-dependent confounding via weighted regression.
1 code implementation • NeurIPS 2018 • Nathan Kallus, Xiaojie Mao, Madeleine Udell
Valid causal inference in observational studies often requires controlling for confounders.
no code implementations • NeurIPS 2018 • Nathan Kallus, Angela Zhou
We study the problem of learning personalized decision policies from observational data while accounting for possible unobserved confounding.
no code implementations • ICLR 2018 • Fredrik D. Johansson, Nathan Kallus, Uri Shalit, David Sontag
We pose both of these problems as prediction under a shift in design.
no code implementations • 16 Feb 2018 • Nathan Kallus, Angela Zhou
We study the problem of policy evaluation and learning from batched contextual bandit data when treatments are continuous, going beyond previous work on discrete treatments.
no code implementations • ICML 2020 • Nathan Kallus
We study optimal covariate balance for causal inferences from observational data when rich covariates and complex relationships necessitate flexible modeling with neural networks.
no code implementations • 21 May 2017 • Nathan Kallus
We argue for a particular kind of regret that captures the causal effect of treatments but show that standard MAB algorithms cannot achieve sublinear control on this regret.
no code implementations • NeurIPS 2018 • Nathan Kallus
We propose a new, balance-based approach that also makes the data look like the new policy, but does so directly by finding weights that optimize for balance between the weighted data and the target policy in the given, finite sample, which is equivalent to minimizing worst-case or posterior conditional mean square error.
no code implementations • 26 Dec 2016 • Nathan Kallus
We develop an encompassing framework for matching, covariate balancing, and doubly-robust methods for causal inference from observational data called generalized optimal matching (GOM).
no code implementations • 18 Oct 2016 • Nathan Kallus, Madeleine Udell
In the dynamic setting, we show that structure-aware dynamic assortment personalization can have regret that is an order of magnitude smaller than structure-ignorant approaches.
no code implementations • ICML 2017 • Nathan Kallus
We study the problem of learning to choose from m discrete treatment options (e.g., news item or medical drug) the one with the best causal effect for a particular instance (e.g., user or patient), where the training data consists of passive observations of covariates, treatment, and the outcome of the treatment.
no code implementations • 17 Sep 2015 • Nathan Kallus, Madeleine Udell
In our model, the preferences of each customer or segment follow a separate parametric choice model, but the underlying structure of these parameters over all the models has low dimension.
1 code implementation • 22 Feb 2014 • Dimitris Bertsimas, Nathan Kallus
To demonstrate the power of our approach in a real-world setting, we study an inventory management problem faced by the distribution arm of an international media conglomerate, which ships an average of 1 billion units per year.