Search Results for author: Sanath Kumar Krishnamurthy

Found 8 papers, 0 papers with code

Selective Uncertainty Propagation in Offline RL

no code implementations • 1 Feb 2023 • Sanath Kumar Krishnamurthy, Shrey Modi, Tanmay Gangwani, Sumeet Katariya, Branislav Kveton, Anshuka Rangi

We consider the finite-horizon offline reinforcement learning (RL) setting, and are motivated by the challenge of learning the policy at any step h in dynamic programming (DP) algorithms.

Offline RL reinforcement-learning

Contextual Bandits in a Survey Experiment on Charitable Giving: Within-Experiment Outcomes versus Policy Learning

no code implementations • 22 Nov 2022 • Susan Athey, Undral Byambadalai, Vitor Hadad, Sanath Kumar Krishnamurthy, Weiwen Leung, Joseph Jay Williams

We design and implement an adaptive experiment (a "contextual bandit") to learn a targeted treatment assignment policy, where the goal is to use a participant's survey responses to determine which charity to expose them to in a donation solicitation.

Multi-Armed Bandits

Flexible and Efficient Contextual Bandits with Heterogeneous Treatment Effect Oracles

no code implementations • 30 Mar 2022 • Aldo Gael Carranza, Sanath Kumar Krishnamurthy, Susan Athey

Contextual bandit algorithms often estimate reward models to inform decision-making.

Decision Making Multi-Armed Bandits

Towards Costless Model Selection in Contextual Bandits: A Bias-Variance Perspective

no code implementations • 11 Jun 2021 • Sanath Kumar Krishnamurthy, Adrienne Margaret Propp, Susan Athey

Our algorithm is based on a novel misspecification test, and our analysis demonstrates the benefits of using model selection for reward estimation.

Model Selection Multi-Armed Bandits

Adapting to Misspecification in Contextual Bandits with Offline Regression Oracles

no code implementations • 26 Feb 2021 • Sanath Kumar Krishnamurthy, Vitor Hadad, Susan Athey

Computationally efficient contextual bandits are often based on estimating a predictive model of rewards given contexts and arms using past data.

Multi-Armed Bandits regression

Tractable contextual bandits beyond realizability

no code implementations • 25 Oct 2020 • Sanath Kumar Krishnamurthy, Vitor Hadad, Susan Athey

When realizability does not hold, our algorithm ensures the same guarantees on regret achieved by realizability-based algorithms under realizability, up to an additive term that accounts for the misspecification error.

Multi-Armed Bandits

Survey Bandits with Regret Guarantees

no code implementations • 23 Feb 2020 • Sanath Kumar Krishnamurthy, Susan Athey

We consider a variant of the contextual bandit problem.

Multi-Armed Bandits

Groupwise Maximin Fair Allocation of Indivisible Goods

no code implementations • 21 Nov 2017 • Siddharth Barman, Arpita Biswas, Sanath Kumar Krishnamurthy, Y. Narahari

We also establish the existence of approximate GMMS allocations under additive valuations, and develop a polynomial-time algorithm to find such allocations.

Fairness
