Search Results for author: Yash Chandak

Found 19 papers, 11 papers with code

Asymptotically Unbiased Off-Policy Policy Evaluation when Reusing Old Data in Nonstationary Environments

no code implementations 23 Feb 2023 Vincent Liu, Yash Chandak, Philip Thomas, Martha White

In this work, we consider the off-policy policy evaluation problem for contextual bandits and finite horizon reinforcement learning in the nonstationary setting.

Multi-Armed Bandits, Regression +1
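The contextual-bandit side of the problem above is commonly handled with inverse-propensity-scoring (IPS) estimators. The following is a minimal sketch of the stationary baseline (not the paper's bias-corrected method); all function and variable names are illustrative:

```python
import numpy as np

def ips_estimate(contexts, actions, rewards, logging_probs, target_policy):
    """Basic IPS off-policy value estimate for a contextual bandit:
    mean of pi(a|x) / mu(a|x) * r over logged interactions. Under
    nonstationarity, reusing old data this way can bias the estimate,
    which motivates the paper's correction."""
    weights = np.array([target_policy(x, a) for x, a in zip(contexts, actions)])
    weights /= np.asarray(logging_probs)
    return float(np.mean(weights * np.asarray(rewards)))

# Toy example: 2 actions, uniform logging policy, target always picks action 1.
rng = np.random.default_rng(0)
contexts = rng.normal(size=(1000, 3))
actions = rng.integers(0, 2, size=1000)
rewards = (actions == 1).astype(float)           # action 1 always pays 1
logging_probs = np.full(1000, 0.5)               # uniform logging policy
target = lambda x, a: 1.0 if a == 1 else 0.0     # deterministic target policy

print(ips_estimate(contexts, actions, rewards, logging_probs, target))  # ~1.0
```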

Optimization using Parallel Gradient Evaluations on Multiple Parameters

no code implementations 6 Feb 2023 Yash Chandak, Shiv Shankar, Venkata Gandikota, Philip S. Thomas, Arya Mazumdar

We propose a first-order method for convex optimization, where instead of being restricted to the gradient from a single parameter, gradients from multiple parameters can be used during each step of gradient descent.
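The idea of combining gradients from several parameter vectors per step can be sketched as follows. This is a simplified illustration (averaging the gradients), not the authors' exact update rule:

```python
import numpy as np

def parallel_grad_step(params_list, grad_fn, lr=0.1):
    """One descent step that uses gradients evaluated at several
    parameter vectors at once, combined by averaging, instead of the
    gradient at a single point."""
    grads = [grad_fn(p) for p in params_list]
    combined = np.mean(grads, axis=0)           # aggregate the parallel gradients
    center = np.mean(params_list, axis=0)       # step from the parameters' centroid
    return center - lr * combined

# Minimize f(x) = ||x||^2 (grad = 2x) starting from three points at once.
grad_fn = lambda x: 2.0 * x
points = [np.array([1.0, 2.0]), np.array([0.5, -1.0]), np.array([-1.5, 0.0])]
x = parallel_grad_step(points, grad_fn)
for _ in range(50):
    # keep iterating from small perturbations of the current iterate
    x = parallel_grad_step([x, x + 0.01, x - 0.01], grad_fn)
print(np.linalg.norm(x))  # close to 0
```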

Off-Policy Evaluation for Action-Dependent Non-Stationary Environments

1 code implementation 24 Jan 2023 Yash Chandak, Shiv Shankar, Nathaniel D. Bastian, Bruno Castro da Silva, Emma Brunskill, Philip S. Thomas

Methods for sequential decision-making are often built upon a foundational assumption that the underlying decision process is stationary.

Decision Making, Off-Policy Evaluation

On Optimizing Interventions in Shared Autonomy

1 code implementation 16 Dec 2021 Weihao Tan, David Koleczek, Siddhant Pradhan, Nicholas Perello, Vivek Chettiar, Vishal Rohra, Aaslesha Rajaram, Soundararajan Srinivasan, H M Sajjad Hossain, Yash Chandak

Shared autonomy refers to approaches for enabling an autonomous agent to collaborate with a human with the aim of improving human performance.

SOPE: Spectrum of Off-Policy Estimators

1 code implementation NeurIPS 2021 Christina J. Yuan, Yash Chandak, Stephen Giguere, Philip S. Thomas, Scott Niekum

In this paper, we present a new perspective on this bias-variance trade-off and show the existence of a spectrum of estimators whose endpoints are SIS and IS.

Decision Making, Off-Policy Evaluation
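For reference, the IS endpoint of that spectrum is ordinary full-trajectory importance sampling. A minimal sketch (the spectrum itself, and the SIS endpoint, are not implemented here):

```python
import numpy as np

def trajectory_is(trajs, pi, mu, gamma=1.0):
    """Ordinary (full-trajectory) importance-sampling OPE estimate.
    Each trajectory is a list of (state, action, reward) tuples;
    pi and mu return the target/logging action probabilities."""
    returns = []
    for traj in trajs:
        rho, g, disc = 1.0, 0.0, 1.0
        for s, a, r in traj:
            rho *= pi(s, a) / mu(s, a)   # cumulative importance weight
            g += disc * r                # discounted return
            disc *= gamma
        returns.append(rho * g)
    return float(np.mean(returns))

# Toy example: one state, two actions, length-1 trajectories, reward = action.
rng = np.random.default_rng(1)
mu = lambda s, a: 0.5                        # uniform logging policy
pi = lambda s, a: 1.0 if a == 1 else 0.0     # target: always action 1
trajs = [[(0, int(a), float(a))] for a in rng.integers(0, 2, size=2000)]
print(trajectory_is(trajs, pi, mu))  # ~1.0
```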

Universal Off-Policy Evaluation

1 code implementation NeurIPS 2021 Yash Chandak, Scott Niekum, Bruno Castro da Silva, Erik Learned-Miller, Emma Brunskill, Philip S. Thomas

When faced with sequential decision-making problems, it is often useful to be able to predict what would happen if decisions were made using a new policy.

Decision Making, Off-Policy Evaluation

High-Confidence Off-Policy (or Counterfactual) Variance Estimation

no code implementations 25 Jan 2021 Yash Chandak, Shiv Shankar, Philip S. Thomas

Many sequential decision-making systems leverage data collected using prior policies to propose a new policy.

Decision Making

Reinforcement Learning for Strategic Recommendations

no code implementations 15 Sep 2020 Georgios Theocharous, Yash Chandak, Philip S. Thomas, Frits de Nijs

Strategic recommendations (SR) refer to the problem where an intelligent agent observes the sequential behaviors and activities of users and decides when and how to interact with them to optimize some long-term objectives, both for the user and the business.

Reinforcement Learning (RL)

Optimizing for the Future in Non-Stationary MDPs

1 code implementation ICML 2020 Yash Chandak, Georgios Theocharous, Shiv Shankar, Martha White, Sridhar Mahadevan, Philip S. Thomas

Most reinforcement learning methods are based upon the key assumption that the transition dynamics and reward functions are fixed, that is, the underlying Markov decision process is stationary.

Classical Policy Gradient: Preserving Bellman's Principle of Optimality

no code implementations 6 Jun 2019 Philip S. Thomas, Scott M. Jordan, Yash Chandak, Chris Nota, James Kostas

We propose a new objective function for finite-horizon episodic Markov decision processes that better captures Bellman's principle of optimality, and provide an expression for the gradient of the objective.

Reinforcement Learning When All Actions are Not Always Available

1 code implementation 5 Jun 2019 Yash Chandak, Georgios Theocharous, Blossom Metevier, Philip S. Thomas

The Markov decision process (MDP) formulation used to model many real-world sequential decision making problems does not efficiently capture the setting where the set of available decisions (actions) at each time step is stochastic.

Decision Making, Reinforcement Learning +1
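The core change a stochastic action set forces on an agent can be sketched simply: action selection must be restricted to the actions sampled as available at the current step. A minimal illustration (not the paper's full algorithm):

```python
import numpy as np

def greedy_over_available(q_values, available):
    """Pick the greedy action among those currently available. In a
    stochastic-action-set MDP, `available` is resampled every time step,
    so the agent conditions its choice on it."""
    avail = np.flatnonzero(available)
    return int(avail[np.argmax(q_values[avail])])

q = np.array([1.0, 5.0, 3.0, 2.0])
# Action 1 (globally best) happens to be unavailable this step.
mask = np.array([True, False, True, True])
print(greedy_over_available(q, mask))  # 2
```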

Lifelong Learning with a Changing Action Set

1 code implementation 5 Jun 2019 Yash Chandak, Georgios Theocharous, Chris Nota, Philip S. Thomas

While problems like catastrophic forgetting, changing transition dynamics, and changing reward functions have been well-studied in the lifelong learning literature, the setting where the action set changes remains unaddressed.

Decision Making

Learning Action Representations for Reinforcement Learning

no code implementations 1 Feb 2019 Yash Chandak, Georgios Theocharous, James Kostas, Scott Jordan, Philip S. Thomas

Most model-free reinforcement learning methods leverage state representations (embeddings) for generalization, but either ignore structure in the space of actions or assume the structure is provided a priori.

Reinforcement Learning (RL)
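One common way to exploit structure in the action space is to embed discrete actions in a continuous space and decode a policy's continuous output by nearest neighbor. This is a toy sketch of that decoding step (with fixed embeddings, whereas the paper learns them):

```python
import numpy as np

def nearest_action(e, action_embeddings):
    """Map a point in an action-embedding space back to a discrete
    action by nearest neighbor."""
    d = np.linalg.norm(action_embeddings - e, axis=1)
    return int(np.argmin(d))

# 4 discrete actions embedded in 2-D; the policy proposes a continuous point.
emb = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
proposal = np.array([0.9, 0.2])   # continuous output of an internal policy
print(nearest_action(proposal, emb))  # 1
```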

HOPF: Higher Order Propagation Framework for Deep Collective Classification

1 code implementation 31 May 2018 Priyesh Vijayan, Yash Chandak, Mitesh M. Khapra, Srinivasan Parthasarathy, Balaraman Ravindran

Given a graph where every node has certain attributes associated with it and some nodes have labels associated with them, Collective Classification (CC) is the task of assigning labels to every unlabeled node using information from the node as well as its neighbors.

Classification, General Classification
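A minimal collective-classification baseline is iterative label propagation: unlabeled nodes repeatedly adopt the majority label of their labeled neighbors. Real CC methods (including HOPF) also use node attributes; this sketch uses graph structure only:

```python
def propagate(adj, labels, iters=10):
    """Label propagation: unlabeled nodes (label -1) adopt the majority
    label among their already-labeled neighbors; known labels stay fixed."""
    labels = list(labels)
    n = len(labels)
    for _ in range(iters):
        new = list(labels)
        for v in range(n):
            if labels[v] != -1:        # keep known labels fixed
                continue
            votes = [labels[u] for u in adj[v] if labels[u] != -1]
            if votes:
                new[v] = max(set(votes), key=votes.count)
        labels = new
    return labels

# Path graph 0-1-2-3; nodes 0 and 3 labeled, nodes 1 and 2 unlabeled.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(propagate(adj, [0, -1, -1, 1]))  # [0, 0, 1, 1]
```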

Fusion Graph Convolutional Networks

1 code implementation 31 May 2018 Priyesh Vijayan, Yash Chandak, Mitesh M. Khapra, Srinivasan Parthasarathy, Balaraman Ravindran

State-of-the-art models for node classification on such attributed graphs use differentiable recursive functions that enable aggregation and filtering of neighborhood information from multiple hops.

General Classification, Node Classification
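The "aggregation from multiple hops" idea boils down to stacking neighborhood-averaging layers: each layer mixes a node's features with its neighbors', so k layers reach k-hop neighborhoods. A minimal mean-aggregation layer (a generic graph-conv sketch, not the paper's fusion architecture):

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One mean-aggregation graph-conv layer: average each node's own and
    neighbors' features, then apply a linear map and ReLU."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    deg = a_hat.sum(axis=1, keepdims=True)
    h = (a_hat @ feats) / deg                   # neighborhood mean
    return np.maximum(h @ weight, 0.0)          # linear map + ReLU

# Tiny 3-node path graph, 2-dim features, identity weight matrix.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
feats = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])
w = np.eye(2)
h1 = gcn_layer(adj, feats, w)      # 1-hop information
h2 = gcn_layer(adj, h1, w)         # each node now sees 2-hop neighbors
print(h2.shape)  # (3, 2)
```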
