Search Results for author: Yash Chandak

Found 23 papers, 11 papers with code

A/B testing under Interference with Partial Network Information

no code implementations • 16 Apr 2024 • Shiv Shankar, Ritwik Sinha, Yash Chandak, Saayan Mitra, Madalina Fiterau

A/B tests are often required to be conducted on subjects that might have social connections.

Adaptive Instrument Design for Indirect Experiments

no code implementations • 5 Dec 2023 • Yash Chandak, Shiv Shankar, Vasilis Syrgkanis, Emma Brunskill

Indirect experiments provide a valuable framework for estimating treatment effects in situations where conducting randomized control trials (RCTs) is impractical or unethical.

Coagent Networks: Generalized and Scaled

no code implementations • 16 May 2023 • James E. Kostas, Scott M. Jordan, Yash Chandak, Georgios Theocharous, Dhawal Gupta, Martha White, Bruno Castro da Silva, Philip S. Thomas

However, the coagent framework is not just an alternative to BDL; the two approaches can be blended: BDL can be combined with coagent learning rules to create architectures with the advantages of both approaches.

Reinforcement Learning (RL)

Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition

no code implementations • 1 May 2023 • Yash Chandak, Shantanu Thakoor, Zhaohan Daniel Guo, Yunhao Tang, Remi Munos, Will Dabney, Diana L Borsa

Representation learning and exploration are among the key challenges for any deep reinforcement learning agent.

reinforcement-learning Representation Learning

Asymptotically Unbiased Off-Policy Policy Evaluation when Reusing Old Data in Nonstationary Environments

no code implementations • 23 Feb 2023 • Vincent Liu, Yash Chandak, Philip Thomas, Martha White

In this work, we consider the off-policy policy evaluation problem for contextual bandits and finite horizon reinforcement learning in the nonstationary setting.

Multi-Armed Bandits regression +2

Optimization using Parallel Gradient Evaluations on Multiple Parameters

no code implementations • 6 Feb 2023 • Yash Chandak, Shiv Shankar, Venkata Gandikota, Philip S. Thomas, Arya Mazumdar

We propose a first-order method for convex optimization, where instead of being restricted to the gradient from a single parameter, gradients from multiple parameters can be used during each step of gradient descent.

Off-Policy Evaluation for Action-Dependent Non-Stationary Environments

1 code implementation • 24 Jan 2023 • Yash Chandak, Shiv Shankar, Nathaniel D. Bastian, Bruno Castro da Silva, Emma Brunskill, Philip S. Thomas

Methods for sequential decision-making are often built upon a foundational assumption that the underlying decision process is stationary.

counterfactual Counterfactual Reasoning +2

Understanding Self-Predictive Learning for Reinforcement Learning

no code implementations • 6 Dec 2022 • Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Ávila Pires, Yash Chandak, Rémi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko

We identify that faster-paced optimization of the predictor and semi-gradient updates on the representation are crucial to preventing representation collapse.

reinforcement-learning Reinforcement Learning (RL) +1

On Optimizing Interventions in Shared Autonomy

1 code implementation • 16 Dec 2021 • Weihao Tan, David Koleczek, Siddhant Pradhan, Nicholas Perello, Vivek Chettiar, Vishal Rohra, Aaslesha Rajaram, Soundararajan Srinivasan, H M Sajjad Hossain, Yash Chandak

Shared autonomy refers to approaches for enabling an autonomous agent to collaborate with a human with the aim of improving human performance.

SOPE: Spectrum of Off-Policy Estimators

1 code implementation • NeurIPS 2021 • Christina J. Yuan, Yash Chandak, Stephen Giguere, Philip S. Thomas, Scott Niekum

In this paper, we present a new perspective on this bias-variance trade-off and show the existence of a spectrum of estimators whose endpoints are SIS and IS.

Decision Making Off-policy evaluation

Universal Off-Policy Evaluation

1 code implementation • NeurIPS 2021 • Yash Chandak, Scott Niekum, Bruno Castro da Silva, Erik Learned-Miller, Emma Brunskill, Philip S. Thomas

When faced with sequential decision-making problems, it is often useful to be able to predict what would happen if decisions were made using a new policy.

counterfactual Decision Making +1

High-Confidence Off-Policy (or Counterfactual) Variance Estimation

no code implementations • 25 Jan 2021 • Yash Chandak, Shiv Shankar, Philip S. Thomas

Many sequential decision-making systems leverage data collected using prior policies to propose a new policy.

counterfactual Decision Making +1

Towards Safe Policy Improvement for Non-Stationary MDPs

1 code implementation • NeurIPS 2020 • Yash Chandak, Scott M. Jordan, Georgios Theocharous, Martha White, Philip S. Thomas

Many real-world sequential decision-making problems involve critical systems with financial risks and human-life risks.

Decision Making reinforcement-learning +4

Reinforcement Learning for Strategic Recommendations

no code implementations • 15 Sep 2020 • Georgios Theocharous, Yash Chandak, Philip S. Thomas, Frits de Nijs

Strategic recommendations (SR) refer to the problem where an intelligent agent observes the sequential behaviors and activities of users and decides when and how to interact with them to optimize some long-term objectives, both for the user and the business.

reinforcement-learning Reinforcement Learning (RL)

Evaluating the Performance of Reinforcement Learning Algorithms

1 code implementation • ICML 2020 • Scott M. Jordan, Yash Chandak, Daniel Cohen, Mengxue Zhang, Philip S. Thomas

Performance evaluations are critical for quantifying algorithmic advances in reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

Optimizing for the Future in Non-Stationary MDPs

1 code implementation • ICML 2020 • Yash Chandak, Georgios Theocharous, Shiv Shankar, Martha White, Sridhar Mahadevan, Philip S. Thomas

Most reinforcement learning methods are based upon the key assumption that the transition dynamics and reward functions are fixed, that is, the underlying Markov decision process is stationary.

Classical Policy Gradient: Preserving Bellman's Principle of Optimality

no code implementations • 6 Jun 2019 • Philip S. Thomas, Scott M. Jordan, Yash Chandak, Chris Nota, James Kostas

We propose a new objective function for finite-horizon episodic Markov decision processes that better captures Bellman's principle of optimality, and provide an expression for the gradient of the objective.

Reinforcement Learning When All Actions are Not Always Available

1 code implementation • 5 Jun 2019 • Yash Chandak, Georgios Theocharous, Blossom Metevier, Philip S. Thomas

The Markov decision process (MDP) formulation used to model many real-world sequential decision making problems does not efficiently capture the setting where the set of available decisions (actions) at each time step is stochastic.

Decision Making reinforcement-learning +1

Lifelong Learning with a Changing Action Set

1 code implementation • 5 Jun 2019 • Yash Chandak, Georgios Theocharous, Chris Nota, Philip S. Thomas

… have been well-studied in the lifelong learning literature, the setting where the action set changes remains unaddressed.

Decision Making

Learning Action Representations for Reinforcement Learning

no code implementations • 1 Feb 2019 • Yash Chandak, Georgios Theocharous, James Kostas, Scott Jordan, Philip S. Thomas

Most model-free reinforcement learning methods leverage state representations (embeddings) for generalization, but either ignore structure in the space of actions or assume the structure is provided a priori.

reinforcement-learning Reinforcement Learning (RL)

Fusion Graph Convolutional Networks

1 code implementation • 31 May 2018 • Priyesh Vijayan, Yash Chandak, Mitesh M. Khapra, Srinivasan Parthasarathy, Balaraman Ravindran

State-of-the-art models for node classification on such attributed graphs use differentiable recursive functions that enable aggregation and filtering of neighborhood information from multiple hops.

General Classification Node Classification

HOPF: Higher Order Propagation Framework for Deep Collective Classification

1 code implementation • 31 May 2018 • Priyesh Vijayan, Yash Chandak, Mitesh M. Khapra, Srinivasan Parthasarathy, Balaraman Ravindran

Given a graph where every node has certain attributes associated with it and some nodes have labels associated with them, Collective Classification (CC) is the task of assigning labels to every unlabeled node using information from the node as well as its neighbors.

Attribute Classification +1

On Optimizing Human-Machine Task Assignments

no code implementations • 24 Sep 2015 • Andreas Veit, Michael Wilber, Rajan Vaish, Serge Belongie, James Davis, Vishal Anand, Anshu Aviral, Prithvijit Chakrabarty, Yash Chandak, Sidharth Chaturvedi, Chinmaya Devaraj, Ankit Dhall, Utkarsh Dwivedi, Sanket Gupte, Sharath N. Sridhar, Karthik Paga, Anuj Pahuja, Aditya Raisinghani, Ayush Sharma, Shweta Sharma, Darpana Sinha, Nisarg Thakkar, K. Bala Vignesh, Utkarsh Verma, Kanniganti Abhishek, Amod Agrawal, Arya Aishwarya, Aurgho Bhattacharjee, Sarveshwaran Dhanasekar, Venkata Karthik Gullapalli, Shuchita Gupta, Chandana G, Kinjal Jain, Simran Kapur, Meghana Kasula, Shashi Kumar, Parth Kundaliya, Utkarsh Mathur, Alankrit Mishra, Aayush Mudgal, Aditya Nadimpalli, Munakala Sree Nihit, Akanksha Periwal, Ayush Sagar, Ayush Shah, Vikas Sharma, Yashovardhan Sharma, Faizal Siddiqui, Virender Singh, Abhinav S., Anurag. D. Yadav

When crowdsourcing systems are used in combination with machine inference systems in the real world, they benefit the most when the machine system is deeply integrated with the crowd workers.
