Search Results for author: Pradeep Varakantham

Found 32 papers, 4 papers with code

Neural Approximate Dynamic Programming for On-Demand Ride-Pooling

1 code implementation • 20 Nov 2019 • Sanket Shah, Meghna Lowalekar, Pradeep Varakantham

This is because even a myopic assignment in ride-pooling involves considering which combinations of passenger requests can be assigned to vehicles, which adds a layer of combinatorial complexity to the ToD problem.

Generative Modelling of Stochastic Actions with Arbitrary Constraints in Reinforcement Learning

1 code implementation • NeurIPS 2023 • Changyu Chen, Ramesha Karunasena, Thanh Hong Nguyen, Arunesh Sinha, Pradeep Varakantham

Many problems in Reinforcement Learning (RL) seek an optimal policy with large discrete multidimensional yet unordered action spaces; these include problems in randomized allocation of resources such as placements of multiple security resources and emergency response units, etc.

reinforcement-learning Reinforcement Learning (RL) +1

Imitate the Good and Avoid the Bad: An Incremental Approach to Safe Reinforcement Learning

1 code implementation • 16 Dec 2023 • Huy Hoang, Tien Mai, Pradeep Varakantham

In an exhaustive set of experiments, we demonstrate that our approach is able to outperform top benchmark approaches for solving Constrained RL problems, with respect to expected cost, CVaR cost, or even unknown cost constraints.

Reinforcement Learning (RL) Safe Reinforcement Learning

Imitating Cost-Constrained Behaviors in Reinforcement Learning

1 code implementation • 26 Mar 2024 • Qian Shao, Pradeep Varakantham, Shih-Fen Cheng

Generally speaking, imitation learning is designed to learn either the reward (or preference) model or directly the behavioral policy by observing the behavior of an expert.

Imitation Learning reinforcement-learning +1

Entropy based Independent Learning in Anonymous Multi-Agent Settings

no code implementations • 27 Mar 2018 • Tanvi Verma, Pradeep Varakantham, Hoong Chuin Lau

A key characteristic of the domains of interest is that the interactions between individuals are anonymous, i.e., the outcome of an interaction (competing for demand) is dependent only on the number and not on the identity of the agents.

Fairness Multi-agent Reinforcement Learning

Resource Constrained Deep Reinforcement Learning

no code implementations • 3 Dec 2018 • Abhinav Bhatia, Pradeep Varakantham, Akshat Kumar

However, existing Deep RL methods are unable to handle combinatorial action spaces and constraints on allocation of resources.

Management reinforcement-learning +1

TuSeRACT: Turn-Sample-Based Real-Time Traffic Signal Control

no code implementations • 13 Dec 2018 • Srishti Dhamija, Pradeep Varakantham

To ensure real-time responsiveness in the presence of turn-induced uncertainty, SURTRAC computes schedules which minimize the delay for the expected turn movements as opposed to minimizing the expected delay under turn-induced uncertainty.

Scheduling

Regret based Robust Solutions for Uncertain Markov Decision Processes

no code implementations • NeurIPS 2013 • Asrar Ahmed, Pradeep Varakantham, Yossiri Adulyasak, Patrick Jaillet

Most robust optimization approaches for these problems have focussed on the computation of maximin policies which maximize the value corresponding to the worst realization of the uncertainty.

Solving Online Threat Screening Games using Constrained Action Space Reinforcement Learning

no code implementations • 20 Nov 2019 • Sanket Shah, Arunesh Sinha, Pradeep Varakantham, Andrew Perrault, Milind Tambe

To solve the online problem with a hard bound on risk, we formulate it as a Reinforcement Learning (RL) problem with constraints on the action space (hard bound on risk).

reinforcement-learning Reinforcement Learning (RL)

On Solving Cooperative MARL Problems with a Few Good Experiences

no code implementations • 22 Jan 2020 • Rajiv Ranjan Kumar, Pradeep Varakantham

Unfortunately, neither of these approaches (or their extensions) is able to address the problem of sparse good experiences effectively.

Descriptive Multi-agent Reinforcement Learning +2

Value Variance Minimization for Learning Approximate Equilibrium in Aggregation Systems

no code implementations • 16 Mar 2020 • Tanvi Verma, Pradeep Varakantham

For effective matching of resources (e.g., taxis, food, bikes, shopping items) to customer demand, aggregation systems have been extremely successful.

Multi-agent Reinforcement Learning

Zone pAth Construction (ZAC) based Approaches for Effective Real-Time Ridesharing

no code implementations • 13 Sep 2020 • Meghna Lowalekar, Pradeep Varakantham, Patrick Jaillet

This challenge has been addressed in existing work by: (i) generating as many relevant feasible (with respect to the available delay for customers) combinations of requests as possible in real-time; and then (ii) optimizing assignment of the feasible request combinations to vehicles.

Competitive Ratios for Online Multi-capacity Ridesharing

no code implementations • 16 Sep 2020 • Meghna Lowalekar, Pradeep Varakantham, Patrick Jaillet

The desired matching between resources and request groups is constrained by the edges between requests and request groups in this tripartite graph (i.e., a request can be part of at most one request group in the final assignment).

Selective Intervention Planning using Restless Multi-Armed Bandits to Improve Maternal and Child Health Outcomes

no code implementations • 7 Mar 2021 • Siddharth Nishtala, Lovish Madaan, Aditya Mate, Harshavardhan Kamarthi, Anirudh Grama, Divy Thakkar, Dhyanesh Narayanan, Suresh Chaudhary, Neha Madhiwalla, Ramesh Padmanabhan, Aparna Hegde, Pradeep Varakantham, Balaraman Ravindran, Milind Tambe

India has a maternal mortality ratio of 113 and child mortality ratio of 2830 per 100,000 live births.

Multi-Armed Bandits

Learn to Intervene: An Adaptive Learning Policy for Restless Bandits in Application to Preventive Healthcare

no code implementations • 17 May 2021 • Arpita Biswas, Gaurav Aggarwal, Pradeep Varakantham, Milind Tambe

In many public health settings, it is important for patients to adhere to health programs, such as taking medications and periodic health checks.

Q-Learning

CLAIM: Curriculum Learning Policy for Influence Maximization in Unknown Social Networks

no code implementations • 8 Jul 2021 • Dexun Li, Meghna Lowalekar, Pradeep Varakantham

Influence maximization is the problem of finding a small subset of nodes in a network that can maximize the diffusion of information.

reinforcement-learning Reinforcement Learning (RL)

Field Study in Deploying Restless Multi-Armed Bandits: Assisting Non-Profits in Improving Maternal and Child Health

no code implementations • 16 Sep 2021 • Aditya Mate, Lovish Madaan, Aparna Taneja, Neha Madhiwalla, Shresth Verma, Gargi Singh, Aparna Hegde, Pradeep Varakantham, Milind Tambe

Our second major contribution is evaluation of our RMAB system in collaboration with an NGO, via a real-world service quality improvement study.

Multi-Armed Bandits

Facilitating human-wildlife cohabitation through conflict prediction

no code implementations • 22 Sep 2021 • Susobhan Ghosh, Pradeep Varakantham, Aniket Bhatkhande, Tamanna Ahmad, Anish Andheria, Wenjun Li, Aparna Taneja, Divy Thakkar, Milind Tambe

With increasing world population and expanded use of forests as cohabited regions, interactions and conflicts with wildlife are increasing, leading to large-scale loss of lives (animal and human) and livelihoods (economic).

Conditional Expectation based Value Decomposition for Scalable On-Demand Ride Pooling

no code implementations • 1 Dec 2021 • Avinandan Bose, Pradeep Varakantham

Owing to the benefits for customers (lower prices), drivers (higher revenues), aggregation companies (higher revenues) and the environment (fewer vehicles), on-demand ride pooling (e.g., Uber pool, Grab Share) has become quite popular.

Decision Making

Efficient Resource Allocation with Fairness Constraints in Restless Multi-Armed Bandits

no code implementations • 8 Jun 2022 • Dexun Li, Pradeep Varakantham

In this paper, we are interested in ensuring that RMAB decision making is also fair to different arms while maximizing expected value.

Decision Making Fairness +1

Towards Soft Fairness in Restless Multi-Armed Bandits

no code implementations • 27 Jul 2022 • Dexun Li, Pradeep Varakantham

To avoid starvation in the executed interventions across individuals/regions/communities, we first provide a soft fairness constraint and then provide an approach to enforce the soft fairness constraint in RMABs.

Fairness Multi-Armed Bandits

Learning Individual Policies in Large Multi-agent Systems through Local Variance Minimization

no code implementations • 27 Dec 2022 • Tanvi Verma, Pradeep Varakantham

In multi-agent systems with a large number of agents, typically the contribution of each agent to the value of other agents is minimal (e.g., aggregation systems such as Uber, Deliveroo).

Multi-agent Reinforcement Learning

Generalization through Diversity: Improving Unsupervised Environment Design

no code implementations • 19 Jan 2023 • Wenjun Li, Pradeep Varakantham, Dexun Li

Agent decision making using Reinforcement Learning (RL) heavily relies on either a model or simulator of the environment (e.g., moving in an 8x8 maze with three rooms, playing Chess on an 8x8 board).

Decision Making Reinforcement Learning (RL)

Solving Richly Constrained Reinforcement Learning through State Augmentation and Reward Penalties

no code implementations • 27 Jan 2023 • Hao Jiang, Tien Mai, Pradeep Varakantham, Minh Huy Hoang

Constrained Reinforcement Learning has been employed to enforce safety constraints on policy through the use of expected cost constraints.

reinforcement-learning Reinforcement Learning (RL)

Diversity Induced Environment Design via Self-Play

no code implementations • 4 Feb 2023 • Dexun Li, Wenjun Li, Pradeep Varakantham

In this paper, we aim to introduce diversity in the Unsupervised Environment Design (UED) framework.

Regret-Based Defense in Adversarial Reinforcement Learning

no code implementations • 14 Feb 2023 • Roman Belaire, Pradeep Varakantham, Thanh Nguyen, David Lo

We demonstrate that our approaches provide a significant improvement in performance across a wide variety of benchmarks against leading approaches for robust Deep RL.

reinforcement-learning Reinforcement Learning (RL)

Handling Long and Richly Constrained Tasks through Constrained Hierarchical Reinforcement Learning

no code implementations • 21 Feb 2023 • Yuxiao Lu, Arunesh Sinha, Pradeep Varakantham

Safety in goal-directed Reinforcement Learning (RL) settings has typically been handled through constraints over trajectories, an approach that has demonstrated good performance primarily in short-horizon tasks.

Decision Making Hierarchical Reinforcement Learning +2

Future Aware Pricing and Matching for Sustainable On-demand Ride Pooling

no code implementations • 21 Feb 2023 • Xianjie Zhang, Pradeep Varakantham, Hao Jiang

Traditionally, both these challenges have been studied individually and using myopic approaches (considering only current requests), without considering the impact of current matching on addressing future requests.

Transferable Curricula through Difficulty Conditioned Generators

no code implementations • 22 Jun 2023 • Sidney Tio, Pradeep Varakantham

In this paper, we introduce a method named Parameterized Environment Response Model (PERM) that shows promising results in training RL agents in parameterized environments.

Reinforcement Learning (RL) Starcraft +1

Enhancing the Hierarchical Environment Design via Generative Trajectory Modeling

no code implementations • 30 Sep 2023 • Dexun Li, Pradeep Varakantham

Unsupervised Environment Design (UED) is a paradigm for automatically generating a curriculum of training environments, enabling agents trained in these environments to develop general capabilities, i.e., achieving good zero-shot transfer performance.

Trajectory Modeling

Training Reinforcement Learning Agents and Humans With Difficulty-Conditioned Generators

no code implementations • 4 Dec 2023 • Sidney Tio, Jimmy Ho, Pradeep Varakantham

We adapt Parameterized Environment Response Model (PERM), a method for training both Reinforcement Learning (RL) Agents and human learners in parameterized environments by directly modeling difficulty and ability.

reinforcement-learning Reinforcement Learning (RL)

SubIQ: Inverse Soft-Q Learning for Offline Imitation with Suboptimal Demonstrations

no code implementations • 20 Feb 2024 • Huy Hoang, Tien Mai, Pradeep Varakantham

Most of the existing offline IL methods developed for this setting are based on behavior cloning or distribution matching, where the aim is to match the occupancy distribution of the imitation policy with that of the expert policy.

Imitation Learning Q-Learning
