Search Results for author: Pradeep Varakantham

Found 32 papers, 4 papers with code

Neural Approximate Dynamic Programming for On-Demand Ride-Pooling

1 code implementation 20 Nov 2019 Sanket Shah, Meghna Lowalekar, Pradeep Varakantham

This is because even a myopic assignment in ride-pooling involves considering which combinations of passenger requests can be assigned to vehicles, which adds a layer of combinatorial complexity to the ToD problem.

Generative Modelling of Stochastic Actions with Arbitrary Constraints in Reinforcement Learning

1 code implementation NeurIPS 2023 Changyu Chen, Ramesha Karunasena, Thanh Hong Nguyen, Arunesh Sinha, Pradeep Varakantham

Many problems in Reinforcement Learning (RL) seek an optimal policy with large discrete multidimensional yet unordered action spaces; these include problems in randomized allocation of resources such as placements of multiple security resources and emergency response units, etc.

reinforcement-learning Reinforcement Learning (RL) +1

Imitate the Good and Avoid the Bad: An Incremental Approach to Safe Reinforcement Learning

1 code implementation 16 Dec 2023 Huy Hoang, Tien Mai, Pradeep Varakantham

In an exhaustive set of experiments, we demonstrate that our approach is able to outperform top benchmark approaches for solving Constrained RL problems, with respect to expected cost, CVaR cost, or even unknown cost constraints.

Reinforcement Learning (RL) Safe Reinforcement Learning

Imitating Cost-Constrained Behaviors in Reinforcement Learning

1 code implementation 26 Mar 2024 Qian Shao, Pradeep Varakantham, Shih-Fen Cheng

Generally speaking, imitation learning is designed to learn either the reward (or preference) model or directly the behavioral policy by observing the behavior of an expert.

Imitation Learning reinforcement-learning +1

Entropy based Independent Learning in Anonymous Multi-Agent Settings

no code implementations 27 Mar 2018 Tanvi Verma, Pradeep Varakantham, Hoong Chuin Lau

A key characteristic of the domains of interest is that the interactions between individuals are anonymous, i.e., the outcome of an interaction (competing for demand) is dependent only on the number and not on the identity of the agents.

Fairness Multi-agent Reinforcement Learning

Resource Constrained Deep Reinforcement Learning

no code implementations 3 Dec 2018 Abhinav Bhatia, Pradeep Varakantham, Akshat Kumar

However, existing Deep RL methods are unable to handle combinatorial action spaces and constraints on allocation of resources.

Management reinforcement-learning +1

TuSeRACT: Turn-Sample-Based Real-Time Traffic Signal Control

no code implementations 13 Dec 2018 Srishti Dhamija, Pradeep Varakantham

To ensure real-time responsiveness in the presence of turn-induced uncertainty, SURTRAC computes schedules which minimize the delay for the expected turn movements as opposed to minimizing the expected delay under turn-induced uncertainty.

Scheduling

Regret based Robust Solutions for Uncertain Markov Decision Processes

no code implementations NeurIPS 2013 Asrar Ahmed, Pradeep Varakantham, Yossiri Adulyasak, Patrick Jaillet

Most robust optimization approaches for these problems have focussed on the computation of maximin policies which maximize the value corresponding to the worst realization of the uncertainty.

Solving Online Threat Screening Games using Constrained Action Space Reinforcement Learning

no code implementations 20 Nov 2019 Sanket Shah, Arunesh Sinha, Pradeep Varakantham, Andrew Perrault, Milind Tambe

To solve the online problem with a hard bound on risk, we formulate it as a Reinforcement Learning (RL) problem with constraints on the action space (hard bound on risk).

reinforcement-learning Reinforcement Learning (RL)

On Solving Cooperative MARL Problems with a Few Good Experiences

no code implementations 22 Jan 2020 Rajiv Ranjan Kumar, Pradeep Varakantham

Unfortunately, neither of these approaches (or their extensions) is able to address the problem of sparse good experiences effectively.

Descriptive Multi-agent Reinforcement Learning +2

Value Variance Minimization for Learning Approximate Equilibrium in Aggregation Systems

no code implementations 16 Mar 2020 Tanvi Verma, Pradeep Varakantham

For effective matching of resources (e.g., taxis, food, bikes, shopping items) to customer demand, aggregation systems have been extremely successful.

Multi-agent Reinforcement Learning

Zone pAth Construction (ZAC) based Approaches for Effective Real-Time Ridesharing

no code implementations 13 Sep 2020 Meghna Lowalekar, Pradeep Varakantham, Patrick Jaillet

This challenge has been addressed in existing work by: (i) generating as many relevant feasible (with respect to the available delay for customers) combinations of requests as possible in real-time; and then (ii) optimizing assignment of the feasible request combinations to vehicles.

Competitive Ratios for Online Multi-capacity Ridesharing

no code implementations 16 Sep 2020 Meghna Lowalekar, Pradeep Varakantham, Patrick Jaillet

The desired matching between resources and request groups is constrained by the edges between requests and request groups in this tripartite graph (i.e., a request can be part of at most one request group in the final assignment).

Learn to Intervene: An Adaptive Learning Policy for Restless Bandits in Application to Preventive Healthcare

no code implementations 17 May 2021 Arpita Biswas, Gaurav Aggarwal, Pradeep Varakantham, Milind Tambe

In many public health settings, it is important for patients to adhere to health programs, such as taking medications and periodic health checks.

Q-Learning

CLAIM: Curriculum Learning Policy for Influence Maximization in Unknown Social Networks

no code implementations 8 Jul 2021 Dexun Li, Meghna Lowalekar, Pradeep Varakantham

Influence maximization is the problem of finding a small subset of nodes in a network that can maximize the diffusion of information.

reinforcement-learning Reinforcement Learning (RL)

Facilitating human-wildlife cohabitation through conflict prediction

no code implementations 22 Sep 2021 Susobhan Ghosh, Pradeep Varakantham, Aniket Bhatkhande, Tamanna Ahmad, Anish Andheria, Wenjun Li, Aparna Taneja, Divy Thakkar, Milind Tambe

With increasing world population and expanded use of forests as cohabited regions, interactions and conflicts with wildlife are increasing, leading to large-scale loss of lives (animal and human) and livelihoods (economic).

Conditional Expectation based Value Decomposition for Scalable On-Demand Ride Pooling

no code implementations 1 Dec 2021 Avinandan Bose, Pradeep Varakantham

Owing to the benefits for customers (lower prices), drivers (higher revenues), aggregation companies (higher revenues) and the environment (fewer vehicles), on-demand ride pooling (e.g., Uber pool, Grab Share) has become quite popular.

Decision Making

Efficient Resource Allocation with Fairness Constraints in Restless Multi-Armed Bandits

no code implementations 8 Jun 2022 Dexun Li, Pradeep Varakantham

In this paper, we are interested in ensuring that RMAB decision making is also fair to different arms while maximizing expected value.

Decision Making Fairness +1

Towards Soft Fairness in Restless Multi-Armed Bandits

no code implementations 27 Jul 2022 Dexun Li, Pradeep Varakantham

To avoid starvation in the executed interventions across individuals/regions/communities, we first provide a soft fairness constraint and then provide an approach to enforce the soft fairness constraint in RMABs.

Fairness Multi-Armed Bandits

Learning Individual Policies in Large Multi-agent Systems through Local Variance Minimization

no code implementations 27 Dec 2022 Tanvi Verma, Pradeep Varakantham

In multi-agent systems with a large number of agents, typically the contribution of each agent to the value of other agents is minimal (e.g., aggregation systems such as Uber, Deliveroo).

Multi-agent Reinforcement Learning

Generalization through Diversity: Improving Unsupervised Environment Design

no code implementations 19 Jan 2023 Wenjun Li, Pradeep Varakantham, Dexun Li

Agent decision making using Reinforcement Learning (RL) heavily relies on either a model or simulator of the environment (e.g., moving in an 8x8 maze with three rooms, playing Chess on an 8x8 board).

Decision Making Reinforcement Learning (RL)

Solving Richly Constrained Reinforcement Learning through State Augmentation and Reward Penalties

no code implementations 27 Jan 2023 Hao Jiang, Tien Mai, Pradeep Varakantham, Minh Huy Hoang

Constrained Reinforcement Learning has been employed to enforce safety constraints on policy through the use of expected cost constraints.

reinforcement-learning Reinforcement Learning (RL)

Diversity Induced Environment Design via Self-Play

no code implementations 4 Feb 2023 Dexun Li, Wenjun Li, Pradeep Varakantham

In this paper, we aim to introduce diversity in the Unsupervised Environment Design (UED) framework.

Regret-Based Defense in Adversarial Reinforcement Learning

no code implementations 14 Feb 2023 Roman Belaire, Pradeep Varakantham, Thanh Nguyen, David Lo

We demonstrate that our approaches provide a significant improvement in performance across a wide variety of benchmarks against leading approaches for robust Deep RL.

reinforcement-learning Reinforcement Learning (RL)

Handling Long and Richly Constrained Tasks through Constrained Hierarchical Reinforcement Learning

no code implementations 21 Feb 2023 Yuxiao Lu, Arunesh Sinha, Pradeep Varakantham

Safety in goal-directed Reinforcement Learning (RL) settings has typically been handled through constraints over trajectories, and such approaches have demonstrated good performance primarily in short-horizon tasks.

Decision Making Hierarchical Reinforcement Learning +2

Future Aware Pricing and Matching for Sustainable On-demand Ride Pooling

no code implementations 21 Feb 2023 Xianjie Zhang, Pradeep Varakantham, Hao Jiang

Traditionally, both these challenges have been studied individually and using myopic approaches (considering only current requests), without considering the impact of current matching on addressing future requests.

Transferable Curricula through Difficulty Conditioned Generators

no code implementations 22 Jun 2023 Sidney Tio, Pradeep Varakantham

In this paper, we introduce a method named Parameterized Environment Response Model (PERM) that shows promising results in training RL agents in parameterized environments.

Reinforcement Learning (RL) Starcraft +1

Enhancing the Hierarchical Environment Design via Generative Trajectory Modeling

no code implementations 30 Sep 2023 Dexun Li, Pradeep Varakantham

Unsupervised Environment Design (UED) is a paradigm for automatically generating a curriculum of training environments, enabling agents trained in these environments to develop general capabilities, i.e., achieving good zero-shot transfer performance.

Trajectory Modeling

Training Reinforcement Learning Agents and Humans With Difficulty-Conditioned Generators

no code implementations 4 Dec 2023 Sidney Tio, Jimmy Ho, Pradeep Varakantham

We adapt the Parameterized Environment Response Model (PERM), a method for training both Reinforcement Learning (RL) agents and human learners in parameterized environments by directly modeling difficulty and ability.

reinforcement-learning Reinforcement Learning (RL)

SubIQ: Inverse Soft-Q Learning for Offline Imitation with Suboptimal Demonstrations

no code implementations 20 Feb 2024 Huy Hoang, Tien Mai, Pradeep Varakantham

Most of the existing offline IL methods developed for this setting are based on behavior cloning or distribution matching, where the aim is to match the occupancy distribution of the imitation policy with that of the expert policy.

Imitation Learning Q-Learning
