no code implementations • NeurIPS 2013 • Asrar Ahmed, Pradeep Varakantham, Yossiri Adulyasak, Patrick Jaillet
Most robust optimization approaches for these problems have focused on the computation of maximin policies, which maximize the value corresponding to the worst realization of the uncertainty.
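A maximin policy of the kind described above can be written compactly as the policy that maximizes value under the worst-case realization of the uncertainty. The notation below is an illustrative sketch, not taken from the paper:

```latex
\pi^{\text{maximin}} \in \arg\max_{\pi} \; \min_{\xi \in \Xi} \; V(\pi, \xi)
```

where $\Xi$ is the uncertainty set and $V(\pi, \xi)$ denotes the value of policy $\pi$ under realization $\xi$.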
no code implementations • 27 Mar 2018 • Tanvi Verma, Pradeep Varakantham, Hoong Chuin Lau
A key characteristic of the domains of interest is that the interactions between individuals are anonymous, i.e., the outcome of an interaction (competing for demand) is dependent only on the number and not on the identity of the agents.
no code implementations • 3 Dec 2018 • Abhinav Bhatia, Pradeep Varakantham, Akshat Kumar
However, existing Deep RL methods are unable to handle combinatorial action spaces and constraints on allocation of resources.
no code implementations • 13 Dec 2018 • Srishti Dhamija, Pradeep Varakantham
To ensure real-time responsiveness in the presence of turn-induced uncertainty, SURTRAC computes schedules which minimize the delay for the expected turn movements as opposed to minimizing the expected delay under turn-induced uncertainty.
1 code implementation • 20 Nov 2019 • Sanket Shah, Meghna Lowalekar, Pradeep Varakantham
This is because even a myopic assignment in ride-pooling involves considering which combinations of passenger requests can be assigned to vehicles, which adds a layer of combinatorial complexity to the ToD problem.
no code implementations • 20 Nov 2019 • Sanket Shah, Arunesh Sinha, Pradeep Varakantham, Andrew Perrault, Milind Tambe
To solve the online problem with a hard bound on risk, we formulate it as a Reinforcement Learning (RL) problem with constraints on the action space (hard bound on risk).
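One common way to impose a hard bound on the action space in RL is action masking: infeasible (too-risky) actions are filtered out before the agent selects among the rest. The sketch below is illustrative only and is not the paper's implementation; the `risk_of` function, the risk values, and the budget are hypothetical.

```python
import numpy as np

def masked_argmax(q_values, actions, risk_of, risk_budget):
    """Pick the highest-Q action whose risk stays within a hard budget.

    q_values:    array of Q-value estimates, one per action.
    risk_of:     hypothetical function mapping an action to its risk.
    risk_budget: hard upper bound on allowed per-action risk.
    """
    feasible = [i for i, a in enumerate(actions) if risk_of(a) <= risk_budget]
    if not feasible:
        raise ValueError("no action satisfies the hard risk bound")
    return actions[max(feasible, key=lambda i: q_values[i])]

# toy usage: three actions with increasing Q-values and increasing risk;
# the riskiest (highest-Q) action is masked out by the budget
acts = ["a0", "a1", "a2"]
q = np.array([1.0, 5.0, 9.0])
risk = {"a0": 0.1, "a1": 0.3, "a2": 0.9}.get
print(masked_argmax(q, acts, risk, risk_budget=0.5))  # prints "a1"
```

The key design point is that the risk bound is enforced as a hard constraint on the action set at decision time, rather than as a penalty term in the reward.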
no code implementations • 22 Jan 2020 • Rajiv Ranjan Kumar, Pradeep Varakantham
Unfortunately, neither of these approaches (or their extensions) are able to address the problem of sparse good experiences effectively.
no code implementations • 16 Mar 2020 • Tanvi Verma, Pradeep Varakantham
Aggregation systems have been extremely successful at matching resources (e.g., taxis, food, bikes, shopping items) to customer demand.
no code implementations • 13 Sep 2020 • Meghna Lowalekar, Pradeep Varakantham, Patrick Jaillet
This challenge has been addressed in existing work by: (i) generating as many relevant feasible (with respect to the available delay for customers) combinations of requests as possible in real-time; and then (ii) optimizing assignment of the feasible request combinations to vehicles.
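The two-stage pattern above — (i) enumerate feasible request combinations, (ii) optimize their assignment to vehicles — can be sketched as follows. This is a simplified illustration, not the paper's algorithm: the feasibility predicate is a placeholder for the real delay check, and a greedy heuristic stands in for the actual assignment optimization.

```python
from itertools import combinations

def feasible_groups(requests, max_group, is_feasible):
    """Stage (i): enumerate request combinations that pass a feasibility check
    (a placeholder for the real customer-delay constraint)."""
    groups = []
    for k in range(1, max_group + 1):
        for combo in combinations(requests, k):
            if is_feasible(combo):
                groups.append(combo)
    return groups

def greedy_assign(groups, vehicles, value):
    """Stage (ii): greedily assign highest-value feasible groups to vehicles,
    serving each request at most once (stand-in for the real optimization)."""
    assignment, served = {}, set()
    for g in sorted(groups, key=value, reverse=True):
        if served.isdisjoint(g) and len(assignment) < len(vehicles):
            assignment[vehicles[len(assignment)]] = g
            served.update(g)
    return assignment

# toy usage: any single request is feasible; a pair is feasible only if the
# request ids are adjacent (a stand-in for "close enough to pool")
reqs = [1, 2, 3, 4]
ok = lambda c: len(c) == 1 or max(c) - min(c) == 1
groups = feasible_groups(reqs, 2, ok)
print(greedy_assign(groups, ["v1", "v2"], value=len))  # {'v1': (1, 2), 'v2': (3, 4)}
```

The real-time difficulty the excerpt refers to is that stage (i) grows combinatorially with group size, which is why generating only "relevant" feasible combinations matters.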
no code implementations • 16 Sep 2020 • Meghna Lowalekar, Pradeep Varakantham, Patrick Jaillet
The desired matching between resources and request groups is constrained by the edges between requests and request groups in this tripartite graph (i.e., a request can be part of at most one request group in the final assignment).
no code implementations • 7 Mar 2021 • Siddharth Nishtala, Lovish Madaan, Aditya Mate, Harshavardhan Kamarthi, Anirudh Grama, Divy Thakkar, Dhyanesh Narayanan, Suresh Chaudhary, Neha Madhiwalla, Ramesh Padmanabhan, Aparna Hegde, Pradeep Varakantham, Balaraman Ravindran, Milind Tambe
India has a maternal mortality ratio of 113 and a child mortality ratio of 2,830 per 100,000 live births.
no code implementations • 17 May 2021 • Arpita Biswas, Gaurav Aggarwal, Pradeep Varakantham, Milind Tambe
In many public health settings, it is important for patients to adhere to health programs, such as taking medications and periodic health checks.
no code implementations • 8 Jul 2021 • Dexun Li, Meghna Lowalekar, Pradeep Varakantham
Influence maximization is the problem of finding a small subset of nodes in a network that can maximize the diffusion of information.
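The standard baseline for influence maximization is greedy seed selection under the independent cascade model: repeatedly add the node whose inclusion most increases the simulated spread. The sketch below is a generic illustration of that baseline, not the method of the paper above; edge probability, trial count, and the toy graph are arbitrary choices.

```python
import random

def simulate_ic(adj, seeds, p, rng):
    """One independent-cascade simulation: each newly active node gets one
    chance to activate each inactive out-neighbor with probability p."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in adj.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(active)

def greedy_im(adj, k, p=0.2, trials=200, seed=0):
    """Greedily pick k seed nodes maximizing the Monte-Carlo spread estimate."""
    rng = random.Random(seed)
    seeds = set()
    nodes = set(adj) | {v for vs in adj.values() for v in vs}
    for _ in range(k):
        best = max(nodes - seeds,
                   key=lambda n: sum(simulate_ic(adj, seeds | {n}, p, rng)
                                     for _ in range(trials)))
        seeds.add(best)
    return seeds

# toy star graph: node 0 can reach all others, leaf nodes reach no one,
# so node 0 is the best single seed
graph = {0: [1, 2, 3, 4]}
print(greedy_im(graph, 1))  # {0}
```

Greedy selection is attractive here because the expected spread is monotone and submodular, which gives the greedy solution a (1 - 1/e) approximation guarantee.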
no code implementations • 16 Sep 2021 • Aditya Mate, Lovish Madaan, Aparna Taneja, Neha Madhiwalla, Shresth Verma, Gargi Singh, Aparna Hegde, Pradeep Varakantham, Milind Tambe
Our second major contribution is evaluation of our RMAB system in collaboration with an NGO, via a real-world service quality improvement study.
no code implementations • 22 Sep 2021 • Susobhan Ghosh, Pradeep Varakantham, Aniket Bhatkhande, Tamanna Ahmad, Anish Andheria, Wenjun Li, Aparna Taneja, Divy Thakkar, Milind Tambe
With increasing world population and expanded use of forests as cohabited regions, interactions and conflicts with wildlife are increasing, leading to large-scale loss of lives (animal and human) and livelihoods (economic).
no code implementations • 1 Dec 2021 • Avinandan Bose, Pradeep Varakantham
Owing to the benefits for customers (lower prices), drivers (higher revenues), aggregation companies (higher revenues) and the environment (fewer vehicles), on-demand ride pooling (e.g., Uber pool, Grab Share) has become quite popular.
no code implementations • 8 Jun 2022 • Dexun Li, Pradeep Varakantham
In this paper, we are interested in ensuring that RMAB decision making is also fair to different arms while maximizing expected value.
no code implementations • 27 Jul 2022 • Dexun Li, Pradeep Varakantham
To avoid starvation in the executed interventions across individuals/regions/communities, we first provide a soft fairness constraint and then provide an approach to enforce the soft fairness constraint in RMABs.
no code implementations • 27 Dec 2022 • Tanvi Verma, Pradeep Varakantham
In multi-agent systems with a large number of agents, the contribution of each agent to the value of other agents is typically minimal (e.g., in aggregation systems such as Uber and Deliveroo).
no code implementations • 19 Jan 2023 • Wenjun Li, Pradeep Varakantham, Dexun Li
Agent decision making using Reinforcement Learning (RL) heavily relies on either a model or simulator of the environment (e.g., moving in an 8x8 maze with three rooms, playing Chess on an 8x8 board).
no code implementations • 27 Jan 2023 • Hao Jiang, Tien Mai, Pradeep Varakantham, Minh Huy Hoang
Constrained Reinforcement Learning has been employed to enforce safety constraints on policy through the use of expected cost constraints.
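Expected cost constraints of this kind are usually formalized as a constrained MDP objective. The notation below is a standard illustrative form, not copied from the paper:

```latex
\max_{\pi} \; \mathbb{E}_{\pi}\Big[\sum_{t} \gamma^{t}\, r(s_t, a_t)\Big]
\quad \text{s.t.} \quad
\mathbb{E}_{\pi}\Big[\sum_{t} \gamma^{t}\, c(s_t, a_t)\Big] \le \tau
```

where $r$ is the reward, $c$ the safety cost, and $\tau$ the cost budget. Because the constraint binds only in expectation, individual trajectories can still violate safety, which motivates stricter formulations.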
no code implementations • 4 Feb 2023 • Dexun Li, Wenjun Li, Pradeep Varakantham
In this paper, we aim to introduce diversity in the Unsupervised Environment Design (UED) framework.
no code implementations • 14 Feb 2023 • Roman Belaire, Pradeep Varakantham, Thanh Nguyen, David Lo
We demonstrate that our approaches provide a significant improvement in performance across a wide variety of benchmarks against leading approaches for robust Deep RL.
no code implementations • 21 Feb 2023 • Yuxiao Lu, Arunesh Sinha, Pradeep Varakantham
Safety in goal-directed Reinforcement Learning (RL) settings has typically been handled through constraints over trajectories; such approaches have demonstrated good performance, primarily in short-horizon tasks.
no code implementations • 21 Feb 2023 • Xianjie Zhang, Pradeep Varakantham, Hao Jiang
Traditionally, both these challenges have been studied individually and using myopic approaches (considering only current requests), without considering the impact of current matching on addressing future requests.
no code implementations • 22 Jun 2023 • Sidney Tio, Pradeep Varakantham
In this paper, we introduce a method named Parameterized Environment Response Model (PERM) that shows promising results in training RL agents in parameterized environments.
no code implementations • 30 Sep 2023 • Dexun Li, Pradeep Varakantham
Unsupervised Environment Design (UED) is a paradigm for automatically generating a curriculum of training environments, enabling agents trained in these environments to develop general capabilities, i.e., achieving good zero-shot transfer performance.
1 code implementation • NeurIPS 2023 • Changyu Chen, Ramesha Karunasena, Thanh Hong Nguyen, Arunesh Sinha, Pradeep Varakantham
Many problems in Reinforcement Learning (RL) seek an optimal policy with large discrete multidimensional yet unordered action spaces; these include problems in randomized allocation of resources, such as the placement of multiple security resources or emergency response units.
no code implementations • 4 Dec 2023 • Sidney Tio, Jimmy Ho, Pradeep Varakantham
We adapt Parameterized Environment Response Model (PERM), a method for training both Reinforcement Learning (RL) Agents and human learners in parameterized environments by directly modeling difficulty and ability.
1 code implementation • 16 Dec 2023 • Huy Hoang, Tien Mai, Pradeep Varakantham
In an exhaustive set of experiments, we demonstrate that our approach is able to outperform top benchmark approaches for solving Constrained RL problems, with respect to expected cost, CVaR cost, or even unknown cost constraints.
no code implementations • 20 Feb 2024 • Huy Hoang, Tien Mai, Pradeep Varakantham
Most of the existing offline IL methods developed for this setting are based on behavior cloning or distribution matching, where the aim is to match the occupancy distribution of the imitation policy with that of the expert policy.
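Distribution matching of this kind is conventionally written as minimizing a divergence between occupancy measures. The notation below is a standard illustrative form, not taken from the paper:

```latex
\min_{\pi} \; D\big(\rho_{\pi} \,\|\, \rho_{E}\big)
```

where $\rho_{\pi}$ and $\rho_{E}$ are the state-action occupancy distributions of the imitation and expert policies, and $D$ is a divergence such as the KL divergence.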
1 code implementation • 26 Mar 2024 • Qian Shao, Pradeep Varakantham, Shih-Fen Cheng
Generally speaking, imitation learning is designed to learn either the reward (or preference) model or the behavioral policy directly by observing the behavior of an expert.