Search Results for author: Milind Tambe

Found 85 papers, 20 papers with code

Rule-Bottleneck Reinforcement Learning: Joint Explanation and Decision Optimization for Resource Allocation with Language Agents

no code implementations 15 Feb 2025 Mauricio Tec, Guojun Xiong, Haichuan Wang, Francesca Dominici, Milind Tambe

Deep Reinforcement Learning (RL) is remarkably effective in addressing sequential resource allocation problems in domains such as healthcare, public policy, and resource management.

reinforcement-learning, Reinforcement Learning (RL)

On Sequential Fault-Intolerant Process Planning

no code implementations 7 Feb 2025 Andrzej Kaczmarczyk, Davin Choo, Niclas Boehmer, Milind Tambe, Haifeng Xu

We propose and study a planning problem we call Sequential Fault-Intolerant Process Planning (SFIPP).

Multilinguality in LLM-Designed Reward Functions for Restless Bandits: Effects on Task Performance and Fairness

no code implementations 20 Jan 2025 Ambreesh Parthasarathy, Chandrasekar Subramanian, Ganesh Senrayan, Shreyash Adappanavar, Aparna Taneja, Balaraman Ravindran, Milind Tambe

In this work, we study the effects on both task performance and fairness when the DLM algorithm, a recent work on using LLMs to design reward functions for RMABs, is prompted with non-English language commands.

Fairness, Multi-Armed Bandits

Finite-Horizon Single-Pull Restless Bandits: An Efficient Index Policy For Scarce Resource Allocation

no code implementations 10 Jan 2025 Guojun Xiong, Haichuan Wang, Yuqi Pan, Saptarshi Mandal, Sanket Shah, Niclas Boehmer, Milind Tambe

However, in many practical settings with highly scarce resources, where each agent can only receive at most one resource, such as healthcare intervention programs, the standard RMAB framework falls short.

Multi-Armed Bandits

IRL for Restless Multi-Armed Bandits with Applications in Maternal and Child Health

1 code implementation 11 Dec 2024 Gauri Jain, Pradeep Varakantham, Haifeng Xu, Aparna Taneja, Prashant Doshi, Milind Tambe

To address this shortcoming, this paper is the first to present the use of inverse reinforcement learning (IRL) to learn desired rewards for RMABs, and we demonstrate improved outcomes in a maternal and child health telehealth program.

Multi-Armed Bandits

Towards Foundation-model-based Multiagent System to Accelerate AI for Social Impact

no code implementations 10 Dec 2024 Yunfan Zhao, Niclas Boehmer, Aparna Taneja, Milind Tambe

AI for social impact (AI4SI) offers significant potential for addressing complex societal challenges in areas such as public health, agriculture, education, conservation, and public safety.

Contrasting local and global modeling with machine learning and satellite data: A case study estimating tree canopy height in African savannas

no code implementations 21 Nov 2024 Esther Rolf, Lucia Gordon, Milind Tambe, Andrew Davies

While advances in machine learning with satellite imagery (SatML) are facilitating environmental monitoring at a global scale, developing SatML models that are accurate and useful for local regions remains critical to understanding and acting on an ever-changing planet.

On Diffusion Models for Multi-Agent Partial Observability: Shared Attractors, Error Bounds, and Composite Flow

no code implementations 17 Oct 2024 Tonghan Wang, Heng Dong, Yanchen Jiang, David C. Parkes, Milind Tambe

We further find that, with deep learning approximation errors, fixed points can deviate from the true states, and the deviation is negatively correlated with the Jacobian rank.

Find Rhinos without Finding Rhinos: Active Learning with Multimodal Imagery of South African Rhino Habitats

1 code implementation 26 Sep 2024 Lucia Gordon, Nikhil Behari, Samuel Collier, Elizabeth Bondi-Kelly, Jackson A. Killian, Catherine Ressijac, Peter Boucher, Andrew Davies, Milind Tambe

Much of Earth's charismatic megafauna is endangered by human activities; the rhino in particular is at risk of extinction due to the poaching crisis in Africa.

Active Learning

What is the Right Notion of Distance between Predict-then-Optimize Tasks?

no code implementations 11 Sep 2024 Paula Rodriguez-Diaz, Lingkai Kong, Kai Wang, David Alvarez-Melis, Milind Tambe

Comparing datasets is a fundamental task in machine learning, essential for various learning paradigms, from evaluating train and test datasets for model generalization to using dataset similarity for detecting data drift.

Informativeness

Improving the Prediction of Individual Engagement in Recommendations Using Cognitive Models

no code implementations 28 Aug 2024 Roderick Seow, Yunfan Zhao, Duncan Wood, Milind Tambe, Cleotilde Gonzalez

For public health programs with limited resources, the ability to predict how behaviors change over time and in response to interventions is crucial for deciding when and to whom interventions should be allocated.

Decision Making, Time Series

Balancing Act: Prioritization Strategies for LLM-Designed Restless Bandit Rewards

no code implementations 22 Aug 2024 Shresth Verma, Niclas Boehmer, Lingkai Kong, Milind Tambe

In the presence of multiple agents, altering the reward function based on human preferences can impact subpopulations very differently, leading to complex tradeoffs and a multi-objective resource allocation problem.

Language Modeling, Language Modelling +2

The Bandit Whisperer: Communication Learning for Restless Bandits

no code implementations 11 Aug 2024 Yunfan Zhao, Tonghan Wang, Dheeraj Nagaraj, Aparna Taneja, Milind Tambe

Applying Reinforcement Learning (RL) to Restless Multi-Arm Bandits (RMABs) offers a promising avenue for addressing allocation problems with resource constraints and temporal dynamics.

Reinforcement Learning (RL)

Combining Diverse Information for Coordinated Action: Stochastic Bandit Algorithms for Heterogeneous Agents

1 code implementation 6 Aug 2024 Lucia Gordon, Esther Rolf, Milind Tambe

Stochastic multi-agent multi-armed bandits typically assume that the rewards from each arm follow a fixed distribution, regardless of which agent pulls the arm.

Multi-Armed Bandits
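
To make the modeling question above concrete, here is a minimal, self-contained sketch of the heterogeneous setting in which the reward distribution depends on both the agent and the arm. The numbers and the plain per-agent UCB1 baseline are illustrative assumptions, not the paper's algorithm.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (not from the paper): 3 agents, 4 arms, and reward means
# that depend on *both* the agent and the arm, relaxing the usual assumption
# that each arm's distribution is the same no matter which agent pulls it.
n_agents, n_arms, horizon = 3, 4, 2000
means = rng.uniform(0.1, 0.9, size=(n_agents, n_arms))

counts = np.ones((n_agents, n_arms))           # one fake initial pull per pair to avoid /0
totals = rng.binomial(1, means).astype(float)  # ...and its observed Bernoulli reward

for t in range(2, horizon + 1):
    for a in range(n_agents):
        # Independent UCB1 per agent: empirical mean plus optimism bonus.
        bonus = np.sqrt(2.0 * np.log(t) / counts[a])
        arm = int(np.argmax(totals[a] / counts[a] + bonus))
        reward = rng.binomial(1, means[a, arm])
        counts[a, arm] += 1
        totals[a, arm] += reward

print("true means:\n", means.round(2))
print("empirical means:\n", (totals / counts).round(2))

Because the distributions are agent-dependent, agents cannot simply pool their observations; how to combine such diverse information for coordinated action is the question the paper studies.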

Transcendence: Generative Models Can Outperform The Experts That Train Them

no code implementations 17 Jun 2024 Edwin Zhang, Vincent Zhu, Naomi Saphra, Anat Kleiman, Benjamin L. Edelman, Milind Tambe, Sham M. Kakade, Eran Malach

Generative models are trained with the simple objective of imitating the conditional probability distribution induced by the data they are trained on.
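
For reference, the imitation objective described above is the standard maximum-likelihood objective (the notation here is ours, not the paper's): train the model parameters $\theta$ to maximize

$\max_{\theta}\; \mathbb{E}_{(x,\, y)\sim \mathcal{D}}\big[\log p_{\theta}(y \mid x)\big]$

where $\mathcal{D}$ is the training data of contexts $x$ and expert-generated outputs $y$; the paper asks when a model trained this way can nonetheless outperform the experts that produced $\mathcal{D}$.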

Application-Driven Innovation in Machine Learning

no code implementations 26 Mar 2024 David Rolnick, Alan Aspuru-Guzik, Sara Beery, Bistra Dilkina, Priya L. Donti, Marzyeh Ghassemi, Hannah Kerner, Claire Monteleoni, Esther Rolf, Milind Tambe, Adam White

As applications of machine learning proliferate, innovative algorithms inspired by specific real-world challenges have become increasingly important.

A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health

no code implementations 22 Feb 2024 Nikhil Behari, Edwin Zhang, Yunfan Zhao, Aparna Taneja, Dheeraj Nagaraj, Milind Tambe

In this paper, we propose a Decision Language Model (DLM) for RMABs, enabling dynamic fine-tuning of RMAB policies in public health settings using human-language commands.

Language Modeling, Language Modelling +1

Social Environment Design

1 code implementation 21 Feb 2024 Edwin Zhang, Sadie Zhao, Tonghan Wang, Safwan Hossain, Henry Gasztowtt, Stephan Zheng, David C. Parkes, Milind Tambe, YiLing Chen

Artificial Intelligence (AI) holds promise as a technology that can be used to improve government and economic policy-making.

Decision Making

Evaluating the Effectiveness of Index-Based Treatment Allocation

no code implementations 19 Feb 2024 Niclas Boehmer, Yash Nair, Sanket Shah, Lucas Janson, Aparna Taneja, Milind Tambe

When resources are scarce, an allocation policy is needed to decide who receives a resource.

valid

Context in Public Health for Underserved Communities: A Bayesian Approach to Online Restless Bandits

no code implementations 7 Feb 2024 Biyonka Liang, Lily Xu, Aparna Taneja, Milind Tambe, Lucas Janson

Public health programs often provide interventions to encourage program adherence, and effectively allocating interventions is vital for producing the greatest overall health outcomes, especially in underserved communities where resources are limited.

Reinforcement Learning (RL), Thompson Sampling

Toward Computationally Efficient Inverse Reinforcement Learning via Reward Shaping

no code implementations 15 Dec 2023 Lauren H. Cooke, Harvey Klyne, Edwin Zhang, Cassidy Laidlaw, Milind Tambe, Finale Doshi-Velez

Inverse reinforcement learning (IRL) is computationally challenging, with common approaches requiring the solution of multiple reinforcement learning (RL) sub-problems.

reinforcement-learning, Reinforcement Learning +1

Analyzing and Predicting Low-Listenership Trends in a Large-Scale Mobile Health Program: A Preliminary Investigation

no code implementations 13 Nov 2023 Arshika Lalan, Shresth Verma, Kumar Madhu Sudan, Amrita Mahale, Aparna Hegde, Milind Tambe, Aparna Taneja

Mobile health programs are becoming an increasingly popular medium for dissemination of health information among beneficiaries in less privileged communities.

Time Series, Time Series Prediction

Towards a Pretrained Model for Restless Bandits via Multi-arm Generalization

no code implementations 23 Oct 2023 Yunfan Zhao, Nikhil Behari, Edward Hughes, Edwin Zhang, Dheeraj Nagaraj, Karl Tuyls, Aparna Taneja, Milind Tambe

Restless multi-arm bandits (RMABs), a class of resource allocation problems with broad application in areas such as healthcare, online advertising, and anti-poaching, have recently been studied from a multi-agent reinforcement learning perspective.

Multi-agent Reinforcement Learning, Multi-Armed Bandits +1

Equitable Restless Multi-Armed Bandits: A General Framework Inspired By Digital Health

1 code implementation 17 Aug 2023 Jackson A. Killian, Manish Jain, Yugang Jia, Jonathan Amar, Erich Huang, Milind Tambe

RMABs are increasingly being used for sensitive decisions such as in public health, treatment scheduling, anti-poaching, and -- the motivation for this work -- digital health.

Decision Making, Fairness +2

Reflections from the Workshop on AI-Assisted Decision Making for Conservation

no code implementations 17 Jul 2023 Lily Xu, Esther Rolf, Sara Beery, Joseph R. Bennett, Tanya Berger-Wolf, Tanya Birch, Elizabeth Bondi-Kelly, Justin Brashares, Melissa Chapman, Anthony Corso, Andrew Davies, Nikhil Garg, Angela Gaylard, Robert Heilmayr, Hannah Kerner, Konstantin Klemmer, Vipin Kumar, Lester Mackey, Claire Monteleoni, Paul Moorcroft, Jonathan Palmer, Andrew Perrault, David Thau, Milind Tambe

In this white paper, we synthesize key points made during presentations and discussions from the AI-Assisted Decision Making for Conservation workshop, hosted by the Center for Research on Computation and Society at Harvard University on October 20-21, 2022.

Decision Making

Leaving the Nest: Going Beyond Local Loss Functions for Predict-Then-Optimize

no code implementations 26 May 2023 Sanket Shah, Andrew Perrault, Bryan Wilder, Milind Tambe

In this paper, we propose solutions to these issues, avoiding the aforementioned assumptions and utilizing the ML model's features to increase the sample efficiency of learning loss functions.

Decision Making, Decision Making Under Uncertainty

Fairness for Workers Who Pull the Arms: An Index Based Policy for Allocation of Restless Bandit Tasks

no code implementations 1 Mar 2023 Arpita Biswas, Jackson A. Killian, Paula Rodriguez Diaz, Susobhan Ghosh, Milind Tambe

The goal is to plan an intervention schedule that maximizes the expected reward while satisfying budget constraints on each worker as well as fairness in terms of the load assigned to each worker.

Fairness, Multi-Armed Bandits +1

Improved Policy Evaluation for Randomized Trials of Algorithmic Resource Allocation

no code implementations 6 Feb 2023 Aditya Mate, Bryan Wilder, Aparna Taneja, Milind Tambe

We consider the task of evaluating policies of algorithmic resource allocation through randomized controlled trials (RCTs).

counterfactual

Decision-Focused Evaluation: Analyzing Performance of Deployed Restless Multi-Arm Bandits

no code implementations 19 Jan 2023 Paritosh Verma, Shresth Verma, Aditya Mate, Aparna Taneja, Milind Tambe

Restless multi-arm bandits (RMABs) are a popular decision-theoretic framework used to model real-world sequential decision-making problems in public health, wildlife conservation, communication systems, and beyond.

Decision Making, Sequential Decision Making

Artificial Intelligence and Life in 2030: The One Hundred Year Study on Artificial Intelligence

no code implementations 31 Oct 2022 Peter Stone, Rodney Brooks, Erik Brynjolfsson, Ryan Calo, Oren Etzioni, Greg Hager, Julia Hirschberg, Shivaram Kalyanakrishnan, Ece Kamar, Sarit Kraus, Kevin Leyton-Brown, David Parkes, William Press, AnnaLee Saxenian, Julie Shah, Milind Tambe, Astro Teller

In September 2016, Stanford's "One Hundred Year Study on Artificial Intelligence" project (AI100) issued the first report of its planned long-term periodic assessment of artificial intelligence (AI) and its impact on society.

Indexability is Not Enough for Whittle: Improved, Near-Optimal Algorithms for Restless Bandits

1 code implementation 31 Oct 2022 Abheek Ghosh, Dheeraj Nagaraj, Manish Jain, Milind Tambe

Whittle index policies, which are based on Lagrangian relaxations, are widely used in these settings due to their simplicity and near-optimality under certain conditions.

Multi-Armed Bandits
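
For context, the Lagrangian-relaxation view of Whittle index policies mentioned above is the standard one (this is a textbook definition, not specific to this paper): the per-round budget constraint is relaxed with a subsidy $\lambda$ paid whenever an arm is kept passive, each arm then becomes an independent single-arm problem, and the Whittle index of arm $i$ in state $s$ is the smallest subsidy at which being passive becomes optimal,

$W_i(s) = \inf\{\lambda : \text{the passive action is optimal for arm } i \text{ in state } s \text{ under subsidy } \lambda\}.$

The policy pulls the arms with the largest current indices. Indexability is the condition under which this infimum is well defined for every state; the title's point is that indexability alone is not enough to guarantee good performance.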

Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits

1 code implementation 30 Sep 2022 Siddhartha Banerjee, Sean R. Sinclair, Milind Tambe, Lily Xu, Christina Lee Yu

We show that Artificial-Replay uses only a fraction of the historical data compared to a full warm-start approach, while still achieving identical regret for base algorithms that satisfy independence of irrelevant data (IIData), a novel and broadly applicable property that we introduce.

Open-Ended Question Answering
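
One plausible reading of the meta-algorithm described above, sketched as a small wrapper (the names, the base algorithm, and the exact replay rule are our assumptions, not the paper's code): before spending a real pull, check whether unused historical samples exist for the arm the base algorithm wants to play, and feed those to it first. History for arms the base algorithm never proposes is simply never consumed, which is why only a fraction of the data may be used.

import random
from collections import defaultdict

class EpsilonGreedy:
    """Minimal base bandit, included only to make the sketch runnable."""
    def __init__(self, n_arms, eps=0.1):
        self.n, self.eps = n_arms, eps
        self.counts = [0] * n_arms
        self.means = [0.0] * n_arms
    def select(self):
        if random.random() < self.eps or 0 in self.counts:
            return random.randrange(self.n)
        return max(range(self.n), key=lambda a: self.means[a])
    def update(self, arm, reward):
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]

class ReplayWrapper:
    """Replay-before-pull wrapper: if the base algorithm proposes an arm with
    unused historical samples, feed it a historical reward instead of a real pull."""
    def __init__(self, base, history):
        self.base = base
        self.buffer = defaultdict(list)          # arm -> unused historical rewards
        for arm, reward in history:
            self.buffer[arm].append(reward)
    def step(self, pull_arm):
        arm = self.base.select()
        if self.buffer[arm]:
            reward, real = self.buffer[arm].pop(), False   # consumed from history
        else:
            reward, real = pull_arm(arm), True             # real, costly pull
        self.base.update(arm, reward)
        return real                               # whether a real pull was spent

# Toy usage: Bernoulli arms, some logged data, count how many real pulls we spend.
random.seed(0)
probs = [0.2, 0.5, 0.8]
history = [(a, int(random.random() < probs[a])) for a in (0, 0, 1, 1, 2) for _ in range(20)]
agent = ReplayWrapper(EpsilonGreedy(len(probs)), history)
real_pulls = sum(agent.step(lambda a: int(random.random() < probs[a])) for _ in range(500))
print("real pulls used:", real_pulls, "of 500 rounds")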

Optimistic Whittle Index Policy: Online Learning for Restless Bandits

1 code implementation 30 May 2022 Kai Wang*, Lily Xu, Aparna Taneja, Milind Tambe

Restless multi-armed bandits (RMABs) extend multi-armed bandits to allow for stateful arms, where the state of each arm evolves restlessly with different transitions depending on whether that arm is pulled.

Multi-Armed Bandits
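
A minimal illustration of the restless dynamics described above, with made-up transition probabilities (they are not taken from the paper): each arm has two states and evolves under a different Markov transition matrix depending on whether it is pulled in that round.

import numpy as np

rng = np.random.default_rng(1)

# Each arm has 2 states (0 = disengaged, 1 = engaged) and *different*
# transition matrices depending on whether it is pulled (active) or not (passive).
n_arms, budget, horizon = 5, 2, 50
P_passive = np.array([[0.9, 0.1],    # row = current state, column = next state
                      [0.4, 0.6]])
P_active  = np.array([[0.5, 0.5],
                      [0.1, 0.9]])

states = rng.integers(0, 2, size=n_arms)
total_reward = 0
for t in range(horizon):
    # Myopic placeholder policy: pull the `budget` arms currently in the bad state.
    pulled = np.argsort(states)[:budget]
    for i in range(n_arms):
        P = P_active if i in pulled else P_passive
        states[i] = rng.choice(2, p=P[states[i]])
    total_reward += states.sum()     # reward = number of engaged arms this round

print("average engaged arms per round:", total_reward / horizon)

The paper's policy would replace the myopic rule above with arms ranked by optimistically estimated Whittle indices learned online; the sketch only illustrates the stateful, pull-dependent dynamics.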

Ranked Prioritization of Groups in Combinatorial Bandit Allocation

1 code implementation 11 May 2022 Lily Xu, Arpita Biswas, Fei Fang, Milind Tambe

Preventing poaching through ranger patrols protects endangered wildlife, directly contributing to the UN Sustainable Development Goal 15 of life on land.

Evolutionary Approach to Security Games with Signaling

no code implementations 29 Apr 2022 Adam Żychowski, Jacek Mańdziuk, Elizabeth Bondi, Aravind Venugopal, Milind Tambe, Balaraman Ravindran

Green Security Games have become a popular way to model scenarios involving the protection of natural resources, such as wildlife.

ADVISER: AI-Driven Vaccination Intervention Optimiser for Increasing Vaccine Uptake in Nigeria

no code implementations 28 Apr 2022 Vineet Nair, Kritika Prakash, Michael Wilbur, Aparna Taneja, Corinne Namblard, Oyindamola Adeyemo, Abhishek Dubey, Abiodun Adereni, Milind Tambe, Ayan Mukhopadhyay

More than 5 million children under five years die from largely preventable or treatable medical conditions every year, with an overwhelmingly large proportion of deaths occurring in under-developed countries with low vaccination uptake.

Decision-Focused Learning without Differentiable Optimization: Learning Locally Optimized Decision Losses

no code implementations 30 Mar 2022 Sanket Shah, Kai Wang, Bryan Wilder, Andrew Perrault, Milind Tambe

Decision-Focused Learning (DFL) is a paradigm for tailoring a predictive model to a downstream optimization task that uses its predictions in order to perform better on that specific task.

Decision Making

Proceedings of the Artificial Intelligence for Cyber Security (AICS) Workshop at AAAI 2022

no code implementations 28 Feb 2022 James Holt, Edward Raff, Ahmad Ridley, Dennis Ross, Arunesh Sinha, Diane Staheli, William Streilen, Milind Tambe, Yevgeniy Vorobeychik, Allan Wollaber

These challenges are widely studied in enterprise networks, but there are many gaps in research and practice as well as novel problems in other domains.

Scalable Decision-Focused Learning in Restless Multi-Armed Bandits with Application to Maternal and Child Health

no code implementations 2 Feb 2022 Kai Wang, Shresth Verma, Aditya Mate, Sanket Shah, Aparna Taneja, Neha Madhiwalla, Aparna Hegde, Milind Tambe

To address this shortcoming, we propose a novel approach for decision-focused learning in RMAB that directly trains the predictive model to maximize the Whittle index solution quality.

Multi-Armed Bandits, Scheduling

Networked Restless Multi-Armed Bandits for Mobile Interventions

no code implementations 28 Jan 2022 Han-Ching Ou, Christoph Siebenbrunner, Jackson Killian, Meredith B Brooks, David Kempe, Yevgeniy Vorobeychik, Milind Tambe

Motivated by a broad class of mobile intervention problems, we propose and study restless multi-armed bandits (RMABs) with network effects.

Multi-Armed Bandits

Facilitating human-wildlife cohabitation through conflict prediction

no code implementations 22 Sep 2021 Susobhan Ghosh, Pradeep Varakantham, Aniket Bhatkhande, Tamanna Ahmad, Anish Andheria, Wenjun Li, Aparna Taneja, Divy Thakkar, Milind Tambe

With a growing world population and the expanded use of forests as cohabited regions, interactions and conflicts with wildlife are increasing, leading to large-scale loss of lives (animal and human) and livelihoods (economic).

Prediction

Restless and Uncertain: Robust Policies for Restless Bandits via Deep Multi-Agent Reinforcement Learning

no code implementations 4 Jul 2021 Jackson A. Killian, Lily Xu, Arpita Biswas, Milind Tambe

Our approach uses a double oracle framework (oracles for agent and nature), which is often used for single-process robust planning but requires significant new techniques to accommodate the combinatorial nature of RMABs.

Deep Reinforcement Learning, Multi-agent Reinforcement Learning +2

Q-Learning Lagrange Policies for Multi-Action Restless Bandits

1 code implementation 22 Jun 2021 Jackson A. Killian, Arpita Biswas, Sanket Shah, Milind Tambe

Multi-action restless multi-armed bandits (RMABs) are a powerful framework for constrained resource allocation in which $N$ independent processes are managed.

Multi-Armed Bandits, Q-Learning

Robust Reinforcement Learning Under Minimax Regret for Green Security

1 code implementation 15 Jun 2021 Lily Xu, Andrew Perrault, Fei Fang, Haipeng Chen, Milind Tambe

We formulate the problem as a game between the defender and nature, which controls the parameter values of the adversarial behavior, and we design an algorithm, MIRROR, to find a robust policy.

Decision Making, reinforcement-learning +3
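
The minimax-regret objective referred to above can be written as follows (the notation is assumed here, not taken from the paper): letting $V(\pi, \theta)$ denote the defender's expected return under policy $\pi$ when nature fixes the uncertain behavior parameters $\theta \in \Theta$, a robust policy solves

$\pi^{\star} \in \arg\min_{\pi}\; \max_{\theta \in \Theta}\; \big[\, \max_{\pi'} V(\pi', \theta) - V(\pi, \theta) \,\big],$

i.e., it minimizes the gap to the best policy that could have been chosen had $\theta$ been known in advance.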

Contingency-Aware Influence Maximization: A Reinforcement Learning Approach

1 code implementation 13 Jun 2021 Haipeng Chen, Wei Qiu, Han-Ching Ou, Bo An, Milind Tambe

Empirical results show that our method achieves influence as high as the state-of-the-art methods for contingency-aware IM, while having negligible runtime at test time.

Combinatorial Optimization, reinforcement-learning +2

AI-driven Prices for Externalities and Sustainability in Production Markets

1 code implementation 10 Jun 2021 Panayiotis Danassis, Aris Filos-Ratsikas, Haipeng Chen, Milind Tambe, Boi Faltings

Traditional competitive markets do not account for negative externalities: indirect costs that some participants impose on others, such as the cost of over-appropriating a common-pool resource (which diminishes future stock, and thus harvest, for everyone).

Deep Reinforcement Learning, Fairness

Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Problems by Reinforcement Learning

no code implementations NeurIPS 2021 Kai Wang, Sanket Shah, Haipeng Chen, Andrew Perrault, Finale Doshi-Velez, Milind Tambe

In the predict-then-optimize framework, the objective is to train a predictive model, mapping from environment features to parameters of an optimization problem, which maximizes decision quality when the optimization is subsequently solved.

Reinforcement Learning (RL)
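
The objective described above can be stated compactly (the notation is ours, for illustration): with features $u$, true problem parameters $c$, a predictive model $\hat{c}_{\theta}$, and a decision-quality function $f$,

$\max_{\theta}\; \mathbb{E}_{(u,\, c)\sim\mathcal{D}}\big[\, f\big(x^{*}(\hat{c}_{\theta}(u)),\, c\big) \,\big], \qquad x^{*}(\hat{c}) \in \arg\max_{x \in \mathcal{X}} f(x, \hat{c}),$

so the model is judged by the quality of the decision $x^{*}$ it induces under the true parameters, rather than by its predictive accuracy.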

Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Making by Reinforcement Learning

no code implementations NeurIPS 2021 Kai Wang, Sanket Shah, Haipeng Chen, Andrew Perrault, Finale Doshi-Velez, Milind Tambe

In the predict-then-optimize framework, the objective is to train a predictive model, mapping from environment features to parameters of an optimization problem, which maximizes decision quality when the optimization is subsequently solved.

Decision Making, Reinforcement Learning (RL) +1

Learn to Intervene: An Adaptive Learning Policy for Restless Bandits in Application to Preventive Healthcare

no code implementations 17 May 2021 Arpita Biswas, Gaurav Aggarwal, Pradeep Varakantham, Milind Tambe

In many public health settings, it is important for patients to adhere to health programs, such as taking medications and periodic health checks.

Q-Learning

Efficient Algorithms for Finite Horizon and Streaming Restless Multi-Armed Bandit Problems

no code implementations 8 Mar 2021 Aditya Mate, Arpita Biswas, Christoph Siebenbrunner, Susobhan Ghosh, Milind Tambe

Our contributions are as follows: (1) We derive conditions under which our problem satisfies indexability, a precondition that guarantees the existence and asymptotic optimality of the Whittle Index solution for RMABs.

Multi-Armed Bandits

Active Screening for Recurrent Diseases: A Reinforcement Learning Approach

no code implementations 7 Jan 2021 Han-Ching Ou, Haipeng Chen, Shahin Jabbari, Milind Tambe

However, given the limited number of health workers, only a small subset of the population can be visited in any given time period.

Combinatorial Optimization, reinforcement-learning +2

Reinforcement Learning for Unified Allocation and Patrolling in Signaling Games with Uncertainty

no code implementations 18 Dec 2020 Aravind Venugopal, Elizabeth Bondi, Harshavardhan Kamarthi, Keval Dholakia, Balaraman Ravindran, Milind Tambe

We therefore first propose a novel GSG model that combines defender allocation, patrolling, real-time drone notification to human patrollers, and drones sending warning signals to attackers.

Decision Making, Multiagent Systems

Collapsing Bandits and Their Application to Public Health Intervention

1 code implementation NeurIPS 2020 Aditya Mate, Jackson Killian, Haifeng Xu, Andrew Perrault, Milind Tambe

Our main contributions are as follows: (i) Building on the Whittle index technique for RMABs, we derive conditions under which the Collapsing Bandits problem is indexable.

Enhancing Poaching Predictions for Under-Resourced Wildlife Conservation Parks Using Remote Sensing Imagery

no code implementations 20 Nov 2020 Rachel Guo, Lily Xu, Drew Cronin, Francis Okeke, Andrew Plumptre, Milind Tambe

To ensure under-resourced parks have access to meaningful poaching predictions, we introduce the use of publicly available remote sensing data to extract features for parks.

Dual-Mandate Patrols: Multi-Armed Bandits for Green Security

2 code implementations 14 Sep 2020 Lily Xu, Elizabeth Bondi, Fei Fang, Andrew Perrault, Kai Wang, Milind Tambe

Conservation efforts in green security domains to protect wildlife and forests are constrained by the limited availability of defenders (i.e., patrollers), who must patrol vast areas to protect from attackers (e.g., poachers or illegal loggers).

Multi-Armed Bandits

Tracking disease outbreaks from sparse data with Bayesian inference

no code implementations 12 Sep 2020 Bryan Wilder, Michael J. Mina, Milind Tambe

For example, case counts may be sparse when only a small fraction of infections are caught by a testing program.

Bayesian Inference, Epidemiology +1

Collapsing Bandits and Their Application to Public Health Interventions

no code implementations 5 Jul 2020 Aditya Mate, Jackson A. Killian, Haifeng Xu, Andrew Perrault, Milind Tambe

(ii) We exploit the optimality of threshold policies to build fast algorithms for computing the Whittle index, including a closed-form.
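
As a point of reference for the snippet above, the generic (slower) way to compute a Whittle index is a binary search on the passive subsidy, re-solving the single-arm problem at each step; the closed forms and threshold-policy shortcuts derived in the paper avoid exactly this loop. Below is a hedged, self-contained sketch with illustrative two-state dynamics (the numbers are ours, not the paper's).

import numpy as np

# Two-state restless arm: reward depends on the state, and transitions depend
# on whether the arm is pulled (active) or left alone (passive).
P_passive = np.array([[0.8, 0.2],
                      [0.6, 0.4]])
P_active  = np.array([[0.3, 0.7],
                      [0.2, 0.8]])
r = np.array([0.0, 1.0])     # reward of being in state 0 / state 1
gamma = 0.95                 # discount factor

def q_values(subsidy, iters=1000):
    # Value iteration on the single-arm MDP where staying passive earns `subsidy`.
    V = np.zeros(2)
    for _ in range(iters):
        q_passive = r + subsidy + gamma * (P_passive @ V)
        q_active = r + gamma * (P_active @ V)
        V = np.maximum(q_passive, q_active)
    return q_passive, q_active

def whittle_index(state, lo=-5.0, hi=5.0, tol=1e-4):
    # Smallest subsidy at which the passive action becomes optimal in `state`.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        qp, qa = q_values(mid)
        lo, hi = (lo, mid) if qp[state] >= qa[state] else (mid, hi)
    return (lo + hi) / 2

print([round(whittle_index(s), 3) for s in (0, 1)])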

Automatically Learning Compact Quality-aware Surrogates for Optimization Problems

2 code implementations NeurIPS 2020 Kai Wang, Bryan Wilder, Andrew Perrault, Milind Tambe

Solving optimization problems with unknown parameters often requires learning a predictive model to predict the values of the unknown parameters and then solving the problem using these values.

Portfolio Optimization

Fair Influence Maximization: A Welfare Optimization Approach

no code implementations 14 Jun 2020 Aida Rahmattalabi, Shahin Jabbari, Himabindu Lakkaraju, Phebe Vayanos, Max Izenberg, Ryan Brown, Eric Rice, Milind Tambe

Under this framework, the trade-off between fairness and efficiency can be controlled by a single inequality aversion design parameter.

Fairness, Management

AI for Social Impact: Learning and Planning in the Data-to-Deployment Pipeline

no code implementations 16 Dec 2019 Andrew Perrault, Fei Fang, Arunesh Sinha, Milind Tambe

With the maturing of AI and multiagent systems research, we have a tremendous opportunity to direct these advances towards addressing complex societal problems.

Solving Online Threat Screening Games using Constrained Action Space Reinforcement Learning

no code implementations 20 Nov 2019 Sanket Shah, Arunesh Sinha, Pradeep Varakantham, Andrew Perrault, Milind Tambe

To solve the online problem with a hard bound on risk, we formulate it as a Reinforcement Learning (RL) problem with constraints on the action space (hard bound on risk).

Deep Reinforcement Learning, reinforcement-learning +1

MIPaaL: Mixed Integer Program as a Layer

no code implementations 12 Jul 2019 Aaron Ferber, Bryan Wilder, Bistra Dilkina, Milind Tambe

It has been successfully applied to several limited combinatorial problem classes, such as those that can be expressed as linear programs (LP), and submodular optimization.

Decision Making

Influence maximization in unknown social networks: Learning Policies for Effective Graph Sampling

1 code implementation 8 Jul 2019 Harshavardhan Kamarthi, Priyesh Vijayan, Bryan Wilder, Balaraman Ravindran, Milind Tambe

A serious challenge when finding influential actors in real-world social networks is the lack of knowledge about the structure of the underlying network.

Graph Sampling, Reinforcement Learning

End to end learning and optimization on graphs

1 code implementation NeurIPS 2019 Bryan Wilder, Eric Ewing, Bistra Dilkina, Milind Tambe

However, graphs or related attributes are often only partially observed, introducing learning problems such as link prediction which must be solved prior to optimization.

Link Prediction

Group-Fairness in Influence Maximization

1 code implementation 3 Mar 2019 Alan Tsang, Bryan Wilder, Eric Rice, Milind Tambe, Yair Zick

Influence maximization is a widely used model for information dissemination in social networks.

Computer Science and Game Theory, Social and Information Networks

End-to-End Game-Focused Learning of Adversary Behavior in Security Games

no code implementations 3 Mar 2019 Andrew Perrault, Bryan Wilder, Eric Ewing, Aditya Mate, Bistra Dilkina, Milind Tambe

Stackelberg security games are a critical tool for maximizing the utility of limited defense resources to protect important targets from an intelligent adversary.

Learning to Prescribe Interventions for Tuberculosis Patients Using Digital Adherence Data

no code implementations 5 Feb 2019 Jackson A. Killian, Bryan Wilder, Amit Sharma, Daksha Shah, Vinod Choudhary, Bistra Dilkina, Milind Tambe

Digital Adherence Technologies (DATs) are an increasingly popular method for verifying patient adherence to many medications.

Melding the Data-Decisions Pipeline: Decision-Focused Learning for Combinatorial Optimization

no code implementations 14 Sep 2018 Bryan Wilder, Bistra Dilkina, Milind Tambe

These components are typically approached separately: a machine learning model is first trained via a measure of predictive accuracy, and then its predictions are used as input into an optimization algorithm which produces a decision.

Combinatorial Optimization
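
A minimal sketch of that two-stage pipeline on illustrative synthetic data (the decision-focused alternative proposed in the paper would instead train the model against the decision quality itself):

import numpy as np

rng = np.random.default_rng(0)

# Stage 1 fits a predictive model with a pure accuracy objective (least squares);
# stage 2 feeds its predictions into a downstream optimizer (pick the k items of
# highest predicted value) and we score the resulting *decision* on true values.
n_items, n_feats, k = 200, 5, 10
X = rng.normal(size=(n_items, n_feats))
w_true = rng.normal(size=n_feats)
y = X @ w_true + 0.5 * rng.normal(size=n_items)      # true item values

train, test = np.arange(0, 150), np.arange(150, 200)

# Stage 1: predictive model trained for accuracy, unaware of the decision task.
w_hat, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)

# Stage 2: the optimizer treats the predictions as if they were the truth.
y_pred = X[test] @ w_hat
chosen = test[np.argsort(-y_pred)[:k]]

decision_value = y[chosen].sum()
best_possible = np.sort(y[test])[-k:].sum()
print(f"decision quality: {decision_value:.2f} vs best possible {best_possible:.2f}")

Here the regressor never sees the selection step it feeds into; decision-focused learning closes that gap by training the predictive model against the downstream decision quality directly.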

Mitigating the Curse of Correlation in Security Games by Entropy Maximization

no code implementations 11 Mar 2017 Haifeng Xu, Milind Tambe, Shaddin Dughmi, Venil Loyd Noronha

To mitigate this issue, we propose to design entropy-maximizing defending strategies for spatio-temporal security games, which frequently suffer from CoC.

Scheduling

Pilot Testing an Artificial Intelligence Algorithm That Selects Homeless Youth Peer Leaders Who Promote HIV Testing

no code implementations 19 Aug 2016 Eric Rice, Robin Petering, Jaih Craddock, Amanda Yoshioka-Maxwell, Amulya Yadav, Milind Tambe

To pilot test an artificial intelligence (AI) algorithm that selects peer change agents (PCA) to disseminate HIV testing messaging in a population of homeless youth.

Using Social Networks to Aid Homeless Shelters: Dynamic Influence Maximization under Uncertainty - An Extended Version

no code implementations 30 Jan 2016 Amulya Yadav, Hau Chan, Albert Jiang, Haifeng Xu, Eric Rice, Milind Tambe

This paper presents HEALER, a software agent that recommends sequential intervention plans for use by homeless shelters, which organize these interventions to raise awareness about HIV among homeless youth.

Learning Adversary Behavior in Security Games: A PAC Model Perspective

no code implementations 30 Oct 2015 Arunesh Sinha, Debarun Kar, Milind Tambe

We provide four main contributions: (1) a PAC model of learning adversary response functions in SSGs; (2) PAC-model analysis of the learning of key, existing bounded rationality models in SSGs; (3) an entirely new approach to adversary modeling based on a non-parametric class of response functions with PAC-model analysis and (4) identification of conditions under which computing the best defender strategy against the learned adversary behavior is indeed the optimal strategy.

Security Games with Information Leakage: Modeling and Computation

no code implementations 23 Apr 2015 Haifeng Xu, Albert X. Jiang, Arunesh Sinha, Zinovi Rabinovich, Shaddin Dughmi, Milind Tambe

Our experiments confirm the necessity of handling information leakage and the advantage of our algorithms.
