Search Results for author: Udari Madhushani

Found 14 papers, 1 paper with code

Melting Pot 2.0

2 code implementations • 24 Nov 2022 • John P. Agapiou, Alexander Sasha Vezhnevets, Edgar A. Duéñez-Guzmán, Jayd Matyas, Yiran Mao, Peter Sunehag, Raphael Köster, Udari Madhushani, Kavya Kopparapu, Ramona Comanescu, DJ Strouse, Michael B. Johanson, Sukhdeep Singh, Julia Haas, Igor Mordatch, Dean Mobbs, Joel Z. Leibo

Melting Pot is a research tool developed to facilitate work on multi-agent artificial intelligence, and provides an evaluation protocol that measures generalization to novel social partners in a set of canonical test scenarios.

Artificial Life, Navigate

A Regret Minimization Approach to Multi-Agent Control

no code implementations • 28 Jan 2022 • Udaya Ghai, Udari Madhushani, Naomi Leonard, Elad Hazan

We study the problem of multi-agent control of a dynamical system with known dynamics and adversarial disturbances.

One More Step Towards Reality: Cooperative Bandits with Imperfect Communication

no code implementations • NeurIPS 2021 • Udari Madhushani, Abhimanyu Dubey, Naomi Ehrich Leonard, Alex Pentland

However, most research for this problem focuses exclusively on the setting with perfect communication, whereas in most real-world distributed settings, communication is often over stochastic networks, with arbitrary corruptions and delays.

Decision Making

When to Call Your Neighbor? Strategic Communication in Cooperative Stochastic Bandits

no code implementations • 8 Oct 2021 • Udari Madhushani, Naomi Leonard

We propose ComEx, a novel cost-effective communication protocol in which the group achieves the same order of performance as full communication while communicating only $O(\log T)$ messages.

Decision Making

Distributed Bandits: Probabilistic Communication on $d$-regular Graphs

no code implementations • 16 Nov 2020 • Udari Madhushani, Naomi Ehrich Leonard

Every edge in the graph has probabilistic weight $p$ to account for the $(1-p)$ probability of a communication link failure.

On Using Hamiltonian Monte Carlo Sampling for Reinforcement Learning Problems in High-dimension

no code implementations • 11 Nov 2020 • Udari Madhushani, Biswadip Dey, Naomi Ehrich Leonard, Amit Chakraborty

Value function based reinforcement learning (RL) algorithms, for example, $Q$-learning, learn optimal policies from datasets of actions, rewards, and state transitions.

Matrix Completion, Q-Learning, +2

It Doesn’t Get Better and Here’s Why: A Fundamental Drawback in Natural Extensions of UCB to Multi-agent Bandits

no code implementations • NeurIPS Workshop ICBINB 2020 • Udari Madhushani, Naomi Leonard

We identify a fundamental drawback of natural extensions of Upper Confidence Bound (UCB) algorithms to the multi-agent bandit problem in which multiple agents facing the same explore-exploit problem can share information.

Heterogeneous Explore-Exploit Strategies on Multi-Star Networks

no code implementations • 2 Sep 2020 • Udari Madhushani, Naomi Leonard

To do so we study a class of distributed stochastic bandit problems in which agents communicate over a multi-star network and make sequential choices among options in the same uncertain environment.

Decision Making

Distributed Learning: Sequential Decision Making in Resource-Constrained Environments

no code implementations • 13 Apr 2020 • Udari Madhushani, Naomi Ehrich Leonard

We study cost-effective communication strategies that can be used to improve the performance of distributed learning systems in resource-constrained environments.

Decision Making

A Dynamic Observation Strategy for Multi-agent Multi-armed Bandit Problem

no code implementations • 8 Apr 2020 • Udari Madhushani, Naomi Ehrich Leonard

We define and analyze a multi-agent multi-armed bandit problem in which decision-making agents can observe the choices and rewards of their neighbors under a linear observation cost.

Decision Making

Heterogeneous Stochastic Interactions for Multiple Agents in a Multi-armed Bandit Problem

no code implementations • 21 May 2019 • Udari Madhushani, Naomi Ehrich Leonard

We define and analyze a multi-agent multi-armed bandit problem in which decision-making agents can observe the choices and rewards of their neighbors.

Decision Making
