Search Results for author: Naomi Ehrich Leonard

Found 13 papers, 1 paper with code

Distributed Bandits: Probabilistic Communication on $d$-regular Graphs

no code implementations 16 Nov 2020 Udari Madhushani, Naomi Ehrich Leonard

Every edge in the graph has probabilistic weight $p$ to account for the probability $1-p$ of a communication link failure.

Hamiltonian Q-Learning: Leveraging Importance-sampling for Data Efficient RL

no code implementations 11 Nov 2020 Udari Madhushani, Biswadip Dey, Naomi Ehrich Leonard, Amit Chakraborty

By providing an efficient way to apply Q-learning in stochastic, high-dimensional problems, the proposed approach broadens the scope of RL algorithms for real-world applications, including classical control tasks and environmental monitoring.

Matrix Completion, Q-Learning

LagNetViP: A Lagrangian Neural Network for Video Prediction

no code implementations 24 Oct 2020 Christine Allen-Blanchette, Sushant Veer, Anirudha Majumdar, Naomi Ehrich Leonard

In this paper, we introduce a video prediction model where the equations of motion are explicitly constructed from learned representations of the underlying physical quantities.

Acrobot, Video Prediction

Unsupervised Learning of Lagrangian Dynamics from Images for Prediction and Control

1 code implementation NeurIPS 2020 Yaofeng Desmond Zhong, Naomi Ehrich Leonard

The VAE is designed to account for the geometry of physical systems composed of multiple rigid bodies in the plane.

Distributed Learning: Sequential Decision Making in Resource-Constrained Environments

no code implementations 13 Apr 2020 Udari Madhushani, Naomi Ehrich Leonard

We study cost-effective communication strategies that can be used to improve the performance of distributed learning systems in resource-constrained environments.

Decision Making

A Dynamic Observation Strategy for Multi-agent Multi-armed Bandit Problem

no code implementations 8 Apr 2020 Udari Madhushani, Naomi Ehrich Leonard

We define and analyze a multi-agent multi-armed bandit problem in which decision-making agents can observe the choices and rewards of their neighbors under a linear observation cost.

Decision Making

Distributed Cooperative Decision Making in Multi-agent Multi-armed Bandits

no code implementations 3 Mar 2020 Peter Landgren, Vaibhav Srivastava, Naomi Ehrich Leonard

We also consider a constrained reward model in which agents that choose the same arm at the same time receive no reward.

Decision Making, Multi-Armed Bandits

Heterogeneous Stochastic Interactions for Multiple Agents in a Multi-armed Bandit Problem

no code implementations 21 May 2019 Udari Madhushani, Naomi Ehrich Leonard

We define and analyze a multi-agent multi-armed bandit problem in which decision-making agents can observe the choices and rewards of their neighbors.

Decision Making

Distributed Cooperative Decision-Making in Multiarmed Bandits: Frequentist and Bayesian Algorithms

no code implementations 2 Jun 2016 Peter Landgren, Vaibhav Srivastava, Naomi Ehrich Leonard

We study distributed cooperative decision-making under the explore-exploit tradeoff in the multiarmed bandit (MAB) problem.

Decision Making

Satisficing in multi-armed bandit problems

no code implementations 23 Dec 2015 Paul Reverdy, Vaibhav Srivastava, Naomi Ehrich Leonard

Satisficing is a relaxation of maximizing and allows for less risky decision making in the face of uncertainty.

Decision Making

On Distributed Cooperative Decision-Making in Multiarmed Bandits

no code implementations 21 Dec 2015 Peter Landgren, Vaibhav Srivastava, Naomi Ehrich Leonard

We study the explore-exploit tradeoff in distributed cooperative decision-making using the context of the multiarmed bandit (MAB) problem.

Decision Making

Correlated Multiarmed Bandit Problem: Bayesian Algorithms and Regret Analysis

no code implementations 5 Jul 2015 Vaibhav Srivastava, Paul Reverdy, Naomi Ehrich Leonard

We consider the correlated multiarmed bandit (MAB) problem in which the rewards associated with each arm are modeled by a multivariate Gaussian random variable, and we investigate the influence of the assumptions in the Bayesian prior on the performance of the upper credible limit (UCL) algorithm and a new correlated UCL algorithm.

Decision Making

Cooperative learning in multi-agent systems from intermittent measurements

no code implementations 11 Sep 2012 Naomi Ehrich Leonard, Alex Olshevsky

Motivated by the problem of tracking a direction in a decentralized way, we consider the general problem of cooperative learning in multi-agent systems with time-varying connectivity and intermittent measurements.
