no code implementations • 23 Jul 2013 • Paul Reverdy, Vaibhav Srivastava, Naomi E. Leonard
We develop the upper credible limit (UCL) algorithm for the standard multi-armed bandit problem and show that this deterministic algorithm achieves logarithmic cumulative expected regret, which is optimal performance for uninformative priors.
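For a Gaussian bandit with known reward variance and an uninformative prior, the deterministic UCL decision rule can be sketched as below. The (1 - 1/(K·t)) credibility schedule and the known-variance simplification are illustrative assumptions, not the paper's exact tuning:

```python
import random
from statistics import NormalDist

def ucl_bandit(means, sigma=1.0, horizon=500, seed=0):
    """Deterministic UCL sketch: play the arm with the largest upper credible limit.

    Assumes Gaussian rewards with known variance sigma^2 and an
    uninformative prior, so the posterior mean is the sample mean and the
    posterior std shrinks as sigma / sqrt(n). The credibility level
    1 - 1/(K*t) grows with time, so exploration decays deterministically.
    """
    rng = random.Random(seed)
    n_arms = len(means)
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= n_arms:                       # initialization: pull each arm once
            arm = t - 1
        else:
            alpha = 1.0 - 1.0 / (n_arms * t)  # credibility level for round t
            z = NormalDist().inv_cdf(alpha)   # Gaussian quantile at that level
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i] + z * sigma / counts[i] ** 0.5,
            )
        reward = rng.gauss(means[arm], sigma)
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return counts, total
```

With a reward gap of one standard deviation, the rule concentrates almost all pulls on the better arm while the suboptimal arm is sampled only logarithmically often.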
no code implementations • 5 Jul 2015 • Vaibhav Srivastava, Paul Reverdy, Naomi Ehrich Leonard
We consider the correlated multiarmed bandit (MAB) problem in which the rewards associated with each arm are modeled by a multivariate Gaussian random variable, and we investigate the influence of the assumptions in the Bayesian prior on the performance of the upper credible limit (UCL) algorithm and a new correlated UCL algorithm.
1 code implementation • NeurIPS 2015 • Michael Shvartsman, Vaibhav Srivastava, Jonathan D. Cohen
We also show how the model generalizes recent work on the control of attention in the Flanker task (Yu et al., 2009).

no code implementations • 21 Dec 2015 • Peter Landgren, Vaibhav Srivastava, Naomi Ehrich Leonard
We study the explore-exploit tradeoff in distributed cooperative decision-making using the context of the multiarmed bandit (MAB) problem.
no code implementations • 23 Dec 2015 • Paul Reverdy, Vaibhav Srivastava, Naomi Ehrich Leonard
Satisficing is a relaxation of maximizing and allows for less risky decision making in the face of uncertainty.
no code implementations • 2 Jun 2016 • Peter Landgren, Vaibhav Srivastava, Naomi Ehrich Leonard
We study distributed cooperative decision-making under the explore-exploit tradeoff in the multiarmed bandit (MAB) problem.
no code implementations • 23 Feb 2018 • Lai Wei, Vaibhav Srivastava
We study the non-stationary stochastic multiarmed bandit (MAB) problem and propose two generic algorithms, namely, the limited memory deterministic sequencing of exploration and exploitation (LM-DSEE) and the Sliding-Window Upper Confidence Bound# (SW-UCB#).
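A sliding-window UCB of the kind named above can be sketched as follows: only the last `window` plays inform each arm's estimate, so stale observations age out after an abrupt change. The `sqrt(2 log t / n)` confidence width and the fixed window are generic choices for illustration; the exact window/exponent schedule of SW-UCB# is not reproduced here:

```python
import math
import random
from collections import deque

def sw_ucb(reward_fn, n_arms, horizon, window=80, seed=0):
    """Sliding-window UCB sketch for abruptly changing reward distributions.

    Estimates use only the last `window` (arm, reward) pairs. An arm that
    has dropped out of the window entirely is replayed (forced
    exploration); otherwise the arm with the largest windowed UCB index
    is played.
    """
    rng = random.Random(seed)
    history = deque()                     # (arm, reward) pairs in the window
    choices = []
    for t in range(1, horizon + 1):
        counts = [0] * n_arms
        sums = [0.0] * n_arms
        for arm, r in history:
            counts[arm] += 1
            sums[arm] += r
        if min(counts) == 0:
            arm = counts.index(0)         # forced play of an aged-out arm
        else:
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2 * math.log(min(t, window)) / counts[i]),
            )
        r = reward_fn(t, arm, rng)
        history.append((arm, r))
        if len(history) > window:
            history.popleft()
        choices.append(arm)
    return choices
```

On a problem whose best arm switches halfway through the horizon, the window lets the index track the change within roughly one window length.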
no code implementations • 12 Dec 2018 • Lai Wei, Vaibhav Srivastava
We study the multi-player stochastic multiarmed bandit (MAB) problem in an abruptly changing environment.
no code implementations • 3 Mar 2020 • Peter Landgren, Vaibhav Srivastava, Naomi Ehrich Leonard
We consider a constrained reward model in which agents that choose the same arm at the same time receive no reward.
no code implementations • 18 May 2020 • Lai Wei, Xiaobo Tan, Vaibhav Srivastava
Based on the sensing model, we design a novel algorithm called Expedited Multi-Target Search (EMTS) that (i) addresses the coverage-accuracy trade-off: sampling at locations farther from the floor provides a wider field of view but less accurate measurements, (ii) computes an occupancy map of the floor within a prescribed accuracy and quickly eliminates unoccupied regions from the search space, and (iii) travels efficiently to collect the required samples for target detection.
no code implementations • 20 Jul 2020 • Lai Wei, Vaibhav Srivastava
We study the stochastic Multi-Armed Bandit (MAB) problem under worst-case regret and heavy-tailed reward distributions.
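When rewards are heavy-tailed, the empirical mean concentrates poorly, and robust UCB-style indices typically substitute a robust mean estimator. One standard such estimator is median-of-means, sketched below; the block count `k` is an illustrative choice, and this is a generic robust estimator rather than the specific statistic used in the paper:

```python
from statistics import mean, median

def median_of_means(samples, k=5):
    """Median-of-means: a robust mean estimator for heavy-tailed samples.

    Split the samples into k blocks, average each block, and return the
    median of the block means. A few extreme outliers can corrupt at most
    a few blocks, so the median of the block means stays near the true mean.
    """
    n = len(samples)
    size = max(1, n // k)
    blocks = [samples[i:i + size] for i in range(0, n, size)][:k]
    return median(mean(b) for b in blocks)
```

Compared with the empirical mean, a handful of extreme observations shifts the median-of-means estimate only slightly.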
no code implementations • 12 Jan 2021 • Lai Wei, Andrew McDonald, Vaibhav Srivastava
Modeling the sensory field as a realization of a Gaussian Process and using Bayesian techniques, we devise a policy which aims to balance the tradeoff between learning the sensory function and covering the environment.
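One simple way to trade off learning the sensory function against covering the environment is to sample next wherever the Gaussian Process posterior variance is largest. The sketch below uses an RBF kernel and a variance-maximizing acquisition; both are illustrative assumptions, not the policy devised in the paper. (GP posterior variance depends only on sample locations, not observed values, so no measurements are needed here.)

```python
import math

def rbf(a, b, ell=0.5):
    """Squared-exponential kernel on scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2 * ell ** 2))

def solve(A, b):
    """Naive Gauss-Jordan elimination with partial pivoting (small systems only)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[c][c]:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def gp_next_sample(xs, candidates, noise=1e-3):
    """Pick the candidate location with the largest GP posterior variance.

    xs are previously sampled locations; the posterior variance at c is
    k(c, c) - k_c^T K^{-1} k_c, which is largest far from existing samples.
    """
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    best, best_var = None, -1.0
    for c in candidates:
        k = [rbf(c, x) for x in xs]
        alpha = solve(K, k)                       # K^{-1} k_c
        var = rbf(c, c) - sum(a * b for a, b in zip(alpha, k))
        if var > best_var:
            best, best_var = c, var
    return best
```

With samples at 0 and 1, the acquisition prefers a far-away candidate over ones near existing data, which is the coverage-driving behavior.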
no code implementations • 22 Jan 2021 • Lai Wei, Vaibhav Srivastava
We study the nonstationary stochastic Multi-Armed Bandit (MAB) problem in which the distributions of rewards associated with the arms are assumed to be time-varying and the total variation in the expected rewards is subject to a variation budget.
no code implementations • 18 Mar 2021 • Connor J. Boss, Vaibhav Srivastava
We study an in-flight actuator failure recovery problem for a hexrotor UAV.
no code implementations • 24 Mar 2021 • Connor J. Boss, Vaibhav Srivastava
We study the problem of estimating and tracking an unknown trajectory with a multi-rotor UAV in the presence of modeling error and external disturbances.
no code implementations • 20 Jun 2021 • Nan Li, Kaixiang Zhang, Zhaojian Li, Vaibhav Srivastava, Xiang Yin
In this paper, we propose a novel cloud-assisted model predictive control (MPC) framework in which we systematically fuse a cloud MPC that uses a high-fidelity nonlinear model but is subject to communication delays with a local MPC that exploits simplified dynamics (due to limited computation) but has timely feedback.
1 code implementation • 28 Jun 2021 • Andrew McDonald, Lai Wei, Vaibhav Srivastava
In this paper, we address the problem of multi-robot online estimation and coverage control by combining low- and high-fidelity data to learn and cover a sensory function of interest.
no code implementations • 6 Feb 2022 • Ankur Kamboj, Rajiv Ranganathan, Xiaobo Tan, Vaibhav Srivastava
Designing effective rehabilitation strategies for upper extremities, particularly hands and fingers, warrants the need for a computational model of human motor learning.
no code implementations • 19 Mar 2022 • Abhisek Satapathi, Narendra Kumar Dhar, Ashish R. Hota, Vaibhav Srivastava
We consider the class of SIS epidemic models in which a large population of individuals chooses whether to adopt protection or to remain unprotected as the epidemic evolves.
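The coupling between SIS dynamics and protection adoption can be illustrated with a forward-Euler sketch: the infected fraction follows SIS dynamics with a contact rate scaled down by the protected fraction, while adoption follows a simple imitation (replicator-like) rule comparing infection risk against a fixed protection cost. The payoff structure and adoption rule here are illustrative assumptions, not the game analyzed in the paper:

```python
def sis_with_protection(beta=0.4, gamma=0.2, cost=0.1, steps=4000, dt=0.05):
    """Euler sketch of SIS dynamics coupled with protection adoption.

    y: infected fraction; x: protected fraction. Protection scales the
    effective transmission rate by (1 - x). Adoption grows when the
    perceived infection risk beta*y exceeds the protection cost; a tiny
    mutation term keeps the adoption dynamics off the boundary.
    """
    y, x = 0.01, 0.0
    for _ in range(steps):
        eff_beta = beta * (1.0 - x)                      # reduced transmission
        dy = eff_beta * y * (1.0 - y) - gamma * y        # SIS dynamics
        payoff = beta * y - cost                         # risk minus cost
        dx = x * (1.0 - x) * payoff + 1e-4 * (0.5 - x)   # imitation + mutation
        y = min(max(y + dt * dy, 0.0), 1.0)
        x = min(max(x + dt * dx, 0.0), 1.0)
    return y, x
```

In this sketch the population settles where the incentive to protect vanishes (beta·y ≈ cost), so the endemic level ends up below the no-protection equilibrium 1 - gamma/beta.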
no code implementations • 12 Sep 2022 • Piyush Gupta, Vaibhav Srivastava
During exploration, DSEE explores the environment and updates the estimates for expected reward and transition probabilities.
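The explore/exploit sequencing described above can be sketched for a small MDP: exploration epochs choose actions uniformly at random to refine empirical estimates of rewards and transition probabilities, and exploitation epochs act greedily against a policy computed by value iteration on the estimated model. The geometric epoch-length schedule below is an assumption; the paper's exact sequencing is not reproduced:

```python
import random

def dsee_mdp(P, R, horizon=4000, gamma=0.9, seed=0):
    """DSEE-style schedule sketch: deterministic explore/exploit epochs on an MDP.

    P[s][a] is the transition distribution over next states; R[s][a] is the
    (deterministic, for simplicity) reward. Returns average per-step reward.
    """
    rng = random.Random(seed)
    nS, nA = len(P), len(P[0])
    counts = [[[0] * nS for _ in range(nA)] for _ in range(nS)]
    r_sum = [[0.0] * nA for _ in range(nS)]
    n_sa = [[0] * nA for _ in range(nS)]

    def q(s, a, V):
        # Q-value under the empirical model; unseen pairs default to 0.
        if n_sa[s][a] == 0:
            return 0.0
        r_hat = r_sum[s][a] / n_sa[s][a]
        p_hat = [counts[s][a][s2] / n_sa[s][a] for s2 in range(nS)]
        return r_hat + gamma * sum(p * v for p, v in zip(p_hat, V))

    def greedy_policy():
        # Value iteration on the empirical model.
        V = [0.0] * nS
        for _ in range(100):
            V = [max(q(s, a, V) for a in range(nA)) for s in range(nS)]
        return [max(range(nA), key=lambda a: q(s, a, V)) for s in range(nS)]

    s, t, epoch, total = 0, 0, 0, 0.0
    while t < horizon:
        explore_len, exploit_len = 10, 10 * 2 ** epoch   # assumed schedule
        pi = None
        for phase, length in (("explore", explore_len), ("exploit", exploit_len)):
            for _ in range(length):
                if t >= horizon:
                    break
                if phase == "explore":
                    a = rng.randrange(nA)                # uniform exploration
                else:
                    pi = pi or greedy_policy()           # plan once per epoch
                    a = pi[s]
                s2 = rng.choices(range(nS), weights=P[s][a])[0]
                counts[s][a][s2] += 1
                r_sum[s][a] += R[s][a]
                n_sa[s][a] += 1
                total += R[s][a]
                s, t = s2, t + 1
        epoch += 1
    return total / horizon
```

Because exploitation epochs grow geometrically while exploration epochs stay fixed, the fraction of time spent exploring vanishes and the average reward approaches the optimal policy's.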
no code implementations • 15 Sep 2022 • Abhisek Satapathi, Narendra Kumar Dhar, Ashish R. Hota, Vaibhav Srivastava
We study the interplay between epidemic dynamics and human decision making for epidemics that involve reinfection risk; in particular, the susceptible-infected-susceptible (SIS) and the susceptible-infected-recovered-infected (SIRI) epidemic models.
no code implementations • 21 Apr 2023 • Ethan Lau, Vaibhav Srivastava, Shaunak D. Bopardikar
Safely controlling unknown dynamical systems is a fundamental challenge in control.
no code implementations • 20 Apr 2024 • Ankur Kamboj, Rajiv Ranganathan, Xiaobo Tan, Vaibhav Srivastava
Conventional approaches to enhancing movement coordination, such as providing instructions and visual feedback, are often inadequate in complex motor tasks with multiple degrees of freedom (DoFs).