no code implementations • 23 Dec 2015 • Paul Reverdy, Vaibhav Srivastava, Naomi Ehrich Leonard
Satisficing is a relaxation of maximizing and allows for less risky decision making in the face of uncertainty.
no code implementations • 5 Jul 2015 • Vaibhav Srivastava, Paul Reverdy, Naomi Ehrich Leonard
We consider the correlated multiarmed bandit (MAB) problem in which the rewards associated with each arm are modeled by a multivariate Gaussian random variable, and we investigate the influence of the assumptions in the Bayesian prior on the performance of the upper credible limit (UCL) algorithm and a new correlated UCL algorithm.
no code implementations • 16 Feb 2015 • Paul Reverdy, Naomi E. Leonard
With an eye towards human-centered automation, we contribute to the development of a systematic means to infer features of human decision-making from behavioral data.
no code implementations • 23 Jul 2013 • Paul Reverdy, Vaibhav Srivastava, Naomi E. Leonard
We develop the upper credible limit (UCL) algorithm for the standard multi-armed bandit problem and show that this deterministic algorithm achieves logarithmic cumulative expected regret, which is optimal performance for uninformative priors.
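As a rough illustration of the kind of rule the last entry describes, the sketch below implements an upper-credible-limit style policy for a Gaussian bandit. It is not taken from the paper. The known noise standard deviation, the independent Gaussian prior on each arm, the credible level 1 - 1/(sqrt(2*pi*e) * t), and all names (`ucl_bandit`, `noise_sd`, `prior_mean`, `prior_sd`) are assumptions made for this example. The arm choice is deterministic given the rewards seen so far; randomness enters only through the simulated rewards.

```python
import math
import random
from statistics import NormalDist


def ucl_bandit(true_means, horizon, noise_sd=1.0,
               prior_mean=0.0, prior_sd=10.0, seed=0):
    """Simulate an upper-credible-limit style policy (illustrative sketch).

    Assumptions (not from the source): rewards are Gaussian with known
    standard deviation `noise_sd`, and each arm's mean has an independent
    Gaussian prior N(prior_mean, prior_sd**2).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    post_mean = [prior_mean] * n_arms          # posterior mean per arm
    post_prec = [1.0 / prior_sd ** 2] * n_arms  # posterior precision per arm
    counts = [0] * n_arms
    obs_prec = 1.0 / noise_sd ** 2
    std_normal = NormalDist()

    for t in range(1, horizon + 1):
        # Assumed schedule: the credible level tightens as 1 - 1/(K * t).
        alpha = 1.0 / (math.sqrt(2.0 * math.pi * math.e) * t)
        z = std_normal.inv_cdf(1.0 - alpha)

        # Upper credible limit = posterior mean + z * posterior std. dev.
        limits = [post_mean[i] + z / math.sqrt(post_prec[i])
                  for i in range(n_arms)]
        arm = max(range(n_arms), key=limits.__getitem__)

        # Simulated reward from the chosen arm.
        reward = rng.gauss(true_means[arm], noise_sd)

        # Conjugate Gaussian update for the chosen arm only.
        new_prec = post_prec[arm] + obs_prec
        post_mean[arm] = (post_prec[arm] * post_mean[arm]
                          + obs_prec * reward) / new_prec
        post_prec[arm] = new_prec
        counts[arm] += 1

    return counts, post_mean


if __name__ == "__main__":
    counts, means = ucl_bandit([0.0, 1.0, 2.0], horizon=500)
    print("pulls per arm:", counts)
    print("posterior means:", [round(m, 2) for m in means])
```

Under these assumptions, a weakly informative prior (large `prior_sd`) makes every arm look promising at first, so each is tried early, and the slowly growing quantile keeps occasional exploration going as `t` increases.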