Search Results for author: Steve Marcus

Found 3 papers, 1 paper with code

Bandit algorithms to emulate human decision making using probabilistic distortions

no code implementations • 30 Nov 2016 • Ravi Kumar Kolla, Prashanth L. A., Aditya Gopalan, Krishna Jagannathan, Michael Fu, Steve Marcus

For the $K$-armed bandit setting, we derive an upper bound on the expected regret for our proposed algorithm, and then we prove a matching lower bound to establish the order-optimality of our algorithm.

Decision Making, Multi-Armed Bandits

Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control

no code implementations • 8 Jun 2015 • Prashanth L. A., Cheng Jie, Michael Fu, Steve Marcus, Csaba Szepesvári

Cumulative prospect theory (CPT) is known to model human decisions well, with substantial empirical evidence supporting this claim.

reinforcement-learning, Reinforcement Learning (RL)

Adaptive system optimization using random directions stochastic approximation

1 code implementation • 19 Feb 2015 • Prashanth L. A., Shalabh Bhatnagar, Michael Fu, Steve Marcus

We prove the unbiasedness of both gradient and Hessian estimates and asymptotic (strong) convergence for both first-order and second-order schemes.
