Search Results for author: Steve Marcus

Found 3 papers, 1 paper with code

Bandit algorithms to emulate human decision making using probabilistic distortions

no code implementations • 30 Nov 2016 • Ravi Kumar Kolla, Prashanth L. A., Aditya Gopalan, Krishna Jagannathan, Michael Fu, Steve Marcus

For the $K$-armed bandit setting, we derive an upper bound on the expected regret for our proposed algorithm, and then we prove a matching lower bound to establish the order-optimality of our algorithm.

Decision Making, Multi-Armed Bandits

Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control

no code implementations • 8 Jun 2015 • Prashanth L. A., Cheng Jie, Michael Fu, Steve Marcus, Csaba Szepesvári

Cumulative prospect theory (CPT) is known to model human decisions well, with substantial empirical evidence supporting this claim.

Reinforcement Learning (RL)

Adaptive system optimization using random directions stochastic approximation

1 code implementation • 19 Feb 2015 • Prashanth L. A., Shalabh Bhatnagar, Michael Fu, Steve Marcus

We prove the unbiasedness of both gradient and Hessian estimates and asymptotic (strong) convergence for both first-order and second-order schemes.
