no code implementations • 30 Nov 2016 • Ravi Kumar Kolla, Prashanth L. A., Aditya Gopalan, Krishna Jagannathan, Michael Fu, Steve Marcus
For the $K$-armed bandit setting, we derive an upper bound on the expected regret of our proposed algorithm and prove a matching lower bound, establishing its order-optimality.
no code implementations • 8 Jun 2015 • Prashanth L. A., Cheng Jie, Michael Fu, Steve Marcus, Csaba Szepesvári
Cumulative prospect theory (CPT) is known to model human decisions well, with substantial empirical evidence supporting this claim.
1 code implementation • 19 Feb 2015 • Prashanth L. A., Shalabh Bhatnagar, Michael Fu, Steve Marcus
We prove the unbiasedness of both gradient and Hessian estimates and asymptotic (strong) convergence for both first-order and second-order schemes.