We present faster polynomial-time Monte-Carlo algorithms for finding the fixation probability on undirected graphs.
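For orientation, the quantity being estimated can be illustrated with a naive simulation. The sketch below is not the faster algorithm announced above. It assumes the standard birth-death Moran process, which the abstract does not name: a single mutant of relative fitness `r` is placed uniformly at random, and at each step a vertex is chosen to reproduce with probability proportional to its fitness and its offspring replaces a uniformly random neighbor. The function name, the adjacency-list representation, and all parameters are illustrative choices, and the graph is assumed to be connected.

```python
import random


def estimate_fixation_probability(adj, r, trials, seed=0):
    """Naive Monte Carlo estimate of the fixation probability.

    Assumptions (not taken from the source): birth-death Moran process,
    one initial mutant placed uniformly at random, residents with fitness 1
    and mutants with fitness r, connected undirected graph given as an
    adjacency list `adj` (adj[v] is the list of neighbors of vertex v).
    """
    rng = random.Random(seed)
    n = len(adj)
    vertices = range(n)
    fixed = 0
    for _ in range(trials):
        mutant = [False] * n
        mutant[rng.randrange(n)] = True
        count = 1  # current number of mutants
        # Run until the mutants either die out or take over the graph.
        while 0 < count < n:
            # Pick the reproducing vertex proportionally to fitness.
            weights = [r if is_mutant else 1.0 for is_mutant in mutant]
            parent = rng.choices(vertices, weights=weights)[0]
            # The offspring replaces a uniformly random neighbor.
            child = rng.choice(adj[parent])
            if mutant[child] != mutant[parent]:
                count += 1 if mutant[parent] else -1
                mutant[child] = mutant[parent]
        if count == n:
            fixed += 1
    return fixed / trials


if __name__ == "__main__":
    # Complete graph on 4 vertices as a small test case.
    k4 = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]
    print(estimate_fixation_probability(k4, r=2.0, trials=4000, seed=1))
```

On a complete graph with n vertices this process has the closed-form fixation probability (1 - 1/r) / (1 - 1/r^n) for r ≠ 1, and 1/n for r = 1, which gives a convenient sanity check for the estimate. Each simulated step here costs time linear in n, and the number of steps per trial can be large, which is the kind of cost that faster algorithms aim to reduce.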
For example, in a concurrent system of two components, the traditional approach requires hexic time in the worst case for answering one query as well as computing the transitive closure, whereas we show that with one-time preprocessing in almost cubic time, each subsequent query can be answered in at most linear time, and even the transitive closure can be computed in almost quartic time.
Programming Languages; Data Structures and Algorithms; F.3.2
We consider the problem of minimizing regret in the stochastic multi-armed bandit setting when the measure of goodness of an arm is not the mean return but some general function of the mean and the variance. We characterize the conditions under which learning is possible and present examples for which no natural algorithm can achieve sublinear regret.