We present an algorithm for formal verification and parameter synthesis of continuous state-space Markov chains. This class of problems captures the design and analysis of a wide variety of autonomous and cyber-physical systems defined by nonlinear and black-box modules. To solve these problems, one has to maximize certain probabilistic objective functions over all choices of initial states and parameters. In this paper, we identify the assumptions that make it possible to view this problem as a multi-armed bandit problem. Based on this perspective, we propose an algorithm (HOO-MB) that instantiates an existing bandit algorithm, Hierarchical Optimistic Optimization, with appropriate parameters. As a consequence, we obtain theoretical regret bounds on the sample efficiency of our solution that depend on key problem parameters such as smoothness, near-optimality dimension, and batch size. The batch size parameter enables us to strike a balance between the sample efficiency and the memory usage of the algorithm. Our experiments, using the tool HooVer, suggest that the approach scales to realistic-sized problems and is often more sample-efficient than PlasmaLab, a leading tool for verification of stochastic systems. Specifically, HooVer has distinct advantages in analyzing models in which the objective function has sharp slopes. In addition, HooVer shows promising behavior in parameter synthesis for a linear quadratic regulator (LQR) example.
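To make the bandit view concrete, the following is a minimal sketch under stated assumptions, not the HOO-MB algorithm or the HooVer implementation. It assumes a hypothetical one-dimensional Markov chain (the dynamics, the unsafe threshold, and all function names are illustrative). Each candidate initial state is treated as an arm, one simulated trajectory yields a noisy 0/1 reward, and a batch of trajectories gives a Monte Carlo estimate of the probability of reaching the unsafe set. The arm selection shown is naive uniform sampling over a fixed set of arms, which a hierarchical optimistic method would replace with adaptive refinement of the search space.

```python
import random


def simulate_hits_unsafe(x0, rng, steps=20, unsafe=1.0):
    """Simulate one trajectory of a hypothetical 1-D Markov chain.

    Returns 1 if any of the first `steps` states (including x0) lies in the
    assumed unsafe set [unsafe, inf), and 0 otherwise.
    """
    x = x0
    for _ in range(steps):
        if x >= unsafe:
            return 1
        # Assumed dynamics: linear contraction plus Gaussian noise.
        x = 0.9 * x + rng.gauss(0.0, 0.1)
    return 0


def estimate_hit_probability(x0, batch_size, rng):
    """Average a batch of noisy 0/1 rewards for the arm (initial state) x0."""
    hits = sum(simulate_hits_unsafe(x0, rng) for _ in range(batch_size))
    return hits / batch_size


def best_arm_by_uniform_sampling(arms, batch_size, seed=0):
    """Naive baseline: spend the same batch on every arm, return the best."""
    rng = random.Random(seed)
    estimates = {x0: estimate_hit_probability(x0, batch_size, rng) for x0 in arms}
    best = max(estimates, key=estimates.get)
    return best, estimates


if __name__ == "__main__":
    best, estimates = best_arm_by_uniform_sampling([1.5, 0.5, 0.0], batch_size=200)
    print("worst-case initial state:", best)
    print("estimated hit probabilities:", estimates)
```

In this sketch, a larger batch size reduces the variance of each estimate at the cost of more simulations per arm, which loosely mirrors the role of the batch size parameter described above.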