Lotka-Volterra competition mechanism embedded in a decision-making method
Decision making is a fundamental capability of living organisms and has recently been gaining importance in many engineering applications. Here, we consider a simple decision-making principle for identifying an optimal choice in multi-armed bandit (MAB) problems, which are fundamental in the context of reinforcement learning. We demonstrate that the identification mechanism of the method is well described by a competitive ecosystem model, i.e., the competitive Lotka--Volterra (LV) model. Based on the "winner-take-all" mechanism in the competitive LV model, we demonstrate that non-best choices are eliminated and only the best choice survives; the failure of the non-best choices decreases exponentially as the choice trials are repeated. Furthermore, we apply a mean-field approximation to the proposed decision-making method and show that the method has an excellent scalability of $O(\log N)$ with respect to the number of choices $N$. These results offer a new perspective on optimal search capabilities in competitive systems.
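As a rough illustration of the "winner-take-all" behavior mentioned in the abstract, the following minimal sketch integrates a generic competitive LV system, $dx_i/dt = x_i (r_i - \sum_j x_j)$, with forward Euler steps. It is not the decision-making method of the paper: the equal competition coefficients, the function name `simulate_competitive_lv`, the growth rates, and the step size are all assumptions chosen for illustration, with each growth rate $r_i$ loosely standing in for the quality of choice $i$.

```python
def simulate_competitive_lv(growth_rates, x0=0.1, dt=0.01, steps=20000):
    """Forward-Euler integration of dx_i/dt = x_i * (r_i - sum_j x_j).

    All competition coefficients are set to 1 (an illustrative assumption),
    so the ratio x_i / x_j evolves like exp((r_i - r_j) * t) and the
    population with the largest growth rate eventually dominates.
    """
    x = [x0] * len(growth_rates)
    for _ in range(steps):
        total = sum(x)  # shared competition term felt by every population
        x = [xi + dt * xi * (ri - total) for xi, ri in zip(x, growth_rates)]
    return x


# Hypothetical growth rates; index 1 plays the role of the best choice.
rates = [0.9, 1.0, 0.7]
final = simulate_competitive_lv(rates)
winner = max(range(len(rates)), key=lambda i: final[i])
print("final populations:", final)
print("surviving choice:", winner)
```

In this toy setting the populations with smaller growth rates decay exponentially relative to the largest one, which is the qualitative picture behind the elimination of non-best choices.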