Automated Discovery of Local Rules for Desired Collective-Level Behavior Through Reinforcement Learning

25 Jul 2020 · Tiago Costa, Andres Laan, Francisco J. H. Heras, Gonzalo G. de Polavieja

Complex global behavior patterns can emerge from very simple local interactions between many agents. However, no local interaction rules have been identified that generate some patterns observed in nature, for example the rotating balls, rotating tornadoes and the full-core rotating mills observed in fish collectives. Here we show that locally interacting agents modeled with a minimal cognitive system can produce these collective patterns. We obtained this result by using recent advances in reinforcement learning to systematically solve the inverse modeling problem: given an observed collective behavior, we automatically find a policy generating it. Our agents are modeled as processing the information from neighbor agents to choose actions with a neural network and move in an environment of simulated physics. Even though every agent is equipped with its own neural network, all agents have the same network architecture and parameter values, ensuring in this way that a single policy is responsible for the emergence of a given pattern. We find the final policies by tuning the neural network weights until the produced collective behavior approaches the desired one. By using modular neural networks with modules using a small number of inputs and outputs, we built an interpretable model of collective motion. This enabled us to analyse the policies obtained. We found a similar general structure for the four different collective patterns, not dissimilar to the one we have previously inferred from experimental zebrafish trajectories; but we also found consistent differences between policies generating the different collective patterns, for example repulsion in the vertical direction for the more three-dimensional structures of the sphere and tornado. Our results illustrate how new advances in artificial intelligence, and specifically in reinforcement learning, allow new approaches to analysis and modeling of collective behavior.
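
The setup described in the abstract (one set of network parameters shared by all agents, tuned until a collective-level measure approaches a target) might be sketched roughly as follows. This is only an illustrative toy under several assumptions that do not come from the paper: the world is 2D rather than 3D, each agent sees its k nearest neighbors, neighbor contributions are simply averaged, the collective-level score is a hypothetical rotation order parameter, and a plain random-perturbation hill climb stands in for whatever optimizer the authors actually used. All function names (`init_params`, `shared_policy`, `rollout_score`, `tune`) are made up for this sketch.

```python
import numpy as np

def init_params(rng, n_in=4, n_hidden=8, n_out=2):
    # Hypothetical small pairwise module: (relative position, relative velocity) -> 2D contribution.
    return {
        "W1": rng.normal(0.0, 0.5, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def shared_policy(params, pos, vel, k=3):
    # Every agent is evaluated with the SAME parameters (a single policy).
    n = pos.shape[0]
    actions = np.zeros((n, 2))
    for i in range(n):
        rel_pos = pos - pos[i]
        rel_vel = vel - vel[i]
        dist = np.linalg.norm(rel_pos, axis=1)
        dist[i] = np.inf                      # an agent is not its own neighbor
        nbrs = np.argsort(dist)[:k]           # k nearest neighbors (assumed local rule)
        x = np.concatenate([rel_pos[nbrs], rel_vel[nbrs]], axis=1)  # shape (k, 4)
        h = np.tanh(x @ params["W1"] + params["b1"])
        contrib = h @ params["W2"] + params["b2"]                   # shape (k, 2)
        actions[i] = contrib.mean(axis=0)     # assumed aggregation: plain average
    return actions

def rollout_score(params, seed=0, n_agents=10, steps=50, dt=0.1):
    # Toy simulated physics: the action is an acceleration, speed is renormalized to 1.
    rng = np.random.default_rng(seed)
    pos = rng.normal(size=(n_agents, 2))
    vel = rng.normal(size=(n_agents, 2))
    for _ in range(steps):
        vel = vel + dt * shared_policy(params, pos, vel)
        vel = vel / (np.linalg.norm(vel, axis=1, keepdims=True) + 1e-9)
        pos = pos + dt * vel
    # Hypothetical collective-level score: how coherently the group rotates around its centroid.
    r = pos - pos.mean(axis=0)
    r = r / (np.linalg.norm(r, axis=1, keepdims=True) + 1e-9)
    cross = r[:, 0] * vel[:, 1] - r[:, 1] * vel[:, 0]
    return abs(cross.mean())

def tune(params, iters=20, sigma=0.1, seed=1):
    # Stand-in optimizer: keep a random perturbation only if the collective score improves.
    rng = np.random.default_rng(seed)
    best, best_score = params, rollout_score(params)
    for _ in range(iters):
        cand = {name: w + sigma * rng.normal(size=w.shape) for name, w in best.items()}
        score = rollout_score(cand)
        if score > best_score:
            best, best_score = cand, score
    return best, best_score

if __name__ == "__main__":
    p0 = init_params(np.random.default_rng(0))
    _, tuned_score = tune(p0)
    print("initial score:", rollout_score(p0), "tuned score:", tuned_score)
```

Because the score is computed only from the group's final state, the loop never tells any individual agent what to do; the local rule is whatever the shared parameters end up encoding, which is the sense in which the problem is an inverse one.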
