Conversely, in the absence of ambiguity and relative risk, active inference reduces to Bayesian decision theory, i.e., expected utility maximization.
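To sketch why, consider one standard decomposition of the expected free energy $G(\pi)$ of a policy $\pi$ into a risk term and an ambiguity term (the symbols $q$, $p$, and the preference distribution $p(o \mid C)$ are used here for illustration):

\[
G(\pi) \;=\; \underbrace{D_{\mathrm{KL}}\big[\, q(o \mid \pi) \,\big\|\, p(o \mid C) \,\big]}_{\text{risk}} \;+\; \underbrace{\mathbb{E}_{q(s \mid \pi)}\big[ \mathcal{H}[\, p(o \mid s) \,] \big]}_{\text{ambiguity}} .
\]

Writing the risk term as $-\mathcal{H}[q(o \mid \pi)] - \mathbb{E}_{q(o \mid \pi)}[\ln p(o \mid C)]$, one sees that when the ambiguity and the relative (entropy) contributions vanish, minimizing $G(\pi)$ amounts to maximizing the expected utility $\mathbb{E}_{q(o \mid \pi)}[\ln p(o \mid C)]$.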
Under the Bayesian brain hypothesis, behavioural variations can be attributed to different priors over generative model parameters.
In this paper, we pursue the notion that such learnt behaviour can arise from reward-free preference learning that ensures an appropriate trade-off between exploration and preference satisfaction.
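One minimal way to instantiate this idea, assuming Dirichlet-count preference learning and an entropy-based exploration term (both illustrative assumptions, not the paper's exact scheme), is sketched below:

```python
# A minimal sketch of reward-free preference learning: preferences over
# outcomes are learned from experience rather than fixed by a reward function.
# The Dirichlet update and entropy bonus are illustrative assumptions.
import numpy as np

N_OUTCOMES = 4
alpha = np.ones(N_OUTCOMES)  # Dirichlet counts over outcomes (learned preferences)

def log_preferences(alpha):
    """Learned prior preference ln C(o): log expected outcome probability."""
    return np.log(alpha / alpha.sum())

def policy_score(q_outcomes, alpha):
    """Trade off preference satisfaction (pragmatic value) against exploration."""
    pragmatic = q_outcomes @ log_preferences(alpha)               # preference satisfaction
    epistemic = -(q_outcomes * np.log(q_outcomes + 1e-12)).sum()  # exploration bonus
    return pragmatic + epistemic

# Reward-free update: after observing outcome o, increment its count, so that
# preferences drift toward what the agent habitually encounters.
observed_outcome = 2
alpha[observed_outcome] += 1.0

q = np.array([0.1, 0.2, 0.6, 0.1])  # a policy's predicted outcome distribution
print(policy_score(q, alpha))
```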
In this paper, we consider the relation between active inference and dynamic programming under the Bellman equation, which underlies many approaches to reinforcement learning and control.
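For reference, dynamic programming rests on the Bellman optimality equation for a discounted Markov decision process with states $s$, actions $a$, reward $r(s,a)$, transition probabilities $P(s' \mid s, a)$, and discount factor $\gamma \in [0,1)$:

\[
V^{*}(s) \;=\; \max_{a}\Big[\, r(s,a) \;+\; \gamma \sum_{s'} P(s' \mid s, a)\, V^{*}(s') \,\Big] ,
\]

and the comparison turns on when policies that minimize expected free energy also satisfy this fixed-point condition.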
In a more complex Animal-AI environment, our agents (using the same neural architecture) are able to simulate future state transitions and actions (i.e., plan), evincing reward-directed navigation despite temporary suspension of visual input.
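As a caricature of this kind of planning, the tabular sketch below (the transition model, reward predictor, and horizon are all stand-in assumptions for the paper's neural architecture) evaluates candidate action sequences by rolling a belief forward through a learned model; because the belief can be propagated without an observation update, the agent can keep selecting actions while visual input is suspended:

```python
# A minimal sketch of rollout-based planning with a learned transition model.
# All names and sizes here are hypothetical stand-ins, not the paper's setup.
import itertools
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS, HORIZON = 8, 3, 3

# Stand-ins for a learned transition model P(s'|s,a) and reward predictor r(s,a).
P = rng.dirichlet(np.ones(N_STATES), size=(N_STATES, N_ACTIONS))  # (s, a) -> dist over s'
R = rng.normal(size=(N_STATES, N_ACTIONS))                        # predicted reward

def plan(belief):
    """Exhaustively evaluate action sequences by simulating future state
    transitions under the model; return the first action of the best plan."""
    best_value, best_action = -np.inf, 0
    for seq in itertools.product(range(N_ACTIONS), repeat=HORIZON):
        b, value = belief.copy(), 0.0
        for a in seq:
            value += b @ R[:, a]   # expected reward under the current belief
            b = b @ P[:, a, :]     # propagate the belief through the model
        if value > best_value:
            best_value, best_action = value, seq[0]
    return best_action

# With visual input suspended, the belief is propagated through the model alone
# (no observation update), so the agent can keep acting "in the dark".
belief = np.zeros(N_STATES); belief[0] = 1.0
for t in range(5):
    a = plan(belief)
    belief = belief @ P[:, a, :]  # predictive update only: no observation available
    print(f"t={t}: action={a}")
```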
In this paper, we provide: 1) an accessible overview of the discrete-state formulation of active inference, highlighting behaviours that emerge naturally in active inference but are generally engineered by hand in RL; 2) an explicit discrete-state comparison between active inference and RL on an OpenAI Gym baseline.
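The sketch below illustrates the kind of side-by-side evaluation such a comparison implies: run two agents on the same Gym environment and tally episode returns. The agent is a placeholder, and the classic gym (<0.26) reset/step API is assumed; the paper's actual benchmark setup may differ.

```python
# A minimal sketch of evaluating an agent on an OpenAI Gym baseline.
# Assumes the classic gym<0.26 API, where reset() returns obs and
# step() returns (obs, reward, done, info).
import gym

def evaluate(env, select_action, episodes=100):
    """Return the mean episode return of a policy on the given environment."""
    returns = []
    for _ in range(episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            obs, reward, done, _ = env.step(select_action(obs))
            total += reward
        returns.append(total)
    return sum(returns) / len(returns)

env = gym.make("FrozenLake-v1")
random_policy = lambda obs: env.action_space.sample()  # stand-in for either agent
print("mean return:", evaluate(env, random_policy))
```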