The algorithm is evaluated in realistic safety-critical environments with non-stationary disturbances.
We propose a model-based approach that enables RL agents to explore an environment with unknown system dynamics and environment constraints effectively, given only a very small budget of constraint violations.
Existing neural network-based autonomous systems have been shown to be vulnerable to adversarial attacks; rigorous evaluation of their robustness is therefore of great importance.
Multi-agent navigation in dynamic environments is of great industrial value when deploying large-scale fleets of robots in real-world applications.
We propose a transition prior to account for the temporal dependencies in streaming data and update the mixture online via sequential variational inference.
We also test the proposed algorithm in traffic scenarios that require coordination of all autonomous vehicles to show the practical value of delay-awareness.
Action delays degrade the performance of reinforcement learning in many real-world systems.
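To make the effect of action delay concrete, the following is a minimal sketch and not the setup used in this work. It assumes a hypothetical one-dimensional toy environment (`DelayedActionEnv`) in which each action is executed `delay` steps after it is issued, and a hypothetical delay-unaware controller (`rollout_return`) that always issues the action that would cancel the currently observed position.

```python
from collections import deque


class DelayedActionEnv:
    """Hypothetical toy 1-D environment: actions take effect `delay` steps late."""

    def __init__(self, delay, start=0.0, noop=0.0):
        self.delay = delay
        self.start = start
        self.noop = noop
        self.reset()

    def reset(self):
        self.position = self.start
        # Pre-fill the buffer with no-op actions; these are executed first.
        self.pending = deque([self.noop] * self.delay)
        return self.position

    def step(self, action):
        # Queue the new action and execute the oldest pending one.
        self.pending.append(action)
        executed = self.pending.popleft()
        self.position += executed
        reward = -abs(self.position)  # cost is the distance from the origin
        return self.position, reward, executed


def rollout_return(delay, steps=20, start=1.0):
    """Total reward of a delay-unaware controller that issues action = -position."""
    env = DelayedActionEnv(delay, start=start)
    obs = env.reset()
    total = 0.0
    for _ in range(steps):
        obs, reward, _ = env.step(-obs)
        total += reward
    return total


if __name__ == "__main__":
    for d in (0, 1, 2):
        print(f"delay={d}: return={rollout_return(d):.2f}")
```

Without delay, this controller reaches the origin in a single step. With a delay of two steps, it keeps correcting for a position that has already changed, so the same policy overshoots and accumulates a lower return. A delay-aware agent would instead condition on the pending actions as well as the current observation.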
Results show that the adversarial scenarios generated by our method significantly degrade the performance of the tested vehicles.
We then train the generative model as an agent (or generator) to investigate the risky distribution parameters for the driving algorithm under evaluation.