Strategically-timed State-Observation Attacks on Deep Reinforcement Learning Agents

Deep reinforcement learning (DRL) policies are vulnerable to the adversarial attack on their observations, which may mislead real-world RL agents to catastrophic failures. Several works have shown the effectiveness of this type of adversarial attacks. But these adversaries are inclined to be detected because these adversaries do not inhibit their attacks activity. Recent works provide heuristic methods by attacking the victim agent at a small subset of time steps, but it aims at lack for theoretical principles. Inspired by the idea that adversarial attacks at each time step have different efforts, we denote a novel strategically-timed attack called Tentative Frame Attack for continuous control environments. We further propose a theoretical framework of finding optimal frame attack. Following this framework, we trained the frame attack strategy online with the victim agents and a fixed adversary. The empirical results show that our adversaries achieve the state-of-the-art performance on DRL agents which outperforms the full-timed attack.

PDF Abstract

Datasets


Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here