no code implementations • 15 Feb 2024 • Yinglun Xu, Rohan Gumaste, Gagandeep Singh
To the best of our knowledge, we propose the first black-box reward poisoning attack in the general offline RL setting.
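A black-box offline poisoning attack of this kind modifies rewards in a fixed dataset without knowing which algorithm will train on it. The sketch below is purely illustrative and not the paper's construction: all names (`poison_offline_dataset`, the budget/perturbation rule) are hypothetical, and the corruption rule (depress the highest-reward transitions) is a simple assumed heuristic.

```python
def poison_offline_dataset(dataset, budget, eps):
    """Illustrative black-box poisoning sketch (not the paper's attack):
    perturb rewards in an offline RL dataset without any knowledge of
    the learning algorithm. Each transition is (state, action, reward,
    next_state). At most `budget` transitions are corrupted, each by a
    bounded amount `eps`."""
    # Assumed heuristic: corrupt the highest-reward transitions first,
    # pushing their observed value down.
    order = sorted(range(len(dataset)), key=lambda i: -dataset[i][2])
    targets = set(order[:budget])
    poisoned = []
    for i, (s, a, r, s2) in enumerate(dataset):
        if i in targets:
            r = r - eps  # bounded per-transition perturbation
        poisoned.append((s, a, r, s2))
    return poisoned
```

Because the rule depends only on the dataset itself, it is black-box with respect to the learner.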
no code implementations • 30 Dec 2023 • Yinglun Xu, Gagandeep Singh
Our method ignores such state-action pairs during the second learning phase to achieve higher learning efficiency.
no code implementations • 15 Jul 2023 • Yinglun Xu, Bhuvesh Kumar, Jacob Abernethy
Efficient learning in multi-armed bandit mechanisms such as pay-per-click (PPC) auctions typically involves three challenges: 1) inducing truthful bidding behavior (incentives), 2) exploiting user-specific information for personalization (context), and 3) circumventing manipulations in click patterns (corruptions).
no code implementations • 18 May 2023 • Yinglun Xu, Gagandeep Singh
We leverage a general framework and identify conditions that ensure an efficient attack under general assumptions on the learning algorithms.
1 code implementation • 30 May 2022 • Yinglun Xu, Qi Zeng, Gagandeep Singh
We study reward poisoning attacks on online deep reinforcement learning (DRL), where the attacker is oblivious to the learning algorithm used by the agent and the dynamics of the environment.
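In this online setting the attacker sits between the agent and the environment and perturbs only the reward signal. The wrapper below is a minimal sketch of that threat model, not the paper's specific attack: the class name, the per-step bound, the total budget, and the push-toward-zero rule are all assumptions made for illustration.

```python
class RewardPoisoningWrapper:
    """Illustrative oblivious attacker (not the paper's exact attack):
    it sees only each step's reward, knowing neither the agent's
    learning algorithm nor the environment dynamics."""

    def __init__(self, env, per_step_bound, total_budget):
        self.env = env
        self.per_step_bound = per_step_bound  # max |perturbation| per step
        self.total_budget = total_budget      # total corruption allowed
        self.spent = 0.0

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # Assumed rule: push every observed reward toward zero, spending
        # at most per_step_bound per step until the budget is exhausted.
        delta = min(self.per_step_bound, abs(reward),
                    self.total_budget - self.spent)
        if delta > 0:
            reward -= delta if reward > 0 else -delta
            self.spent += delta
        return obs, reward, done, info
```

The agent interacts with the wrapper exactly as it would with the real environment, which is what makes the attack oblivious.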
no code implementations • NeurIPS 2021 • Yinglun Xu, Bhuvesh Kumar, Jacob D. Abernethy
To the best of our knowledge, we develop the first data corruption attack on stochastic multi-armed bandit algorithms that works without observing the algorithm's realized behavior.
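An attack that never observes the learner's realized behavior can only condition its corruption on which arm is pulled, not on the history of play. The sketch below illustrates that idea with an assumed rule (shift all non-target arms down); the function name, parameters, and shift rule are hypothetical, not the paper's construction.

```python
import random

def corrupted_reward(arm, true_means, target_arm, shift, rng, noise_std=0.1):
    """Oblivious corruption sketch for stochastic bandits (illustrative):
    the corruption depends only on the pulled arm's index, never on the
    learner's history, so the attacker need not observe the algorithm's
    realized behavior. Non-target arms are shifted down so the target
    arm appears optimal to any bandit algorithm."""
    reward = true_means[arm] + rng.gauss(0.0, noise_std)
    if arm != target_arm:
        reward -= shift
    return reward
```

Against such a corruption rule, any no-regret bandit algorithm would converge to the target arm, since the corrupted means make it look best.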