no code implementations • 18 Mar 2023 • Yiran Ye, Thai Le, Dongwon Lee
In this paper, we introduce a benchmark test set of online human-written perturbations for toxic speech detection models.
Adversarial Attack, Benchmarking