Search Results for author: Yiran Ye

Found 1 paper, 0 papers with code

NoisyHate: Benchmarking Content Moderation Machine Learning Models with Human-Written Perturbations Online

no code implementations • 18 Mar 2023 • Yiran Ye, Thai Le, Dongwon Lee

In this paper, we introduce a benchmark test set of human-written perturbations collected online for evaluating toxic speech detection models.

Adversarial Attack, Benchmarking (+1 more)
