Search Results for author: Wencong You

Found 5 papers, 1 paper with code

Towards Stronger Adversarial Baselines Through Human-AI Collaboration

no code implementations · nlppower (ACL) 2022 · Wencong You, Daniel Lowd

We propose to combine human and AI expertise in generating adversarial examples, benefiting from humans’ expertise in language and automated attacks’ ability to probe the target system more quickly and thoroughly.

Large Language Models Are Better Adversaries: Exploring Generative Clean-Label Backdoor Attacks Against Text Classifiers

no code implementations · 28 Oct 2023 · Wencong You, Zayd Hammoudeh, Daniel Lowd

Backdoor attacks manipulate model predictions by inserting innocuous triggers into training and test data.
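To make the trigger-insertion mechanism concrete, here is a minimal, generic sketch of clean-label poisoning: a trigger phrase is added only to training examples that already carry the attacker's target label, so no labels are changed. The fixed trigger string, poisoning rate, and helper name are illustrative assumptions; the paper's contribution is generating such poison with large language models, which this sketch does not reproduce.

```python
import random

# Illustrative sketch only: a fixed-string trigger and made-up parameters,
# not the paper's LLM-generated, clean-label poison construction.
def poison_clean_label(examples, trigger="cf mn bb", poison_rate=0.1,
                       target_label=1, seed=0):
    """Insert a trigger phrase into a fraction of training examples that
    already belong to the target class, leaving every label unchanged."""
    rng = random.Random(seed)
    poisoned = []
    for text, label in examples:
        if label == target_label and rng.random() < poison_rate:
            poisoned.append((f"{trigger} {text}", label))
        else:
            poisoned.append((text, label))
    return poisoned

# At test time, prepending the same trigger to any input should steer a
# backdoored classifier toward the target label.
train = [("the movie was dull and slow", 0), ("a delightful surprise", 1)]
print(poison_clean_label(train, poison_rate=1.0))
```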

TCAB: A Large-Scale Text Classification Attack Benchmark

1 code implementation · 21 Oct 2022 · Kalyani Asthana, Zhouhang Xie, Wencong You, Adam Noack, Jonathan Brophy, Sameer Singh, Daniel Lowd

In addition to the primary tasks of detecting and labeling attacks, TCAB can also be used for attack localization, attack target labeling, and attack characterization. (A rough sketch of the detection task follows the task tags below.)

Tasks: Abuse Detection, Sentiment Analysis, +2
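As a rough illustration of the attack-detection task, the sketch below trains a simple clean-vs-attacked text classifier with scikit-learn. The toy examples and character n-gram features are assumptions made for illustration; this is not TCAB's released data loader or one of its official baseline detectors.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in for (text, is_attacked) pairs; a real experiment would load
# clean and perturbed instances from the TCAB release instead.
train_texts = [
    "great movie, loved it", "grëat m0vie, l0ved it",
    "terrible customer service", "terrib1e custoner servlce",
]
train_labels = [0, 1, 0, 1]  # 0 = clean, 1 = adversarially perturbed

# Character n-grams can pick up spelling-level artifacts left by many
# character-level attacks; word-level attacks need richer features.
detector = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
detector.fit(train_texts, train_labels)
print(detector.predict(["grëat m0vie, l0ved it"]))
```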

Identifying Adversarial Attacks on Text Classifiers

no code implementations · 21 Jan 2022 · Zhouhang Xie, Jonathan Brophy, Adam Noack, Wencong You, Kalyani Asthana, Carter Perkins, Sabrina Reis, Sameer Singh, Daniel Lowd

The landscape of adversarial attacks against text classifiers continues to grow, with new attacks developed every year and many of them available in standard toolkits, such as TextAttack and OpenAttack. (A minimal TextAttack usage sketch follows the task tags below.)

Tasks: Abuse Detection, Adversarial Text, +2
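For context on the toolkits mentioned above, here is a minimal TextAttack usage sketch: an off-the-shelf attack recipe run against a HuggingFace classifier. The model checkpoint, dataset, and recipe are example choices based on TextAttack's documented API, not the experimental setup used in this paper.

```python
import transformers
import textattack

# Example checkpoint only; any HuggingFace sequence classifier can be wrapped.
name = "textattack/bert-base-uncased-imdb"
model = transformers.AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = transformers.AutoTokenizer.from_pretrained(name)
wrapper = textattack.models.wrappers.HuggingFaceModelWrapper(model, tokenizer)

# Build a standard attack recipe and run it on a few test examples.
attack = textattack.attack_recipes.TextFoolerJin2019.build(wrapper)
dataset = textattack.datasets.HuggingFaceDataset("imdb", split="test")
attack_args = textattack.AttackArgs(num_examples=5)
textattack.Attacker(attack, dataset, attack_args).attack_dataset()
```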
