Adversarial Text
33 papers with code • 0 benchmarks • 2 datasets
Adversarial Text refers to a specially crafted text sequence designed to influence the prediction of a language model. Adversarial text attacks are commonly carried out against Large Language Models (LLMs). Research into these adversarial approaches helps us build effective defense mechanisms that detect malicious text input, and ultimately more robust language models.
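To make the threat model concrete, here is a minimal sketch of a character-level perturbation of the kind many text attacks use. The function name and example sentence are illustrative, not taken from any specific paper:

```python
import random

def typo_perturb(text: str, word_index: int, seed: int = 0) -> str:
    """Create an adversarial candidate by swapping two adjacent
    characters inside one word (a common character-level attack).
    Illustrative only: real attacks search over many such edits."""
    rng = random.Random(seed)
    words = text.split()
    w = words[word_index]
    if len(w) < 2:
        return text
    i = rng.randrange(len(w) - 1)  # position of the swap
    words[word_index] = w[:i] + w[i + 1] + w[i] + w[i + 2:]
    return " ".join(words)

# A human still reads the same sentiment; a model may not.
print(typo_perturb("the movie was absolutely wonderful", 3))
```

A full attack would generate many such candidates and keep the one that changes the victim model's prediction.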
Benchmarks
These leaderboards are used to track progress in Adversarial Text.
Libraries
Use these libraries to find Adversarial Text models and implementations.
Latest papers with no code
Improved Training of Mixture-of-Experts Language GANs
In this work, we (1) empirically show that the mixture-of-experts approach enhances the representation capacity of the generator for language GANs, and (2) harness the Feature Statistics Alignment (FSA) paradigm to provide fine-grained learning signals that advance generator training.
TextDefense: Adversarial Text Detection based on Word Importance Entropy
TextDefense differs from previous approaches in that it utilizes the target model for detection and is thus attack-type agnostic.
A survey on text generation using generative adversarial networks
This work presents a thorough review of recent studies and advancements in text generation using Generative Adversarial Networks.
Adversarial Text Normalization
Additionally, the process to retrain a model is time and resource intensive, creating a need for a lightweight, reusable defense.
Data-Driven Mitigation of Adversarial Text Perturbation
We propose Continuous Word2Vec (CW2V), our data-driven method to learn word embeddings that ensures that perturbations of words have embeddings similar to those of the original words.
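The goal of keeping perturbed words close to their originals in embedding space can be illustrated with a toy subword embedding. This is purely illustrative: CW2V is a learned, data-driven method, while the character-trigram hashing below is just the simplest construction with the desired property:

```python
import hashlib
import math

def char_ngram_vec(word: str, n: int = 3, dim: int = 64):
    """Toy subword embedding: hash character trigrams into a fixed-size
    vector, so a small perturbation changes only a few n-grams and the
    vector stays close to the original word's vector."""
    v = [0.0] * dim
    padded = f"<{word}>"  # boundary markers, as in subword models
    for i in range(len(padded) - n + 1):
        h = int(hashlib.md5(padded[i:i + n].encode()).hexdigest(), 16)
        v[h % dim] += 1.0
    return v

def cos(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# A perturbed word stays close; an unrelated word does not.
print(cos(char_ngram_vec("wonderful"), char_ngram_vec("wonderfull")))  # high
print(cos(char_ngram_vec("wonderful"), char_ngram_vec("terrible")))    # low
```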
Identifying Adversarial Attacks on Text Classifiers
The landscape of adversarial attacks against text classifiers continues to grow, with new attacks developed every year and many of them available in standard toolkits, such as TextAttack and OpenAttack.
SemAttack: Natural Textual Attacks via Different Semantic Spaces
In particular, SemAttack optimizes the generated perturbations constrained on generic semantic spaces, including typo space, knowledge space (e.g., WordNet), contextualized semantic space (e.g., the embedding space of BERT clusterings), or the combination of these spaces.
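The search loop common to such attacks, generating candidates from a semantic space and keeping the one that hurts the victim model most, can be sketched as follows. The `best_perturbation` helper and the toy loss are hypothetical stand-ins, not SemAttack's actual optimization:

```python
def best_perturbation(candidates, loss):
    """Greedy attack step: among candidate rewrites drawn from some
    semantic space (typo neighbours, synonyms, embedding neighbours),
    keep the one that maximizes the victim model's loss."""
    return max(candidates, key=loss)

# Toy stand-in for a victim model's loss: it "understands" the
# literal word "good" but not its typo-space neighbour.
toy_loss = lambda t: 0.0 if "good" in t else 1.0

cands = ["the film was good", "the film was g00d", "the film was fine"]
print(best_perturbation(cands, toy_loss))  # "the film was g00d"
```

Real attacks replace `toy_loss` with the victim model's actual loss or confidence drop, and iterate this step over many positions in the text.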
Repairing Adversarial Texts through Perturbation
Furthermore, such attacks are impossible to eliminate entirely, i.e., adversarial perturbation remains possible even after applying mitigation methods such as adversarial training.
"That Is a Suspicious Reaction!": Interpreting Logits Variation to Detect NLP Adversarial Attacks
Adversarial attacks are a major challenge faced by current machine learning research.
Generating Watermarked Adversarial Texts
Adversarial example generation has been a hot topic in recent years because generated adversarial examples can cause deep neural networks (DNNs) to misclassify, revealing the vulnerability of DNNs and motivating the search for good solutions to improve their robustness.