Adversarial Text

33 papers with code • 0 benchmarks • 2 datasets

Adversarial Text refers to a specialised text sequence designed specifically to influence the prediction of a language model. Adversarial text attacks are generally carried out against Large Language Models (LLMs). Research into understanding different adversarial approaches can help us build effective defense mechanisms to detect malicious text input and to build robust language models.
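As a minimal illustration of the idea (using a hypothetical keyword classifier and a one-entry synonym table, not any specific published attack), a single synonym substitution can flip a brittle model's prediction:

```python
def toy_classifier(text: str) -> str:
    """Naive sentiment classifier: flags the word 'terrible' as negative."""
    return "negative" if "terrible" in text.lower() else "positive"

def synonym_attack(text: str, synonyms: dict[str, str]) -> str:
    """Replace each word that has a known synonym, preserving the rest."""
    return " ".join(synonyms.get(w.lower(), w) for w in text.split())

original = "The service was terrible"
adversarial = synonym_attack(original, {"terrible": "dreadful"})

print(toy_classifier(original))     # negative
print(toy_classifier(adversarial))  # positive: the prediction flipped
```

Real attacks apply the same principle against learned models, with semantic-similarity constraints so the perturbed text keeps its original meaning.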

Libraries

Use these libraries to find Adversarial Text models and implementations

Arabic Synonym BERT-based Adversarial Examples for Text Classification

norahalshahrani/bert_synonym_attack 5 Feb 2024

To evaluate the grammatical and semantic similarities of the newly produced adversarial examples using our synonym BERT-based attack, we invite four human evaluators to assess and compare the produced adversarial examples with their original examples.

RETSim: Resilient and Efficient Text Similarity

chenghaomou/text-dedup 28 Nov 2023

This paper introduces RETSim (Resilient and Efficient Text Similarity), a lightweight, multilingual deep learning model trained to produce robust metric embeddings for near-duplicate text retrieval, clustering, and dataset deduplication tasks.
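As a rough, hypothetical stand-in for a trained embedding model (RETSim itself is a deep model; here character trigram counts play the role of the embedding), the sketch below shows why metric embeddings make near-duplicate retrieval easy — the example strings are illustrative:

```python
import math
from collections import Counter

def embed(text: str, n: int = 3) -> Counter:
    """Character n-gram counts as a crude stand-in for a learned embedding."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

doc = "the quick brown fox jumps over the lazy dog"
near_dup = "the quick brown fox jumps over the lazy dgo"  # one typo
unrelated = "databases store rows in fixed-size pages"

sim_dup = cosine(embed(doc), embed(near_dup))
sim_unrelated = cosine(embed(doc), embed(unrelated))
print(sim_dup > sim_unrelated)  # True: the near-duplicate stays close
```

A learned metric embedding improves on this by also staying robust to typos and adversarial character substitutions that n-gram overlap alone would miss.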

VoteTRANS: Detecting Adversarial Text without Training by Voting on Hard Labels of Transformations

quocnsh/votetrans 2 Jun 2023

Specifically, VoteTRANS detects adversarial text by comparing the hard labels of input text and its transformation.
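The hard-label comparison can be sketched as follows; the toy classifier and the two character-level transformations are hypothetical stand-ins, not the paper's actual models or transformation set:

```python
def classify(text: str) -> str:
    """Toy hard-label classifier that only recognises the word 'bad'."""
    return "negative" if "bad" in text.lower() else "positive"

def detect_adversarial(text, classify, transforms, threshold=0.5):
    """Flag the input when transformed copies mostly disagree with the
    original hard label -- a rough sketch of the voting idea."""
    original = classify(text)
    votes = [classify(t(text)) for t in transforms]
    disagreement = sum(v != original for v in votes) / len(votes)
    return disagreement >= threshold

transforms = [
    lambda t: t.replace("b4d", "bad"),  # undo a leetspeak substitution
    lambda t: t.replace("b@d", "bad"),
]

print(detect_adversarial("the movie was b4d", classify, transforms))   # True
print(detect_adversarial("the movie was good", classify, transforms))  # False
```

The intuition: adversarial inputs sit near a decision boundary, so small transformations flip their label often, while clean inputs vote consistently — and no extra training is needed.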

Less is More: Removing Text-regions Improves CLIP Training Efficiency and Robustness

apple/axlearn 8 May 2023

In this paper, we discuss two effective approaches to improve the efficiency and robustness of CLIP training: (1) augmenting the training dataset while maintaining the same number of optimization steps, and (2) filtering out samples that contain text regions in the image.

A Pilot Study of Query-Free Adversarial Attack against Stable Diffusion

optml-group/qf-attack 29 Mar 2023

In this work, we study the problem of adversarial attack generation for Stable Diffusion and ask if an adversarial text prompt can be obtained even in the absence of end-to-end model queries.

Frauds Bargain Attack: Generating Adversarial Text Samples via Word Manipulation Process

mingzelucasni/fba 1 Mar 2023

In response, this study proposes a new method called the Fraud's Bargain Attack (FBA), which uses a randomization mechanism to expand the search space and produce high-quality adversarial examples with a higher probability of success.

RETVec: Resilient and Efficient Text Vectorizer

google-research/retvec NeurIPS 2023

The RETVec embedding model is pre-trained using pair-wise metric learning to be robust against typos and character-level adversarial attacks.

18 Feb 2023

Step by Step Loss Goes Very Far: Multi-Step Quantization for Adversarial Text Attacks

gmum/mango 10 Feb 2023

We propose a novel gradient-based attack against transformer-based language models that searches for an adversarial example in a continuous space of token probabilities.
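The general idea of searching a continuous space of token probabilities can be sketched as follows (a hypothetical toy objective and vocabulary, not the paper's multi-step quantization method): relax the discrete choice of a token into a softmax distribution, follow the gradient of a differentiable score, then quantize back to a token.

```python
import numpy as np

vocab = ["good", "fine", "bad", "awful"]
neg_score = np.array([0.1, 0.2, 0.8, 0.9])  # toy per-token negativity

logits = np.zeros(len(vocab))  # start from a uniform distribution
lr = 1.0
for _ in range(50):
    p = np.exp(logits) / np.exp(logits).sum()  # softmax over tokens
    # Expected negativity E = p . neg_score; ascend its gradient:
    # dE/dlogit_j = p_j * (neg_score_j - E)
    grad = p * (neg_score - p @ neg_score)
    logits += lr * grad

p = np.exp(logits) / np.exp(logits).sum()
adversarial_token = vocab[int(np.argmax(p))]  # quantize back to a token
print(adversarial_token)  # "awful", the highest-scoring replacement
```

Working in the continuous relaxation lets gradient methods explore the vocabulary smoothly; the hard part, which the paper addresses, is quantizing back to discrete tokens without losing the adversarial effect.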

RIATIG: Reliable and Imperceptible Adversarial Text-to-Image Generation With Natural Prompts

wustl-cspl/riatig CVPR 2023

The field of text-to-image generation has made remarkable strides in creating high-fidelity and photorealistic images.

01 Jan 2023

Ignore Previous Prompt: Attack Techniques For Language Models

agencyenterprise/promptinject 17 Nov 2022

Transformer-based large language models (LLMs) provide a powerful foundation for natural language tasks in large-scale customer-facing applications.
