Search Results for author: Aly M. Kassem

Found 3 papers, 1 paper with code

Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs

1 code implementation • 5 Mar 2024 • Aly M. Kassem, Omar Mahmoud, Niloofar Mireshghallah, Hyunwoo Kim, Yulia Tsvetkov, Yejin Choi, Sherif Saad, Santu Rana

In this paper, we introduce a black-box prompt optimization method that uses an attacker LLM agent to uncover higher levels of memorization in a victim agent than are revealed by prompting the target model with the training data directly, which is the dominant approach for quantifying memorization in LLMs.

Memorization

Finding a Needle in the Adversarial Haystack: A Targeted Paraphrasing Approach For Uncovering Edge Cases with Minimal Distribution Distortion

no code implementations • 21 Jan 2024 • Aly M. Kassem, Sherif Saad

TPRL leverages FLAN-T5, a language model, as a generator and employs a self-learned policy using a proximal policy gradient to generate adversarial examples automatically.

Language Modelling

Mitigating Approximate Memorization in Language Models via Dissimilarity Learned Policy

no code implementations • 2 May 2023 • Aly M. Kassem

However, these methods rely on explicit and implicit assumptions about the structure of the data to be protected, which often results in an incomplete solution to the problem.

Memorization

