Search Results for author: Aly M. Kassem

Found 3 papers, 1 papers with code

Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs

1 code implementation5 Mar 2024 Aly M. Kassem, Omar Mahmoud, Niloofar Mireshghallah, Hyunwoo Kim, Yulia Tsvetkov, Yejin Choi, Sherif Saad, Santu Rana

In this paper, we introduce a black-box prompt optimization method that uses an attacker LLM agent to uncover higher levels of memorization in a victim agent, compared to what is revealed by prompting the target model with the training data directly, which is the dominant approach of quantifying memorization in LLMs.

Memorization

Finding a Needle in the Adversarial Haystack: A Targeted Paraphrasing Approach For Uncovering Edge Cases with Minimal Distribution Distortion

no code implementations21 Jan 2024 Aly M. Kassem, Sherif Saad

TPRL leverages FLAN T5, a language model, as a generator and employs a self learned policy using a proximal policy gradient to generate the adversarial examples automatically.

Language Modelling

Mitigating Approximate Memorization in Language Models via Dissimilarity Learned Policy

no code implementations2 May 2023 Aly M. Kassem

However, these methods rely on explicit and implicit assumptions about the structure of the data to be protected, which often results in an incomplete solution to the problem.

Memorization

Cannot find the paper you are looking for? You can Submit a new open access paper.