Here, we propose the use of copyright traps, the inclusion of fictitious entries in original content, to detect the use of copyrighted materials in LLMs, focusing on models where memorization does not naturally occur.
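As a rough illustration of the trap idea, the sketch below shows how, once trap sequences have been embedded in released documents, memorization could later be probed by comparing a model's loss on the trap sequences against unseen reference sequences of similar style and length. The helper names, the margin, and the use of the Hugging Face transformers API are assumptions for illustration, not the paper's exact detection procedure.

```python
# Minimal sketch (assumed setup, not the paper's exact procedure): probe whether a
# released causal LM has likely seen trapped documents by comparing its average
# loss on trap sequences vs. unseen reference sequences.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def avg_loss(model, tokenizer, text: str) -> float:
    """Average token-level cross-entropy assigned to `text` by the model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        return model(ids, labels=ids).loss.item()

def traps_memorized(model, tokenizer, traps, references, margin=0.5) -> bool:
    """Flag possible training on trapped content when trap sequences receive
    markedly lower loss than comparable references (margin is illustrative)."""
    trap_loss = sum(avg_loss(model, tokenizer, t) for t in traps) / len(traps)
    ref_loss = sum(avg_loss(model, tokenizer, r) for r in references) / len(references)
    return ref_loss - trap_loss > margin
```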
First, we propose a procedure for developing and evaluating document-level membership inference for LLMs, leveraging data sources commonly used for training together with the model's release date.
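A minimal sketch of how such a member / non-member split might be constructed, assuming each candidate document comes from a source known to be commonly used for training and carries a publication date; the field names and cutoff logic are illustrative.

```python
# Illustrative split: documents published before the model's release date (and drawn
# from its known training sources) are treated as likely members; documents published
# after the release date serve as non-members.
from datetime import date

def split_by_release_date(documents, release_date: date):
    members = [d for d in documents if d["published"] < release_date]
    non_members = [d for d in documents if d["published"] >= release_date]
    return members, non_members
```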
While membership inference attacks (MIAs) based on shadow modeling have become the standard for evaluating the privacy of synthetic data, they currently assume the attacker has access to an auxiliary dataset sampled from a distribution similar to that of the training dataset.
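For concreteness, a sketch of the shadow-modeling setup under this auxiliary-data assumption is given below; `train_generator`, `extract_features`, the sample sizes, and the meta-classifier are placeholders rather than a specific attack from the literature.

```python
# Sketch of a shadow-modeling MIA against synthetic data under the auxiliary-data
# assumption: shadow generators are trained on auxiliary data with and without the
# target record, and a meta-classifier learns to tell the two cases apart from the
# resulting synthetic datasets.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def shadow_mia(aux_data: pd.DataFrame, target: pd.DataFrame,
               train_generator, extract_features, n_shadow: int = 100):
    features, labels = [], []
    for _ in range(n_shadow):
        base = aux_data.sample(n=500)                         # auxiliary data ~ training distribution
        for member in (0, 1):                                 # without / with the target record
            train = pd.concat([base, target]) if member else base
            synth = train_generator(train)                    # placeholder: returns a synthetic dataset
            features.append(extract_features(synth, target))  # placeholder: e.g. query-based statistics
            labels.append(member)
    return RandomForestClassifier().fit(np.array(features), labels)
```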
When evaluating the privacy of synthetic data releases, the choice of vulnerable records is as important as developing more accurate MIAs, including from a legal perspective.
Directly extending the shadow modeling technique from the black-box to the white-box setting has been shown, in general, not to outperform black-box-only attacks.
In this paper, we introduce the generic moment-to-moment (M$^2$M) method to perform a wide range of data exploration tasks from a single private sketch.
We show that the attacks found by QS consistently match or outperform, sometimes by a large margin, the best attacks from the literature.
Second, we propose a model-based attack, showing how an attacker with black-box access to the model can infer the correlations using shadow models trained on synthetic datasets.
Mobile phone metadata is increasingly used for humanitarian purposes in developing countries, where traditional data is scarce.