no code implementations • 15 Sep 2024 • Sahil Kuchlous, Marvin Li, Jeffrey G. Wang
By adopting the perspective of embedding spaces, we establish new fairness conditions for diffusion model development and evaluation.
no code implementations • 3 Mar 2024 • Marvin Li, Sitan Chen
Additionally, preliminary experiments on Stable Diffusion suggest critical windows may serve as a useful tool for diagnosing fairness and privacy violations in real-world diffusion models.
1 code implementation • 26 Feb 2024 • Jeffrey G. Wang, Jason Wang, Marvin Li, Seth Neel
In the fine-tuning setting, we find that a simple attack based on the ratio of the loss under the base model to the loss under the fine-tuned model achieves near-perfect MIA performance; we then leverage this MIA to extract a large fraction of the fine-tuning dataset from fine-tuned Pythia and Llama models.
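The loss-ratio attack described above can be sketched in a few lines. This is a hypothetical toy illustration, not the authors' implementation: the per-example losses here are hard-coded stand-ins for the per-token negative log-likelihoods one would compute from the actual base and fine-tuned language models, and the threshold is an arbitrary assumed value.

```python
import numpy as np

def loss_ratio_mia(base_nll, ft_nll, threshold=2.0):
    """Membership score = base-model loss / fine-tuned-model loss.
    Examples in the fine-tuning set tend to have a much lower loss
    under the fine-tuned model, so their ratio is large."""
    scores = base_nll / ft_nll
    return scores > threshold  # True = predicted member

# Toy stand-in losses (assumed values, not from real models):
# the first two examples were "fine-tuned on", so their loss drops sharply.
base_losses = np.array([3.2, 3.0, 3.1, 2.9])
ft_losses   = np.array([0.4, 0.5, 3.0, 2.8])
predictions = loss_ratio_mia(base_losses, ft_losses)
print(predictions)  # [ True  True False False]
```

In practice the threshold would be calibrated on held-out data, and the losses computed by running both checkpoints over the candidate text.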
no code implementations • 22 Oct 2023 • Marvin Li, Jason Wang, Jeffrey Wang, Seth Neel
In this paper, we present Model Perturbations (MoPe), a new method to identify with high confidence whether a given text is in the training data of a pre-trained language model, given white-box access to the model's parameters.
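The core MoPe statistic can be sketched as follows. This is a hedged toy sketch, not the authors' code: the idea is to measure the average loss increase under Gaussian noise added to the parameters, which relates to local curvature, under the assumption that training points sit at sharper minima. The quadratic `member_loss`/`nonmember_loss` functions below are hypothetical stand-ins for a real model's loss landscape.

```python
import numpy as np

rng = np.random.default_rng(0)

def mope_score(loss_fn, theta, sigma=0.05, n_samples=100):
    """Average increase in loss when parameters theta are perturbed
    with Gaussian noise of scale sigma (a curvature proxy)."""
    base = loss_fn(theta)
    deltas = [
        loss_fn(theta + sigma * rng.standard_normal(theta.shape)) - base
        for _ in range(n_samples)
    ]
    return float(np.mean(deltas))

# Toy landscapes (assumed): a sharper minimum for the "member" point.
member_loss    = lambda th: 10.0 * np.sum(th ** 2)  # high curvature
nonmember_loss = lambda th: 0.1 * np.sum(th ** 2)   # low curvature
theta = np.zeros(4)

score_member    = mope_score(member_loss, theta)
score_nonmember = mope_score(nonmember_loss, theta)
```

A higher score would flag the text as more likely to be in the training data; with a real language model, `loss_fn` would be the cross-entropy of the candidate text under the perturbed weights.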