no code implementations • 13 Dec 2023 • Tanya Akumu, Celia Cintas, Girmaw Abebe Tadesse, Adebayo Oshingbesan, Skyler Speakman, Edward McFowland III
Activation-space representations of deep neural networks (DNNs) are widely used for tasks such as natural language processing, anomaly detection, and speech recognition.
1 code implementation • 5 Dec 2023 • Miriam Rateike, Celia Cintas, John Wamburu, Tanya Akumu, Skyler Speakman
We introduce a weakly supervised auditing technique using a subset scanning approach to detect anomalous patterns in LLM activations from pre-trained models.