no code implementations • 22 Jul 2021 • Gihyuk Ko, Gyumin Lim
From this insight, we propose an unsupervised detection of adversarial examples using reconstructor networks trained only on model explanations of benign examples.
3 code implementations • 25 Jul 2017 • Anupam Datta, Matt Fredrikson, Gihyuk Ko, Piotr Mardziel, Shayak Sen
Machine-learnt systems inherit biases against protected classes (historically disparaged groups) from training data.
no code implementations • 22 May 2017 • Anupam Datta, Matthew Fredrikson, Gihyuk Ko, Piotr Mardziel, Shayak Sen
For a specific instantiation of this definition, we present a program analysis technique that detects instances of proxy use in a model, and provides a witness that identifies which parts of the corresponding program exhibit the behavior.