no code implementations • 25 Aug 2020 • Jan Neerbek
We show that our context-based approaches significantly outperform the family of previous state-of-the-art approaches for sensitive information detection, so-called keyword-based approaches, on real-world data and with human-labeled examples of sensitive and non-sensitive documents.
no code implementations • LREC 2020 • Jan Neerbek, Morten Eskildsen, Peter Dolog, Ira Assent
In this work, we present a corpus for the evaluation of sensitive information detection approaches that addresses the need for real-world sensitive information for empirical studies.