De-identification

20 papers with code • 0 benchmarks • 0 datasets

De-identification is the task of detecting privacy-related entities in text, such as person names, emails and contact data.
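As a rough illustration of the detection step, the sketch below finds and masks two structured identifier types with regular expressions. The patterns, labels, and function names are hypothetical and not taken from any listed paper. Person names generally need a learned tagger or lookup lists, so they are left out here.

```python
import re

# Hypothetical, deliberately simple patterns; real systems cover many more entity types.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def find_entities(text):
    """Return (start, end, label) spans for every pattern match, sorted by start offset."""
    spans = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


def mask(text):
    """Replace each detected span with a [LABEL] placeholder."""
    pieces, cursor = [], 0
    for start, end, label in find_entities(text):
        if start < cursor:
            # Skip a span that overlaps one already masked.
            continue
        pieces.append(text[cursor:start])
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


masked = mask("Contact jane.doe@example.com or call 555-123-4567.")
print(masked)
```

Replacing spans with typed placeholders rather than deleting them keeps the sentence readable and records which kind of entity was removed.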

Most implemented papers

Synthesis of Realistic ECG using Generative Adversarial Networks

Brophy-E/ECG_GAN_MBD 19 Sep 2019

Finally, we discuss the privacy concerns associated with sharing synthetic data produced by GANs and test their ability to withstand a simple membership inference attack.

Face Identity Disentanglement via Latent Space Mapping

YotamNitzan/ID-disentanglement 15 May 2020

Learning disentangled representations of data is a fundamental problem in artificial intelligence.

Publicly Available Clinical BERT Embeddings

EmilyAlsentzer/clinicalBERT WS 2019

Contextual word embedding models such as ELMo (Peters et al., 2018) and BERT (Devlin et al., 2018) have dramatically improved performance for many natural language processing (NLP) tasks in recent months.

Speech Pseudonymisation Assessment Using Voice Similarity Matrices

Voice-Privacy-Challenge/Voice-Privacy-Challenge-2020 30 Aug 2020

The proliferation of speech technologies and rising privacy legislation call for the development of privacy preservation solutions for speech applications.

De-identification of Patient Notes with Recurrent Neural Networks

Franck-Dernoncourt/NeuroNER 10 Jun 2016

It yields an F1-score of 97.85 on the i2b2 2014 dataset, with a recall of 97.38 and a precision of 98.32, and an F1-score of 99.23 on the MIMIC de-identification dataset, with a recall of 99.25 and a precision of 99.21.
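The three scores quoted above are related: F1 is the harmonic mean of precision and recall, so it always lies between the two. A minimal sketch of that relationship follows. The function name and the counts are hypothetical and are not results from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for illustration only.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
```

In de-identification, recall is often watched most closely, because a missed entity leaves identifying text in the output.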

Natural Language Generation for Electronic Health Records

scotthlee/nrc 1 Jun 2018

A variety of methods exist for generating synthetic electronic health records (EHRs), but they are not capable of generating unstructured text, like emergency department (ED) chief complaints, history of present illness or progress notes.

DEDUCE: A pattern matching method for automatic de-identification of Dutch medical text

vmenger/deduce Telematics and Informatics 2018

In order to use medical text for research purposes, it is necessary to de-identify the text for legal and privacy reasons.

Towards Automatic Generation of Shareable Synthetic Clinical Notes Using Neural Language Models

orenmel/synth-clinical-notes WS 2019

Large-scale clinical data is invaluable to driving many computational scientific advances today.

Feature Robustness in Non-stationary Health Records: Caveats to Deployable Model Performance in Common Clinical Machine Learning Tasks

MLforHealth/MIMIC_Generalisation 2 Aug 2019

When training clinical prediction models from electronic health records (EHRs), a key concern should be a model's ability to sustain performance over time when deployed, even as care practices, database systems, and population demographics evolve.