Discovering Invariances in Healthcare Neural Networks

We study the invariance characteristics of pre-trained predictive models by empirically learning transformations of the input that leave the prediction function approximately unchanged. To learn invariant transformations, we minimize the Wasserstein distance between the predictive distribution conditioned on the data instances and the predictive distribution conditioned on the transformed data instances. To avoid degenerate or merely perturbative transformations, we add a regularization term that penalizes similarity between the data and its transformed values. We theoretically analyze the correctness of the algorithm and the structure of its solutions. Applying the proposed technique to clinical time series data, we discover variables that commonly used LSTM models do not rely on for their predictions, especially when the LSTM is trained to be adversarially robust. We also analyze the invariances of BioBERT on clinical notes and discover words that it is invariant to.
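The objective described above can be sketched as a loss with two terms: a Wasserstein distance that pulls the predictive distributions on the original and transformed inputs together, and a regularizer that pushes the transformed input away from the original. The sketch below is illustrative only; the function names (`invariance_loss`, `wasserstein_1d`), the choice of a 1-D Wasserstein distance over ordered class indices, the negative-Euclidean similarity term, and the toy linear-softmax model are all assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over logits.
    e = np.exp(z - z.max())
    return e / e.sum()

def wasserstein_1d(p, q):
    # 1-D Wasserstein distance between two discrete distributions
    # over ordered class indices: the L1 distance between their CDFs.
    return np.abs(np.cumsum(p) - np.cumsum(q)).sum()

def invariance_loss(model, x, t_x, lam=0.1):
    """Sketch of the paper's objective: match predictive
    distributions while penalizing similarity between the
    input x and its transformed value t_x (so the identity
    transformation is not a trivial minimizer)."""
    p, q = model(x), model(t_x)
    similarity = -np.linalg.norm(x - t_x)  # closer inputs => higher similarity
    return wasserstein_1d(p, q) + lam * similarity

# Toy model that ignores feature 1 entirely (column of zeros),
# so any transformation of that feature is an exact invariance.
W = np.array([[1.0, 0.0],
              [0.0, 0.0],
              [-1.0, 0.0]])
model = lambda x: softmax(W @ x)

x = np.array([1.0, 0.5])
t_x = x + np.array([0.0, 2.0])  # shift only the ignored feature
```

Minimizing this loss over a family of transformations would then favor transformations that move the input as far as possible while leaving the predictive distribution unchanged; in the toy model above, shifting the ignored feature achieves zero Wasserstein distance, so the loss is driven negative by the regularizer alone.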
