Attention mechanism is effective in both focusing the deep learning models on
relevant features and interpreting them. However, attentions may be unreliable
since the networks that generate them are often trained in a weakly-supervised
manner. To overcome this limitation, we introduce the notion of input-dependent
uncertainty to the attention mechanism, such that it generates attention for
each feature with varying degrees of noise based on the given input, to learn
larger variance on instances it is uncertain about. We learn this
Uncertainty-aware Attention (UA) mechanism using variational inference, and
validate it on various risk prediction tasks from electronic health records on
which our model significantly outperforms existing attention models. The
analysis of the learned attentions shows that our model generates attentions
that comply with clinicians' interpretation, and provide richer interpretation
via learned variance. Further evaluation of both the accuracy of the
uncertainty calibration and the prediction performance with "I don't know"
decision show that UA yields networks with high reliability as well.