We extract a small corpus from a noisy source, the Electronic Health Record (EHR), for the task of summarizing a hospital admission from multiple notes.
Traditionally, the primary goal of LL is to balance the trade-off between stability (remembering past tasks) and plasticity (adapting to new tasks).
Deep neural networks have achieved impressive performance across a variety of domains.
Our gazetteer-based fusion model is data-efficient, achieving a +1.7 micro-F1 gain on the i2b2 dataset using 20% of the training data, and a +4.7 micro-F1 gain on novel entity mentions never seen during training.
However, to achieve state-of-the-art performance on out-of-domain tasks such as clinical named entity recognition and relation extraction, additional in-domain pre-training is required.
In this paper, we provide an in-depth analysis of KL-divergence minimization in the Forward and Backward orders, which shows that in the Backward order learners are reinforced via on-policy learning.
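For reference, with $p$ the target (e.g., data or teacher) distribution and $q_\theta$ the learner (our notation, not necessarily the paper's), the two orders are the standard ones:

$$
\mathrm{KL}(p \,\|\, q_\theta) = \mathbb{E}_{x \sim p}\!\left[\log \frac{p(x)}{q_\theta(x)}\right],
\qquad
\mathrm{KL}(q_\theta \,\|\, p) = \mathbb{E}_{x \sim q_\theta}\!\left[\log \frac{q_\theta(x)}{p(x)}\right].
$$

In the Backward order the expectation is taken under the learner's own samples, which is why minimizing it amounts to a form of on-policy learning.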
Sequence-to-sequence models such as transformers, now used in a wide variety of NLP tasks, typically need very high capacity to perform well.
We develop the first approximate inference algorithm for 1-Best (and M-Best) decoding in bidirectional neural sequence models by extending Beam Search (BS) to reason about both forward and backward time dependencies.
We observe that our method consistently outperforms BS and previously proposed techniques for diverse decoding from neural sequence models.
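To make the combination of forward and backward context concrete, here is a minimal, self-contained sketch, not the paper's models or its exact inference procedure: a left-to-right beam search proposes fillings for a gap between a known prefix and suffix, and completed hypotheses are re-scored with a right-to-left scorer so that both past and future context influence the final choice. The scorers `fwd_logp` and `bwd_logp` are hypothetical stand-ins for real sequence models.

```python
# Minimal sketch, assuming a fill-in-the-gap setting with toy scorers.

VOCAB = ["cat", "sat", "mat", "the"]

def fwd_logp(tok, left):
    """Toy forward score: log p(tok | left context); penalize immediate repeats."""
    return -1.0 if (left and tok == left[-1]) else -0.3

def bwd_logp(tok, right):
    """Toy backward score: log p(tok | right context); penalize immediate repeats."""
    return -1.0 if (right and tok == right[0]) else -0.3

def bidirectional_fill(prefix, suffix, gap_len, beam_size=3):
    beams = [([], 0.0)]                                   # (gap tokens, forward score)
    for _ in range(gap_len):                              # left-to-right expansion
        cands = [(toks + [w], score + fwd_logp(w, prefix + toks))
                 for toks, score in beams for w in VOCAB]
        beams = sorted(cands, key=lambda c: -c[1])[:beam_size]
    def total(toks, fwd_score):                           # add right-to-left evidence
        bwd_score = sum(bwd_logp(toks[i], toks[i + 1:] + suffix)
                        for i in range(len(toks)))
        return fwd_score + bwd_score
    return max(beams, key=lambda b: total(*b))[0]         # best filling under both directions

print(bidirectional_fill(["the"], ["on", "the", "mat"], gap_len=2))
```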
This paper formulates the search for a set of bounding boxes (as needed in object proposal generation) as a monotone submodular maximization problem over the space of all possible bounding boxes in an image.
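Because a coverage-style objective over boxes is monotone submodular, the classic greedy algorithm under a cardinality budget of $k$ boxes carries the usual $(1 - 1/e)$ approximation guarantee. The sketch below illustrates that greedy rule with a toy grid-coverage objective; the objective is our illustrative stand-in, not the paper's scoring function.

```python
from itertools import product

# Greedy maximization of a toy monotone submodular objective: the number of
# integer grid cells covered by the union of the chosen boxes.

def cells(box):
    """Integer grid cells covered by an axis-aligned box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return {(x, y) for x, y in product(range(x1, x2), range(y1, y2))}

def greedy_boxes(candidates, k):
    selected, covered = [], set()
    for _ in range(k):
        # Marginal coverage gain of each remaining box given what is already covered.
        gains = [(len(cells(b) - covered), b)
                 for b in candidates if b not in selected]
        if not gains:
            break
        gain, best = max(gains)
        if gain == 0:                      # nothing new to cover; stop early
            break
        selected.append(best)
        covered |= cells(best)
    return selected

boxes = [(0, 0, 10, 10), (5, 5, 15, 15), (20, 20, 25, 25), (0, 0, 9, 9)]
print(greedy_boxes(boxes, k=2))            # two boxes with complementary coverage
```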