We seek to address the lack of labeled data (and high cost of annotation) for
textual entailment in some domains. To that end, we first create (for
experimental purposes) an entailment dataset for the clinical domain, and a
highly competitive supervised entailment system, ENT, that is effective (out of
the box) on two domains. We then explore self-training and active learning
strategies to address the lack of labeled data. With self-training, we
successfully exploit unlabeled data to improve over ENT by 15% F-score on the
newswire domain, and 13% F-score on clinical data. With active learning, we
demonstrate that we can match (and even beat) ENT
using only 6.6% of the training data in the clinical domain, and only 5.8% of
the training data in the newswire domain.