Multi-Strategy Knowledge Distillation Based Teacher-Student Framework for Machine Reading Comprehension

“The irrelevant information in documents poses a great challenge for machine reading compre-hension (MRC). To deal with such a challenge current MRC models generally fall into twoseparate parts: evidence extraction and answer prediction where the former extracts the key evi-dence corresponding to the question and the latter predicts the answer based on those sentences.However such pipeline paradigms tend to accumulate errors i.e. extracting the incorrect evi-dence results in predicting the wrong answer. In order to address this problem we propose aMulti-Strategy Knowledge Distillation based Teacher-Student framework (MSKDTS) for ma-chine reading comprehension. In our approach we first take evidence and document respec-tively as the input reference information to build a teacher model and a student model. Then the multi-strategy knowledge distillation method transfers the knowledge from the teacher model to the student model at both feature and prediction level through knowledge distillation approach.Therefore in the testing phase the enhanced student model can predict answer similar to the teacher model without being aware of which sentence is the corresponding evidence in the docu-ment. Experimental results on the ReCO dataset demonstrate the effectiveness of our approachand further ablation studies prove the effectiveness of both knowledge distillation strategies.”

PDF Abstract

Datasets


Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods