no code implementations • RANLP 2021 • Haytham Elfdaeel, Stanislav Peshterliev
To reduce computational cost and latency, we propose decoupling the transformer machine reading comprehension (MRC) model into an input component and a cross component.
Knowledge Distillation • Machine Reading Comprehension • +1
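
The split described in the abstract can be illustrated with a toy sketch. The listing states only that the model is divided into an input component and a cross component. The sketch assumes that the input component encodes the question and the passage independently, so that passage encodings can be cached and reused across questions. All function names are hypothetical, and the arithmetic stands in for real transformer layers.

```python
from functools import lru_cache


def input_component(tokens):
    """Encode one sequence on its own, with no question-passage interaction.

    Toy stand-in for the lower transformer layers: each token maps to its length.
    """
    return tuple(float(len(t)) for t in tokens)


@lru_cache(maxsize=None)
def encode_passage(passage):
    """Assumption: the passage encoding depends only on the passage, so cache it."""
    return input_component(tuple(passage.split()))


def cross_component(q_repr, p_repr):
    """Process both representations jointly (toy stand-in for the upper layers)."""
    joint = q_repr + p_repr  # concatenate question and passage representations
    return sum(joint) / len(joint)


def score(question, passage):
    """Run the decoupled pipeline: separate encoding, then joint processing."""
    q_repr = input_component(tuple(question.split()))
    p_repr = encode_passage(passage)  # reused when the same passage recurs
    return cross_component(q_repr, p_repr)
```

Under this assumption, calling `score` with two different questions against the same passage encodes the passage only once, which is one way such a split could reduce cost and latency at inference time.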