Later Span Adaptation for Language Understanding

1 Jan 2021  ·  Rongzhou Bao, Zhuosheng Zhang, Hai Zhao

Pre-trained contextualized language models (PrLMs) broadly use fine-grained tokens (words or sub-words) as the minimal linguistic unit in the pre-training phase. Introducing span-level information during pre-training has been shown to further enhance PrLMs. However, such methods require enormous resources and lack adaptivity because of the huge computational cost of pre-training. Instead of fixing the linguistic unit of the input at an early stage, as nearly all previous work does, we propose a novel method that incorporates span-level information into the representations generated by PrLMs during the fine-tuning phase for better flexibility. In this way, the modeling of span-level text can adapt more readily to different downstream tasks. In detail, we divide each sentence into several spans according to a segmentation produced with a pre-sampled dictionary. Based on the sub-token-level representations provided by PrLMs, we strengthen the connections between the tokens within each span and obtain a representation enriched with span-level information. Experiments conducted on the GLUE benchmark show that our approach remarkably enhances the performance of PrLMs on various natural language understanding tasks.
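The sketch below illustrates one way the described pipeline could look: sub-tokens are greedily merged into spans found in a pre-sampled dictionary, then the PrLM's sub-token representations inside each span are pooled and added back to the tokens of that span. The greedy longest-match segmenter, the `SpanEnhancer` module name, and the mean-pooling choice are illustrative assumptions for exposition, not the paper's exact architecture.

```python
import torch
import torch.nn as nn


def greedy_span_segmentation(tokens, dictionary, max_span_len=4):
    """Greedily merge consecutive sub-tokens into spans found in a
    pre-sampled dictionary (longest match first); unmatched tokens
    become single-token spans. Returns a list of (start, end) indices."""
    spans, i = [], 0
    while i < len(tokens):
        match_len = 1
        for length in range(min(max_span_len, len(tokens) - i), 1, -1):
            candidate = "".join(tokens[i:i + length]).replace("##", "")
            if candidate in dictionary:
                match_len = length
                break
        spans.append((i, i + match_len))
        i += match_len
    return spans


class SpanEnhancer(nn.Module):
    """Hypothetical span-enhancement layer: pool the PrLM sub-token
    representations inside each span and add the projected span vector
    back to every token of that span."""

    def __init__(self, hidden_size):
        super().__init__()
        self.proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, hidden_states, spans):
        # hidden_states: (seq_len, hidden_size) for a single sentence
        enhanced = hidden_states.clone()
        for start, end in spans:
            span_vec = hidden_states[start:end].mean(dim=0)  # mean-pool the span
            enhanced[start:end] = hidden_states[start:end] + self.proj(span_vec)
        return enhanced


# Usage sketch: segment the sub-tokens, then enhance the PrLM output.
tokens = ["new", "##found", "##land", "is", "an", "island"]
dictionary = {"newfoundland"}
spans = greedy_span_segmentation(tokens, dictionary)      # [(0, 3), (3, 4), (4, 5), (5, 6)]
hidden = torch.randn(len(tokens), 768)                    # stand-in for PrLM output
enhanced = SpanEnhancer(768)(hidden, spans)
```

Because the span enhancement happens after the frozen-format pre-training, the same PrLM checkpoint can be paired with different dictionaries or segmenters per downstream task, which is the adaptivity argument made in the abstract.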
