Pre-Training with Whole Word Masking for Chinese BERT

19 Jun 2019 · Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu

Bidirectional Encoder Representations from Transformers (BERT) has shown marvelous improvements across various NLP tasks. Recently, an upgraded version of BERT has been released with Whole Word Masking (WWM), which mitigates the drawbacks of masking partial WordPiece tokens in pre-training BERT...
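As a rough illustration of the idea behind WWM, the sketch below masks all WordPiece tokens of a word together rather than individual pieces, using the "##" continuation convention of English WordPiece tokenization. The function name `whole_word_mask` and the `mask_prob` parameter are illustrative assumptions, not the authors' released implementation; for Chinese text the paper derives word boundaries from a word segmenter instead of "##" markers.

```python
import random

MASK_TOKEN = "[MASK]"

def whole_word_mask(tokens, mask_prob=0.15, seed=0):
    """Minimal whole-word masking sketch: if a word is selected,
    mask every WordPiece belonging to it (pieces starting with
    '##' continue the preceding word)."""
    rng = random.Random(seed)
    # Group token indices into whole-word spans.
    spans = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and spans:
            spans[-1].append(i)
        else:
            spans.append([i])
    masked = list(tokens)
    for span in spans:
        if rng.random() < mask_prob:
            for i in span:
                masked[i] = MASK_TOKEN
    return masked

# Example: a word split into several WordPieces is masked as a unit.
tokens = ["the", "man", "went", "to", "phil", "##am", "##mon"]
print(whole_word_mask(tokens, mask_prob=0.5, seed=1))
```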

