2 code implementations • 10 Jan 2022 • Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
In such a scenario, which we call Language-Model-as-a-Service (LMaaS), the gradients of PTMs are usually unavailable.
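To make the black-box setting concrete, below is a minimal gradient-free tuning sketch: only a forward scoring API is queried, and a low-dimensional vector is optimized through a fixed random projection. The `query_api` callable and all hyperparameters are hypothetical stand-ins; the original work optimizes such a projected subspace with a derivative-free optimizer, which plain random search replaces here to keep the sketch self-contained.

```python
import numpy as np

def black_box_tune(query_api, prompt_dim=500, subspace_dim=10,
                   iters=200, sigma=1.0):
    """Gradient-free prompt tuning sketch for the LMaaS setting.

    `query_api(prompt_embedding) -> loss` stands in for the provider's
    inference API (hypothetical name); only forward calls are possible,
    so we never touch gradients of the PTM.
    """
    rng = np.random.default_rng(0)
    # Fixed random projection from a small search space to the prompt space.
    A = rng.normal(size=(prompt_dim, subspace_dim)) / np.sqrt(subspace_dim)
    z = np.zeros(subspace_dim)
    best_loss = query_api(A @ z)
    for _ in range(iters):
        candidate = z + sigma * rng.normal(size=subspace_dim)
        loss = query_api(A @ candidate)   # one black-box forward call
        if loss < best_loss:              # keep the candidate if it helps
            best_loss, z = loss, candidate
    return A @ z                          # continuous prompt to prepend
```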
1 code implementation • 13 Sep 2021 • Yunfan Shao, Zhichao Geng, Yitao Liu, Junqi Dai, Hang Yan, Fei Yang, Li Zhe, Hujun Bao, Xipeng Qiu
In this paper, we take advantage of previous pre-trained models (PTMs) and propose a novel Chinese Pre-trained Unbalanced Transformer (CPT).
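As a structural illustration only: one reading of an "unbalanced" Transformer is a deep shared encoder feeding two shallow branches, one for understanding tasks and one for generation. The sketch below encodes that shape; all depths, dimensions, and the branch split are illustrative assumptions, not CPT's published configuration.

```python
import torch.nn as nn

class UnbalancedTransformerSketch(nn.Module):
    """Deep shared encoder, two shallow task branches (illustrative depths)."""
    def __init__(self, d_model=768, nhead=12):
        super().__init__()
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=10)   # deep, shared
        self.u_branch = nn.TransformerEncoder(enc_layer, num_layers=2)   # shallow understanding branch
        self.g_branch = nn.TransformerDecoder(dec_layer, num_layers=2)   # shallow generation branch

    def forward(self, src_emb, tgt_emb=None):
        memory = self.encoder(src_emb)
        if tgt_emb is None:
            return self.u_branch(memory)        # understanding path (e.g. classification)
        return self.g_branch(tgt_emb, memory)   # generation path (seq2seq decoding)
```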
1 code implementation • ACL 2021 • Xiaonan Li, Yunfan Shao, Tianxiang Sun, Hang Yan, Xipeng Qiu, Xuanjing Huang
To alleviate this problem, we extend the recently successful early-exit mechanism to accelerate the inference of PTMs for sequence labeling tasks.
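A minimal sketch of token-level early exit for sequence labeling, assuming one tagging head per layer: a token's label is frozen as soon as some head predicts it confidently, and deeper layers are skipped once every token has exited. The `layers`/`classifiers` interfaces and the threshold are illustrative assumptions, not this paper's exact mechanism.

```python
import torch

def early_exit_tag(layers, classifiers, x, threshold=0.95):
    """layers: transformer layer modules; classifiers: one tagging head per
    layer; x: (seq_len, d) token representations. Returns one label id per
    token, exiting early where the model is already confident."""
    seq_len = x.size(0)
    done = torch.zeros(seq_len, dtype=torch.bool)
    labels = torch.full((seq_len,), -1, dtype=torch.long)
    for layer, head in zip(layers, classifiers):
        x = layer(x)
        probs = torch.softmax(head(x), dim=-1)      # (seq_len, num_labels)
        conf, pred = probs.max(dim=-1)
        newly_done = (~done) & (conf >= threshold)  # tokens that exit at this layer
        labels[newly_done] = pred[newly_done]
        done |= newly_done
        if done.all():                              # skip all remaining layers
            break
    labels[~done] = pred[~done]                     # leftovers take the last prediction
    return labels
```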
no code implementations • 29 Dec 2020 • Linyang Li, Yunfan Shao, Demin Song, Xipeng Qiu, Xuanjing Huang
The substitutions in the generated adversarial examples are not characters or words but 'pieces', which are more natural to Chinese readers.
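In the same spirit, a hedged sketch of a piece-level substitution attack: mask one subword piece at a time, let a masked language model propose natural replacements, and keep the first substitution that flips a victim classifier's prediction. The `victim` callable and the greedy left-to-right search are hypothetical simplifications; the paper's actual search strategy may differ.

```python
from transformers import pipeline

def attack_with_pieces(text, victim, top_k=5):
    """victim(text) -> label is a hypothetical black-box classifier."""
    fill_mask = pipeline("fill-mask", model="bert-base-chinese")
    tokenizer = fill_mask.tokenizer
    original_label = victim(text)
    tokens = tokenizer.tokenize(text)
    for i in range(len(tokens)):
        masked = tokens.copy()
        masked[i] = tokenizer.mask_token            # mask one 'piece'
        masked_text = tokenizer.convert_tokens_to_string(masked)
        for cand in fill_mask(masked_text, top_k=top_k):
            if victim(cand["sequence"]) != original_label:
                return cand["sequence"]             # prediction flipped
    return None                                     # no successful substitution found
```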
1 code implementation • COLING 2020 • Tianxiang Sun, Yunfan Shao, Xipeng Qiu, Qipeng Guo, Yaru Hu, Xuanjing Huang, Zheng Zhang
In the emerging line of work on incorporating factual knowledge into pre-trained language models such as BERT, most existing models use shallow, static, and separately pre-trained entity embeddings, which limits their performance gains.
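For contrast, here is a sketch of the "shallow, static" baseline this sentence criticizes: a frozen, separately pre-trained entity embedding table is simply added at mention positions, so entity vectors never adapt to their textual context. Names and shapes are illustrative, not this paper's model.

```python
import torch
import torch.nn as nn

class StaticEntityFusion(nn.Module):
    """Inject frozen, separately pre-trained entity vectors into token states."""
    def __init__(self, num_entities, d_model):
        super().__init__()
        self.entity_embed = nn.Embedding(num_entities, d_model)
        self.entity_embed.weight.requires_grad = False   # static: never fine-tuned

    def forward(self, token_states, entity_ids, mention_positions):
        # token_states: (seq_len, d); one entity id per mention position.
        fused = token_states.clone()
        fused[mention_positions] += self.entity_embed(entity_ids)  # context-independent injection
        return fused
```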
3 code implementations • 18 Mar 2020 • Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, Xuanjing Huang
Recently, the emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era.
1 code implementation • 12 Nov 2019 • Tianxiang Sun, Yunfan Shao, Xiaonan Li, Pengfei Liu, Hang Yan, Xipeng Qiu, Xuanjing Huang
Most existing deep multi-task learning models are based on parameter sharing, such as hard sharing, hierarchical sharing, and soft sharing.
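Hard sharing, the simplest of these schemes, fits in a few lines: one encoder is shared by every task and only the output heads are task-specific. The sketch below is a generic illustration under assumed shapes, not this paper's proposed model.

```python
import torch.nn as nn

class HardSharingModel(nn.Module):
    """Hard parameter sharing: shared encoder, one classifier head per task."""
    def __init__(self, vocab_size, d_model, task_label_counts):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encoder = nn.LSTM(d_model, d_model, batch_first=True)  # shared by all tasks
        self.heads = nn.ModuleList(
            nn.Linear(d_model, n) for n in task_label_counts)       # task-specific heads

    def forward(self, tokens, task_id):
        h, _ = self.encoder(self.embed(tokens))
        return self.heads[task_id](h[:, -1])  # classify from the final hidden state
```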
2 code implementations • NAACL 2019 • Qipeng Guo, Xipeng Qiu, Pengfei Liu, Yunfan Shao, Xiangyang Xue, Zheng Zhang
Although the Transformer has achieved great success on many NLP tasks, its heavy structure with fully connected attention leads to a dependency on large training data. (A minimal sketch of the star-topology alternative follows the task list below.)
Ranked #12 on Sentiment Analysis on SST-5 Fine-grained classification • Named Entity Recognition (NER) • Natural Language Inference • +2
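As noted above, a simplified single-head sketch of star-topology attention: each satellite token attends only to its ring neighbours, itself, and a shared relay node, while the relay attends over all satellites, replacing the O(n²) fully connected pattern with O(n) connections. The ring wrap-around, single head, and update order are simplifying assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def star_attention_step(h, relay, w_q, w_k, w_v):
    """One round of star-topology attention (single head, simplified).

    h: (n, d) satellite token states; relay: (d,) shared relay state;
    w_q, w_k, w_v: (d, d) projection matrices."""
    n, d = h.shape
    scale = d ** 0.5
    new_h = torch.empty_like(h)
    for i in range(n):
        # Context: left neighbour, self, right neighbour, and the relay node.
        ctx = torch.stack([h[(i - 1) % n], h[i], h[(i + 1) % n], relay])
        q = h[i] @ w_q                              # (d,)
        scores = (ctx @ w_k) @ q / scale            # (4,)
        new_h[i] = F.softmax(scores, dim=0) @ (ctx @ w_v)
    # The relay node attends over every satellite plus itself.
    ctx = torch.cat([new_h, relay.unsqueeze(0)])    # (n + 1, d)
    scores = (ctx @ w_k) @ (relay @ w_q) / scale
    new_relay = F.softmax(scores, dim=0) @ (ctx @ w_v)
    return new_h, new_relay
```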