1 code implementation • EMNLP 2021 • Peijie Jiang, Dingkun Long, Yueheng Sun, Meishan Zhang, Guangwei Xu, Pengjun Xie
Self-training is one promising solution, though its central difficulty lies in constructing a set of high-quality pseudo training instances for the target domain.
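As a loose illustration of that idea (not the paper's method), a minimal self-training loop keeps only high-confidence target-domain predictions as pseudo labels; the toy classifier and the 0.9 threshold below are invented for the sketch:

    # Minimal self-training sketch: iteratively add high-confidence
    # target-domain predictions to the labeled training set.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X_src = rng.normal(0, 1, (200, 5))
    y_src = (X_src[:, 0] > 0).astype(int)       # labeled source domain
    X_tgt = rng.normal(0.5, 1, (200, 5))        # unlabeled target domain

    X_train, y_train = X_src.copy(), y_src.copy()
    for _ in range(3):
        clf = LogisticRegression().fit(X_train, y_train)
        proba = clf.predict_proba(X_tgt)
        keep = proba.max(axis=1) > 0.9          # keep only high-quality pseudo labels
        X_train = np.vstack([X_src, X_tgt[keep]])
        y_train = np.concatenate([y_src, proba.argmax(axis=1)[keep]])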
no code implementations • 22 May 2023 • Zehan Li, Yanzhao Zhang, Dingkun Long, Pengjun Xie
Recently, various studies have been directed towards exploring dense passage retrieval techniques employing pre-trained language models, among which the masked auto-encoder (MAE) pre-training architecture has emerged as the most promising.
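In rough terms, MAE-style retrieval pre-training pairs a deep encoder with a very shallow decoder that must reconstruct masked tokens largely from the encoder's sentence-level bottleneck, forcing a strong single-vector representation. A toy sketch with invented dimensions, not the paper's architecture:

    import torch, torch.nn as nn

    VOCAB, DIM, MASK_ID = 1000, 64, 0

    class TinyMAE(nn.Module):
        def __init__(self):
            super().__init__()
            self.emb = nn.Embedding(VOCAB, DIM)
            layer = lambda: nn.TransformerEncoderLayer(DIM, 4, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer(), num_layers=4)  # deep encoder
            self.decoder = nn.TransformerEncoder(layer(), num_layers=1)  # weak, shallow decoder
            self.lm_head = nn.Linear(DIM, VOCAB)

        def forward(self, tokens, mask):
            x = self.emb(torch.where(mask, torch.full_like(tokens, MASK_ID), tokens))
            h = self.encoder(x)
            cls = h[:, :1]                            # bottleneck: one sentence vector
            dec_in = torch.cat([cls, x[:, 1:]], dim=1)  # decoder sees bottleneck + masked embeddings
            return self.lm_head(self.decoder(dec_in))

    tokens = torch.randint(1, VOCAB, (2, 16))
    mask = torch.rand(2, 16) < 0.3
    logits = TinyMAE()(tokens, mask)
    loss = nn.functional.cross_entropy(logits[mask], tokens[mask])  # reconstruct masked tokens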
2 code implementations • 27 Oct 2022 • Peijie Jiang, Dingkun Long, Yanzhao Zhang, Pengjun Xie, Meishan Zhang, Min Zhang
We apply BABERT to feature induction for Chinese sequence labeling tasks (see the sketch below).
Ranked #1 on Chinese Word Segmentation on MSRA
Tasks: Chinese Named Entity Recognition, Chinese Word Segmentation, +3
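"Feature induction" here means using the pre-trained encoder's contextual representations as features for a downstream tagger. A generic sketch of that pattern, with a random encoder standing in for BABERT and invented dimensions:

    import torch, torch.nn as nn

    # Frozen pre-trained encoder: its representations are induced as features.
    encoder = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(64, 4, batch_first=True), num_layers=2)
    for p in encoder.parameters():
        p.requires_grad = False

    tagger = nn.Linear(64, 5)          # e.g., BMES + O tags for segmentation
    x = torch.randn(2, 10, 64)         # stand-in for character embeddings
    tag_logits = tagger(encoder(x))    # per-character tag scores, (2, 10, 5)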
1 code implementation • 27 Oct 2022 • Dingkun Long, Yanzhao Zhang, Guangwei Xu, Pengjun Xie
Pre-trained language models (PTMs) have been shown to yield powerful text representations for the dense passage retrieval task.
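A dense retriever built on a PTM is typically a dual encoder: queries and passages are embedded independently and matched by vector similarity. A toy stand-in where a bag of embeddings replaces the PTM:

    import torch, torch.nn as nn

    class DualEncoder(nn.Module):
        def __init__(self, vocab=1000, dim=64):
            super().__init__()
            self.emb = nn.EmbeddingBag(vocab, dim)   # toy encoder: mean of embeddings

        def forward(self, ids):
            return nn.functional.normalize(self.emb(ids), dim=-1)

    enc = DualEncoder()
    q = enc(torch.randint(0, 1000, (1, 8)))      # query embedding
    p = enc(torch.randint(0, 1000, (100, 32)))   # 100 passage embeddings
    scores = q @ p.T                             # cosine similarities
    top5 = scores.topk(5).indices                # retrieve 5 nearest passages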
1 code implementation • 21 May 2022 • Yanzhao Zhang, Dingkun Long, Guangwei Xu, Pengjun Xie
Existing text retrieval systems with state-of-the-art performance usually adopt a retrieve-then-reranking architecture due to the high computational cost of pre-trained language models and the large corpus size.
Ranked #1 on Passage Re-Ranking on MS MARCO
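The retrieve-then-reranking architecture mentioned above separates a cheap first-stage retriever, run over the whole corpus, from an expensive reranker applied only to a short candidate list. A schematic sketch with synthetic vectors and a made-up scoring function standing in for the PLM cross-encoder:

    import numpy as np

    rng = np.random.default_rng(0)
    corpus = rng.normal(size=(10000, 64))   # pre-computed passage vectors
    query = rng.normal(size=64)

    # Stage 1: cheap dot-product retrieval over the full corpus.
    top100 = np.argsort(corpus @ query)[-100:]

    # Stage 2: expensive reranker scores only the 100 candidates.
    def cross_encoder_score(q, p):          # stand-in for a PLM cross-encoder
        return float(q @ p) - 0.1 * float(np.abs(q - p).sum())

    reranked = sorted(top100,
                      key=lambda i: cross_encoder_score(query, corpus[i]),
                      reverse=True)[:10]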
1 code implementation • 7 Mar 2022 • Dingkun Long, Qiong Gao, Kuan Zou, Guangwei Xu, Pengjun Xie, Ruijie Guo, Jian Xu, Guanjun Jiang, Luxi Xing, Ping Yang
We find that the performance of retrieval models trained on general-domain datasets inevitably decreases on specific domains.
no code implementations • 24 Oct 2020 • Haoyu Zhang, Dingkun Long, Guangwei Xu, Pengjun Xie, Fei Huang, Ji Wang
Keyphrase extraction (KE) aims to produce a set of phrases that accurately express a concept or topic covered in a given document.
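For orientation only (this is a classic unsupervised candidate-plus-scoring baseline, unrelated to the paper's model), KE is often approached by generating candidate phrases between stopwords and ranking them:

    from collections import Counter
    import re

    STOP = {"the", "of", "a", "in", "to", "is", "and", "for"}

    def extract_keyphrases(doc, k=3):
        # Candidates: maximal runs of non-stopword tokens; score by frequency * length.
        tokens = re.findall(r"[a-z]+", doc.lower())
        phrases, cur = [], []
        for t in tokens:
            if t in STOP:
                if cur:
                    phrases.append(" ".join(cur))
                    cur = []
            else:
                cur.append(t)
        if cur:
            phrases.append(" ".join(cur))
        counts = Counter(phrases)
        ranked = sorted(counts.items(), key=lambda x: (-x[1] * len(x[0].split()), x[0]))
        return [p for p, _ in ranked[:k]]

    print(extract_keyphrases("keyphrase extraction summarizes the topics of a document; "
                             "keyphrase extraction is a classic task"))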
1 code implementation • ACL 2020 • Ning Ding, Dingkun Long, Guangwei Xu, Muhua Zhu, Pengjun Xie, Xiaobin Wang, Hai-Tao Zheng
In order to simultaneously alleviate these two issues, this paper proposes to couple distant annotation and adversarial training for cross-domain CWS.
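One common realization of adversarial cross-domain training is the gradient-reversal trick: a tagger is trained on labeled source data while a domain discriminator, whose gradients are flipped, pushes the shared features to be domain-invariant. The toy shapes below are invented and may differ from the paper's exact setup:

    import torch, torch.nn as nn

    class GradReverse(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x):
            return x
        @staticmethod
        def backward(ctx, g):
            return -g                     # reverse gradients into the feature extractor

    features = nn.Linear(32, 32)          # shared feature extractor
    tagger = nn.Linear(32, 4)             # segmentation tags (labeled source domain)
    domain_clf = nn.Linear(32, 2)         # source-vs-target discriminator

    x_src, y_src = torch.randn(8, 32), torch.randint(0, 4, (8,))
    x_tgt = torch.randn(8, 32)            # unlabeled target domain
    h_src, h_tgt = features(x_src), features(x_tgt)

    seg_loss = nn.functional.cross_entropy(tagger(h_src), y_src)
    h_all = torch.cat([h_src, h_tgt])
    d_lab = torch.cat([torch.zeros(8, dtype=torch.long), torch.ones(8, dtype=torch.long)])
    adv_loss = nn.functional.cross_entropy(domain_clf(GradReverse.apply(h_all)), d_lab)
    (seg_loss + adv_loss).backward()      # features are pushed toward domain invariance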
no code implementations • ACL 2020 • Jie Zhou, Chunping Ma, Dingkun Long, Guangwei Xu, Ning Ding, Haoyu Zhang, Pengjun Xie, Gongshen Liu
Hierarchical text classification is an essential yet challenging subtask of multi-label text classification with a taxonomic hierarchy.
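What makes the task hierarchical is the taxonomy over labels: predicting a node generally entails all of its ancestors, so targets are multi-label sets closed under the hierarchy. A tiny illustration with an invented taxonomy:

    # Toy taxonomy: predicting a leaf implies all of its ancestors.
    PARENT = {"cnn": "deep_learning", "deep_learning": "ml", "svm": "ml", "ml": None}

    def with_ancestors(labels):
        out = set(labels)
        for lab in labels:
            while PARENT.get(lab) is not None:
                lab = PARENT[lab]
                out.add(lab)
        return out

    print(with_ancestors({"cnn"}))   # {'cnn', 'deep_learning', 'ml'}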
no code implementations • 3 Mar 2019 • Bokang Zhu, Richong Zhang, Dingkun Long, Yongyi Mao
Gated models resolve this conflict by adaptively adjusting their state-update equations, whereas the vanilla RNN resolves it by assigning different tasks to different state dimensions.
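The contrast shows up directly in the state-update equations: a gate z interpolates, per dimension, between keeping the old state and writing a new one, while the vanilla update overwrites every dimension the same way. A simplified GRU-flavored step (reset gate omitted) next to the vanilla update:

    import torch

    def vanilla_step(h, x, W, U):
        return torch.tanh(x @ W + h @ U)             # every dimension overwritten uniformly

    def gated_step(h, x, W, U, Wz, Uz):
        h_tilde = torch.tanh(x @ W + h @ U)          # candidate state
        z = torch.sigmoid(x @ Wz + h @ Uz)           # per-dimension update gate
        return z * h_tilde + (1 - z) * h             # adaptively keep or overwrite state

    d = 8
    h, x = torch.zeros(1, d), torch.randn(1, d)
    W, U, Wz, Uz = (torch.randn(d, d) * 0.1 for _ in range(4))
    print(gated_step(h, x, W, U, Wz, Uz))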
no code implementations • 20 Nov 2016 • Dingkun Long, Richong Zhang, Yongyi Mao
For this purpose, we design a simple and controllable task, called the "memorization problem", where the networks are trained to memorize certain targeted information.
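The abstract does not spell the task out; one guess at the flavor of such a setup (not the paper's exact definition) is that the network sees a target token followed by distractors and must recall the target at the end:

    import random

    def make_example(T=20, vocab=10):
        # Input: target token, then T distractor tokens; label: recall the target.
        target = random.randrange(vocab)
        distractors = [random.randrange(vocab) for _ in range(T)]
        return [target] + distractors, target

    seq, label = make_example()
    print(seq[:5], "... recall:", label)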