1 code implementation • EMNLP 2021 • Peijie Jiang, Dingkun Long, Yueheng Sun, Meishan Zhang, Guangwei Xu, Pengjun Xie
Self-training is one promising solution for it, which struggles to construct a set of high-quality pseudo training instances for the target domain.
1 code implementation • 27 Oct 2022 • Dingkun Long, Yanzhao Zhang, Guangwei Xu, Pengjun Xie
Pre-trained language models (PTMs) have been shown to yield powerful text representations for the dense passage retrieval task.
1 code implementation • 21 May 2022 • Yanzhao Zhang, Dingkun Long, Guangwei Xu, Pengjun Xie
Existing text retrieval systems with state-of-the-art performance usually adopt a retrieve-then-reranking architecture due to the high computational cost of pre-trained language models and the large corpus size.
Ranked #1 on Passage Re-Ranking on MS MARCO
1 code implementation • NAACL 2022 • Linzhi Wu, Pengjun Xie, Jie Zhou, Meishan Zhang, Chunping Ma, Guangwei Xu, Min Zhang
Prior research has mainly resorted to heuristic rule-based constraints to reduce the noise for specific self-augmentation methods individually.
1 code implementation • ACL 2022 • Xin Zhang, Guangwei Xu, Yueheng Sun, Meishan Zhang, Xiaobin Wang, Min Zhang
Recent work on opinion expression identification (OEI) relies heavily on the quality and scale of the manually constructed training corpus, a requirement that can be extremely difficult to satisfy.
1 code implementation • ACL 2022 • Yongliang Shen, Xiaobin Wang, Zeqi Tan, Guangwei Xu, Pengjun Xie, Fei Huang, Weiming Lu, Yueting Zhuang
Each instance query predicts one entity, and by feeding all instance queries simultaneously, we can query all entities in parallel.
Ranked #1 on Nested Named Entity Recognition on GENIA
Chinese Named Entity Recognition
named-entity-recognition
1 code implementation • 7 Mar 2022 • Dingkun Long, Qiong Gao, Kuan Zou, Guangwei Xu, Pengjun Xie, Ruijie Guo, Jian Xu, Guanjun Jiang, Luxi Xing, Ping Yang
We find that the performance of retrieval models trained on general-domain datasets inevitably decreases on specific domains.
1 code implementation • 17 Feb 2022 • Boli Chen, Guangwei Xu, Xiaobin Wang, Pengjun Xie, Meishan Zhang, Fei Huang
Named Entity Recognition (NER) from speech is among Spoken Language Understanding (SLU) tasks, aiming to extract semantic information from the speech signal.
Automatic Speech Recognition (ASR)
no code implementations • 24 Aug 2021 • Ning Ding, Yulin Chen, Xu Han, Guangwei Xu, Pengjun Xie, Hai-Tao Zheng, Zhiyuan Liu, Juanzi Li, Hong-Gee Kim
In this work, we investigate the application of prompt-learning on fine-grained entity typing in fully supervised, few-shot and zero-shot scenarios.
1 code implementation • ACL 2021 • Xin Zhang, Guangwei Xu, Yueheng Sun, Meishan Zhang, Pengjun Xie
Crowdsourcing is regarded as one prospective solution for effective supervised learning, aiming to build large-scale annotated training data by crowd workers.
5 code implementations • ACL 2021 • Ning Ding, Guangwei Xu, Yulin Chen, Xiaobin Wang, Xu Han, Pengjun Xie, Hai-Tao Zheng, Zhiyuan Liu
In this paper, we present Few-NERD, a large-scale human-annotated few-shot NER dataset with a hierarchy of 8 coarse-grained and 66 fine-grained entity types.
Ranked #5 on
Few-shot NER
on Few-NERD (INTRA)
1 code implementation • ICLR 2021 • Boli Chen, Yao Fu, Guangwei Xu, Pengjun Xie, Chuanqi Tan, Mosha Chen, Liping Jing
We introduce a Poincaré probe, a structural probe projecting these embeddings into a Poincaré subspace with explicitly defined hierarchies.
1 code implementation • ICLR 2021 • Ning Ding, Xiaobin Wang, Yao Fu, Guangwei Xu, Rui Wang, Pengjun Xie, Ying Shen, Fei Huang, Hai-Tao Zheng, Rui Zhang
This approach allows us to learn meaningful, interpretable prototypes for the final classification.
no code implementations • 24 Oct 2020 • Haoyu Zhang, Dingkun Long, Guangwei Xu, Pengjun Xie, Fei Huang, Ji Wang
Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document.
1 code implementation • ACL 2020 • Ning Ding, Dingkun Long, Guangwei Xu, Muhua Zhu, Pengjun Xie, Xiaobin Wang, Hai-Tao Zheng
In order to simultaneously alleviate these two issues, this paper proposes to couple distant annotation and adversarial training for cross-domain CWS.
no code implementations • ACL 2020 • Jie Zhou, Chunping Ma, Dingkun Long, Guangwei Xu, Ning Ding, Haoyu Zhang, Pengjun Xie, Gongshen Liu
Hierarchical text classification is an essential yet challenging subtask of multi-label text classification with a taxonomic hierarchy.
no code implementations • WS 2018 • Chen Li, Junpei Zhou, Zuyi Bao, Hengyou Liu, Guangwei Xu, Linlin Li
In the correction stage, candidates were generated by the three GEC models and then merged to output the final corrections for M and S types.
no code implementations • IJCNLP 2017 • Yi Yang, Pengjun Xie, Jun Tao, Guangwei Xu, Linlin Li, Luo Si
This paper introduces the Alibaba NLP team's system for the IJCNLP 2017 shared task No.
Ranked #1 on 2D Human Pose Estimation on Alibaba Cluster Trace (using extra training data)