no code implementations • Findings (ACL) 2022 • Xin Lv, Yankai Lin, Yixin Cao, Lei Hou, Juanzi Li, Zhiyuan Liu, Peng Li, Jie Zhou
In recent years, pre-trained language models (PLMs) have been shown to capture factual knowledge from massive texts, which has encouraged the development of PLM-based knowledge graph completion (KGC) models.
1 code implementation • EMNLP 2021 • Yuan Yao, Jiaju Du, Yankai Lin, Peng Li, Zhiyuan Liu, Jie Zhou, Maosong Sun
Existing relation extraction (RE) methods typically focus on extracting relational facts between entity pairs within single sentences or documents.
1 code implementation • ACL 2022 • Pei Ke, Hao Zhou, Yankai Lin, Peng Li, Jie Zhou, Xiaoyan Zhu, Minlie Huang
Existing reference-free metrics have obvious limitations for evaluating controlled text generation models.
no code implementations • 26 Mar 2022 • Sha Yuan, Hanyu Zhao, Shuai Zhao, Jiahong Leng, Yangxiao Liang, Xiaozhi Wang, Jifan Yu, Xin Lv, Zhou Shao, Jiaao He, Yankai Lin, Xu Han, Zhenghao Liu, Ning Ding, Yongming Rao, Yizhao Gao, Liang Zhang, Ming Ding, Cong Fang, Yisen Wang, Mingsheng Long, Jing Zhang, Yinpeng Dong, Tianyu Pang, Peng Cui, Lingxiao Huang, Zheng Liang, Huawei Shen, Hui Zhang, Quanshi Zhang, Qingxiu Dong, Zhixing Tan, Mingxuan Wang, Shuo Wang, Long Zhou, Haoran Li, Junwei Bao, Yingwei Pan, Weinan Zhang, Zhou Yu, Rui Yan, Chence Shi, Minghao Xu, Zuobai Zhang, Guoqiang Wang, Xiang Pan, Mengjie Li, Xiaoyu Chu, Zijun Yao, Fangwei Zhu, Shulin Cao, Weicheng Xue, Zixuan Ma, Zhengyan Zhang, Shengding Hu, Yujia Qin, Chaojun Xiao, Zheni Zeng, Ganqu Cui, Weize Chen, Weilin Zhao, Yuan Yao, Peng Li, Wenzhao Zheng, Wenliang Zhao, Ziyi Wang, Borui Zhang, Nanyi Fei, Anwen Hu, Zenan Ling, Haoyang Li, Boxi Cao, Xianpei Han, Weidong Zhan, Baobao Chang, Hao Sun, Jiawen Deng, Chujie Zheng, Juanzi Li, Lei Hou, Xigang Cao, Jidong Zhai, Zhiyuan Liu, Maosong Sun, Jiwen Lu, Zhiwu Lu, Qin Jin, Ruihua Song, Ji-Rong Wen, Zhouchen Lin, Liwei Wang, Hang Su, Jun Zhu, Zhifang Sui, Jiajun Zhang, Yang Liu, Xiaodong He, Minlie Huang, Jian Tang, Jie Tang
With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks becomes a popular paradigm.
1 code implementation • Findings (ACL) 2022 • Yujia Qin, Jiajie Zhang, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
We experiment with ELLE on streaming data from 5 domains using BERT and GPT.
1 code implementation • ACL 2022 • Deming Ye, Yankai Lin, Peng Li, Maosong Sun, Zhiyuan Liu
Pre-trained language models (PLMs) struggle to recall the rich factual knowledge about entities exhibited in large-scale corpora, especially for rare entities.
no code implementations • 14 Dec 2021 • Lei Li, Yankai Lin, Xuancheng Ren, Guangxiang Zhao, Peng Li, Jie Zhou, Xu Sun
As many fine-tuned pre-trained language models (PLMs) with promising performance are generously released, investigating better ways to reuse these models is vital, as it can greatly reduce the retraining computational cost and the potential environmental side effects.
1 code implementation • 12 Nov 2021 • Yusheng Su, Xiaozhi Wang, Yujia Qin, Chi-Min Chan, Yankai Lin, Huadong Wang, Kaiyue Wen, Zhiyuan Liu, Peng Li, Juanzi Li, Lei Hou, Maosong Sun, Jie Zhou
To explore whether we can improve PT via prompt transfer, we empirically investigate the transferability of soft prompts across different downstream tasks and PLMs in this work.
Tasks: Natural Language Processing, Natural Language Understanding, +2
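As a rough illustration of the cross-task transfer setting studied here, the sketch below initializes a target-task soft prompt from a prompt trained on a source task instead of from scratch; the module and file names are illustrative assumptions, not taken from the paper's code.

```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Trainable soft prompt prepended to the input embeddings of a frozen PLM."""

    def __init__(self, prompt_len=100, hidden_size=768):
        super().__init__()
        self.embeddings = nn.Parameter(torch.randn(prompt_len, hidden_size) * 0.02)

    def forward(self, input_embeds):
        # input_embeds: (batch, seq_len, hidden) -> prepend the prompt to every example
        batch = input_embeds.size(0)
        prompt = self.embeddings.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)

# Cross-task prompt transfer: initialize the target-task prompt from a
# source-task prompt instead of from random noise (file name is hypothetical).
source_prompt = SoftPrompt()
# source_prompt.load_state_dict(torch.load("prompt_source_task.pt"))
target_prompt = SoftPrompt()
target_prompt.embeddings.data.copy_(source_prompt.embeddings.data)
# target_prompt is then tuned on the target task while the PLM stays frozen.
```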
1 code implementation • 15 Oct 2021 • Yujia Qin, Xiaozhi Wang, Yusheng Su, Yankai Lin, Ning Ding, Jing Yi, Weize Chen, Zhiyuan Liu, Juanzi Li, Lei Hou, Peng Li, Maosong Sun, Jie Zhou
In the experiments, we study diverse few-shot NLP tasks and surprisingly find that, in a 250-dimensional subspace found with 100 tasks, tuning only 250 free parameters recovers 97% of the full prompt tuning performance for the 100 seen tasks (using different training data) and 83% for 20 unseen tasks, demonstrating the strong generalization ability of the found intrinsic task subspace.
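The sketch below illustrates the idea of tuning a prompt through a low-dimensional intrinsic subspace: only a 250-dimensional vector is trainable, and a frozen projection (assumed to have been found beforehand from the prompts of many training tasks) maps it back to the full prompt space. All names and dimensions are illustrative, not the paper's exact setup.

```python
import torch
import torch.nn as nn

class SubspacePrompt(nn.Module):
    """Soft prompt parameterized by a low-dimensional "intrinsic" vector.

    Only `z` is trainable; the projection back to the full prompt space is
    assumed to have been learned earlier (e.g. over prompts from many tasks)
    and is kept frozen here.
    """

    def __init__(self, intrinsic_dim=250, prompt_len=100, hidden_size=768):
        super().__init__()
        self.z = nn.Parameter(torch.zeros(intrinsic_dim))           # tuned per task
        self.proj = nn.Linear(intrinsic_dim, prompt_len * hidden_size, bias=False)
        self.proj.weight.requires_grad_(False)                      # frozen projection
        self.prompt_len, self.hidden_size = prompt_len, hidden_size

    def forward(self):
        # Map the 250 free parameters to a full (prompt_len, hidden_size) prompt.
        return self.proj(self.z).view(self.prompt_len, self.hidden_size)

prompt = SubspacePrompt()
print(sum(p.numel() for p in prompt.parameters() if p.requires_grad))  # 250
```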
1 code implementation • EMNLP 2021 • Wenkai Yang, Yankai Lin, Peng Li, Jie Zhou, Xu Sun
Motivated by this observation, we construct a word-based robustness-aware perturbation to distinguish poisoned samples from clean samples to defend against the backdoor attacks on natural language processing (NLP) models.
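A minimal sketch of the word-based robustness check described above: inject a perturbation word and flag samples whose prediction barely changes, since backdoored inputs tend to stay locked onto the attacker's target class. The classifier interface, perturbation word, and threshold are illustrative assumptions, not the paper's exact procedure.

```python
from typing import Callable

def looks_poisoned(text: str,
                   target_prob: Callable[[str], float],
                   perturb_word: str = "cf",
                   drop_threshold: float = 0.1) -> bool:
    """Flag a sample whose prediction is unusually robust to a word-level perturbation.

    target_prob: returns the model's probability for the suspected target class.
    Clean samples usually lose noticeable confidence when a rare word is injected,
    while backdoored samples keep predicting the target class; a small drop is
    therefore treated as a sign of poisoning (threshold calibrated on clean data).
    """
    original = target_prob(text)
    perturbed = target_prob(perturb_word + " " + text)
    return (original - perturbed) < drop_threshold
```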
1 code implementation • NeurIPS 2021 • Deli Chen, Yankai Lin, Guangxiang Zhao, Xuancheng Ren, Peng Li, Jie Zhou, Xu Sun
The class imbalance problem, as an important issue in learning node representations, has drawn increasing attention from the community.
1 code implementation • Findings (ACL) 2022 • Zhengyan Zhang, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
In this work, we study the computational patterns of FFNs and observe that most inputs activate only a small fraction of the neurons in an FFN.
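The sketch below shows one simple way to measure such activation sparsity: a ReLU feed-forward block that records, per token, the fraction of intermediate neurons with non-zero output. The sizes follow a BERT-base-like FFN but are only illustrative.

```python
import torch
import torch.nn as nn

class ProbedFFN(nn.Module):
    """Transformer-style feed-forward block that records how many neurons fire."""

    def __init__(self, hidden_size=768, intermediate_size=3072):
        super().__init__()
        self.fc1 = nn.Linear(hidden_size, intermediate_size)
        self.fc2 = nn.Linear(intermediate_size, hidden_size)
        self.act = nn.ReLU()

    def forward(self, x):
        h = self.act(self.fc1(x))
        # Fraction of intermediate neurons with non-zero activation per token.
        self.activation_ratio = (h > 0).float().mean(dim=-1)
        return self.fc2(h)

ffn = ProbedFFN()
tokens = torch.randn(4, 16, 768)            # (batch, seq_len, hidden)
out = ffn(tokens)
print(ffn.activation_ratio.mean().item())   # reportedly a small fraction in trained PLMs
```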
1 code implementation • EMNLP 2021 • Lei Li, Yankai Lin, Shuhuai Ren, Peng Li, Jie Zhou, Xu Sun
Knowledge distillation (KD) has proved effective for compressing large-scale pre-trained language models.
1 code implementation • ACL 2022 • Deming Ye, Yankai Lin, Peng Li, Maosong Sun
In particular, we propose a neighborhood-oriented packing strategy, which considers the neighbor spans integrally to better model the entity boundary information.
Ranked #1 on Named Entity Recognition on Few-NERD (SUP)
Tasks: Joint Entity and Relation Extraction, Named Entity Recognition
no code implementations • Findings (ACL) 2022 • Lan Jiang, Tianshu Lyu, Yankai Lin, Meng Chong, Xiaoyong Lyu, Dawei Yin
To determine whether TM models have adopted such a heuristic, we introduce an adversarial evaluation scheme that invalidates the heuristic.
1 code implementation • ACL 2021 • Wenkai Yang, Yankai Lin, Peng Li, Jie Zhou, Xu Sun
In this work, we point out a potential problem in current backdoor attack research: its evaluation ignores the stealthiness of backdoor attacks, and most existing backdoor attack methods are stealthy to neither system deployers nor system users.
1 code implementation • ACL 2022 • Weize Chen, Xu Han, Yankai Lin, Hexu Zhao, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
Hyperbolic neural networks have shown great potential for modeling complex data.
1 code implementation • ACL 2021 • Ziqi Wang, Xiaozhi Wang, Xu Han, Yankai Lin, Lei Hou, Zhiyuan Liu, Peng Li, Juanzi Li, Jie Zhou
Event extraction (EE) has considerably benefited from pre-trained language models (PLMs) by fine-tuning.
2 code implementations • 28 May 2021 • Yujia Qin, Yankai Lin, Jing Yi, Jiajie Zhang, Xu Han, Zhengyan Zhang, Yusheng Su, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
Specifically, we introduce a pre-training framework named "knowledge inheritance" (KI) and explore how knowledge distillation can serve as auxiliary supervision during pre-training to efficiently learn larger PLMs.
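A minimal sketch of using knowledge distillation as auxiliary supervision during pre-training, assuming the logits of a smaller, already-trained teacher PLM are available for the same batch: the self-supervised LM loss is mixed with a temperature-scaled KL term. The weighting and temperature here are illustrative, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def ki_loss(student_logits, teacher_logits, labels, alpha=0.5, temperature=2.0):
    """Self-supervised LM loss plus distillation from a smaller, already-trained PLM.

    student_logits, teacher_logits: (batch, seq_len, vocab); labels: (batch, seq_len)
    with ignored positions set to -100. `alpha` balances learning from raw text
    against inheriting knowledge from the teacher.
    """
    lm = F.cross_entropy(student_logits.transpose(1, 2), labels, ignore_index=-100)
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    return (1 - alpha) * lm + alpha * kd
```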
1 code implementation • NAACL 2021 • Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
To address this issue, we propose a dynamic token reduction approach named TR-BERT to accelerate PLM inference; it flexibly adapts the number of layers each token passes through during inference to avoid redundant computation.
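The sketch below illustrates the token-reduction step only: given per-token importance scores (in TR-BERT these come from a small policy network trained with reinforcement learning, assumed available here), keep the top-scoring tokens and let the rest skip the remaining layers. Shapes and the keep ratio are illustrative.

```python
import torch

def select_tokens(hidden_states, scores, keep_ratio=0.5):
    """Keep the highest-scoring tokens and drop the rest for the remaining layers.

    hidden_states: (batch, seq_len, hidden); scores: (batch, seq_len) importance
    scores. Dropped tokens do not pass through the later, expensive layers.
    """
    batch, seq_len, hidden = hidden_states.shape
    k = max(1, int(seq_len * keep_ratio))
    keep_idx = scores.topk(k, dim=1).indices.sort(dim=1).values   # preserve token order
    index = keep_idx.unsqueeze(-1).expand(batch, k, hidden)
    return hidden_states.gather(1, index), keep_idx

states = torch.randn(2, 128, 768)
scores = torch.rand(2, 128)
reduced, kept = select_tokens(states, scores, keep_ratio=0.25)
print(reduced.shape)  # torch.Size([2, 32, 768])
```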
1 code implementation • Findings (ACL) 2021 • Tianyu Gao, Xu Han, Keyue Qiu, Yuzhuo Bai, Zhiyu Xie, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
Distantly supervised (DS) relation extraction (RE) has attracted much attention in the past few years as it can utilize large-scale auto-labeled data.
1 code implementation • 7 Feb 2021 • Yusheng Su, Xu Han, Yankai Lin, Zhengyan Zhang, Zhiyuan Liu, Peng Li, Jie Zhou, Maosong Sun
We then perform contrastive semi-supervised learning on both the retrieved unlabeled and original labeled instances to help PLMs capture crucial task-related semantic features.
no code implementations • 7 Feb 2021 • Zhiyuan Liu, Yankai Lin, Maosong Sun
This book aims to review and present the recent advances of distributed representation learning for NLP, including why representation learning can improve NLP, how representation learning takes part in various important topics of NLP, and what challenges are still not well addressed by distributed representation.
1 code implementation • ACL 2021 • Yujia Qin, Yankai Lin, Ryuichi Takanobu, Zhiyuan Liu, Peng Li, Heng Ji, Minlie Huang, Maosong Sun, Jie Zhou
Pre-trained Language Models (PLMs) have shown superior performance on various downstream Natural Language Processing (NLP) tasks.
1 code implementation • Findings (EMNLP) 2021 • Lei Li, Yankai Lin, Deli Chen, Shuhuai Ren, Peng Li, Jie Zhou, Xu Sun
On the other hand, the exiting decisions made by internal classifiers are unreliable, leading to wrongly emitted early predictions.
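For context, the sketch below shows the common confidence-threshold early-exit scheme whose per-layer exit decisions the entry above argues are unreliable; the layer and classifier interfaces and the threshold are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def early_exit_predict(hidden, layers, classifiers, confidence=0.9):
    """Confidence-based early exiting over a stack of Transformer layers.

    `layers` and `classifiers` are parallel lists (one internal classifier per layer).
    Inference stops at the first layer whose classifier is confident enough.
    """
    for layer, clf in zip(layers, classifiers):
        hidden = layer(hidden)
        probs = F.softmax(clf(hidden.mean(dim=1)), dim=-1)   # pooled sentence logits
        top_p, pred = probs.max(dim=-1)
        if top_p.item() >= confidence:
            return pred, probs                               # exit early here
    return pred, probs                                       # fall through to the last layer

layers = [torch.nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
          for _ in range(4)]
classifiers = [torch.nn.Linear(64, 2) for _ in range(4)]
pred, probs = early_exit_predict(torch.randn(1, 10, 64), layers, classifiers)
```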
no code implementations • 14 Dec 2020 • Deli Chen, Yankai Lin, Lei Li, Xuancheng Ren, Peng Li, Jie Zhou, Xu Sun
Graph Contrastive Learning (GCL) has proven highly effective in promoting the performance of Semi-Supervised Node Classification (SSNC).
no code implementations • 28 Oct 2020 • Xiaoyu Kou, Yankai Lin, Yuntao Li, Jiahao Xu, Peng Li, Jie Zhou, Yan Zhang
Knowledge graph embedding (KGE), aiming to embed entities and relations into low-dimensional vectors, has attracted wide attention recently.
1 code implementation • EMNLP 2020 • Xiaoyu Kou, Yankai Lin, Shaobo Liu, Peng Li, Jie Zhou, Yan Zhang
Graph embedding (GE) methods embed the nodes (and/or edges) of a graph into a low-dimensional semantic space, and have shown their effectiveness in modeling multi-relational data.
1 code implementation • EMNLP 2020 • Hao Peng, Tianyu Gao, Xu Han, Yankai Lin, Peng Li, Zhiyuan Liu, Maosong Sun, Jie Zhou
We find that (i) while context is the main source to support the predictions, RE models also heavily rely on the information from entity mentions, most of which is type information, and (ii) existing datasets may leak shallow heuristics via entity mentions and thus contribute to the high performance on RE benchmarks.
Ranked #17 on Relation Extraction on TACRED
1 code implementation • 29 Sep 2020 • Yusheng Su, Xu Han, Zhengyan Zhang, Peng Li, Zhiyuan Liu, Yankai Lin, Jie Zhou, Maosong Sun
In this paper, we propose a novel framework named Coke to dynamically select contextual knowledge and embed knowledge context according to textual context for PLMs, which can avoid the effect of redundant and ambiguous knowledge in KGs that cannot match the input text.
no code implementations • ACL 2020 • Xu Han, Yi Dai, Tianyu Gao, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
Continual relation learning aims to continually train a model on new data to learn incessantly emerging novel relations while avoiding catastrophically forgetting old relations.
1 code implementation • ACL 2020 • Qiu Ran, Yankai Lin, Peng Li, Jie Zhou
By dynamically determining segment length and deleting repetitive segments, RecoverSAT is capable of recovering from repetitive and missing token errors.
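The helper below only mimics the recovery effect for the repetitive-segment case: a segment that merely repeats its predecessor is discarded. RecoverSAT itself learns to emit a special deletion token during decoding; this is not the model, just an illustration of the resulting behaviour.

```python
def delete_repetitive_segments(segments):
    """Drop a segment that merely repeats the previous one.

    `segments` is a list of token lists produced in parallel by a
    semi-autoregressive decoder; a repeated segment is simply discarded.
    """
    kept = []
    for seg in segments:
        if kept and seg == kept[-1]:
            continue  # repetitive segment: recover by discarding it
        kept.append(seg)
    return [tok for seg in kept for tok in seg]

print(delete_repetitive_segments([["the", "cat"], ["the", "cat"], ["sat", "down"]]))
# ['the', 'cat', 'sat', 'down']
```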
1 code implementation • EMNLP 2020 • Xiaozhi Wang, Ziqi Wang, Xu Han, Wangyi Jiang, Rong Han, Zhiyuan Liu, Juanzi Li, Peng Li, Yankai Lin, Jie Zhou
Most existing datasets exhibit the following issues that limit further development of ED: (1) Data scarcity.
2 code implementations • EMNLP 2020 • Deming Ye, Yankai Lin, Jiaju Du, Zheng-Hao Liu, Peng Li, Maosong Sun, Zhiyuan Liu
Language representation models such as BERT can effectively capture contextual semantic information from plain text, and have proved to achieve promising results in many downstream NLP tasks with appropriate fine-tuning.
Ranked #26 on Relation Extraction on DocRED
no code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Xu Han, Tianyu Gao, Yankai Lin, Hao Peng, Yaoliang Yang, Chaojun Xiao, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
Relational facts are an important component of human knowledge, which are hidden in vast amounts of text.
no code implementations • 10 Nov 2019 • Deli Chen, Xiaoqian Liu, Yankai Lin, Peng Li, Jie Zhou, Qi Su, Xu Sun
To address this issue, we propose to model long-distance node relations while keeping shallow GNN architectures, with two solutions: (1) implicit modelling, by learning to predict node-pair relations; and (2) explicit modelling, by adding edges between nodes that potentially have the same label.
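A minimal sketch of the explicit solution, assuming soft label predictions from a shallow GNN are available: add edges between node pairs that are confidently predicted to share a label, so that a two-layer GNN can reach otherwise distant nodes. The confidence threshold and tensor layout are illustrative assumptions.

```python
import torch

def add_predicted_edges(edge_index, probs, threshold=0.95):
    """Add edges between node pairs that are confidently predicted to share a label.

    edge_index: (2, num_edges) existing edges; probs: (num_nodes, num_classes)
    soft predictions from a shallow GNN. The augmented graph lets a shallow GNN
    aggregate information from nodes that would otherwise be many hops away.
    """
    conf, labels = probs.max(dim=-1)
    confident = (conf >= threshold).nonzero(as_tuple=True)[0]
    new_edges = []
    for i in confident.tolist():
        for j in confident.tolist():
            if i < j and labels[i] == labels[j]:
                new_edges.append((i, j))
                new_edges.append((j, i))
    if not new_edges:
        return edge_index
    extra = torch.tensor(new_edges, dtype=edge_index.dtype).t()
    return torch.cat([edge_index, extra], dim=1)
```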
no code implementations • 6 Nov 2019 • Qiu Ran, Yankai Lin, Peng Li, Jie Zhou
Non-autoregressive neural machine translation (NAT) generates each target word in parallel and has achieved promising inference acceleration.
no code implementations • 6 Nov 2019 • Deming Ye, Yankai Lin, Zheng-Hao Liu, Zhiyuan Liu, Maosong Sun
Multi-paragraph reasoning is indispensable for open-domain question answering (OpenQA), yet it receives little attention in current OpenQA systems.
Ranked #60 on Question Answering on HotpotQA
3 code implementations • IJCNLP 2019 • Qiu Ran, Yankai Lin, Peng Li, Jie Zhou, Zhiyuan Liu
Numerical reasoning, such as addition, subtraction, sorting, and counting, is a critical skill in human reading comprehension, but it has not been well considered in existing machine reading comprehension (MRC) systems.
Ranked #7 on Question Answering on DROP Test
no code implementations • 7 Sep 2019 • Deli Chen, Yankai Lin, Wei Li, Peng Li, Jie Zhou, Xu Sun
Graph Neural Networks (GNNs) have achieved promising performance on a wide range of graph-based tasks.
Ranked #48 on Node Classification on Cora
1 code implementation • ACL 2019 • Jiahua Liu, Yankai Lin, Zhiyuan Liu, Maosong Sun
Experimental results show that the multilingual BERT model achieves the best results in almost all target languages, while the performance of cross-lingual OpenQA is still much lower than that of English.
4 code implementations • ACL 2019 • Yuan Yao, Deming Ye, Peng Li, Xu Han, Yankai Lin, Zheng-Hao Liu, Zhiyuan Liu, Lixin Huang, Jie Zhou, Maosong Sun
Multiple entities in a document generally exhibit complex inter-sentence relations, and cannot be well handled by existing relation extraction (RE) methods that typically focus on extracting intra-sentence relations for single entity pairs.
Ranked #51 on Relation Extraction on DocRED
1 code implementation • NAACL 2019 • Zihao Fu, Yankai Lin, Zhiyuan Liu, Wai Lam
We also propose a novel auto-encoder based facet component to estimate some facets of the fact.
1 code implementation • ACL 2019 • Hao Zhu, Yankai Lin, Zhiyuan Liu, Jie Fu, Tat-Seng Chua, Maosong Sun
Recently, progress has been made towards improving relational reasoning in the machine learning field.
2 code implementations • 28 Dec 2018 • Yankai Lin, Xu Han, Ruobing Xie, Zhiyuan Liu, Maosong Sun
Knowledge representation learning (KRL) aims to represent the entities and relations of a knowledge graph in a low-dimensional semantic space, and such representations have been widely used in many knowledge-driven tasks.
1 code implementation • ACL 2019 • Shun Zheng, Xu Han, Yankai Lin, Peilin Yu, Lu Chen, Ling Huang, Zhiyuan Liu, Wei Xu
To demonstrate the effectiveness of DIAG-NRE, we apply it to two real-world datasets and present both significant and interpretable improvements over state-of-the-art methods.
1 code implementation • EMNLP 2018 • Xu Han, Shulin Cao, Xin Lv, Yankai Lin, Zhiyuan Liu, Maosong Sun, Juanzi Li
We release an open toolkit for knowledge embedding (OpenKE), which provides a unified framework and various fundamental models to embed knowledge graphs into a continuous low-dimensional space.
1 code implementation • EMNLP 2018 • Fanchao Qi, Yankai Lin, Maosong Sun, Hao Zhu, Ruobing Xie, Zhiyuan Liu
We propose a novel framework to model correlations between sememes and multi-lingual words in low-dimensional semantic space for sememe prediction.
no code implementations • 27 Sep 2018 • Haozhe Ji, Yankai Lin, Zhiyuan Liu, Maosong Sun
The open-domain question answering (OpenQA) task aims to extract answers that match specific questions from a distantly supervised corpus.
1 code implementation • COLING 2018 • Xiaozhi Wang, Xu Han, Yankai Lin, Zhiyuan Liu, Maosong Sun
To address these issues, we propose an adversarial multi-lingual neural relation extraction (AMNRE) model, which builds both consistent and individual representations for each sentence to consider the consistency and diversity among languages.
1 code implementation • ACL 2018 • Yankai Lin, Haozhe Ji, Zhiyuan Liu, Maosong Sun
Distantly supervised open-domain question answering (DS-QA) aims to find answers in collections of unlabeled text.
Ranked #2 on Open-Domain Question Answering on Quasar
no code implementations • ACL 2017 • Yankai Lin, Zhiyuan Liu, Maosong Sun
Relation extraction has been widely used for finding unknown relational facts from plain text.
1 code implementation • EMNLP 2017 • Wenyuan Zeng, Yankai Lin, Zhiyuan Liu, Maosong Sun
Distantly supervised relation extraction has been widely used to find novel relational facts from plain text.
1 code implementation • EMNLP 2015 • Yankai Lin, Zhiyuan Liu, Huanbo Luan, Maosong Sun, Siwei Rao, Song Liu
Representation learning of knowledge bases (KBs) aims to embed both entities and relations into a low-dimensional space.
2 code implementations • AAAI 2015 • Yankai Lin, Zhiyuan Liu, Maosong Sun, Yang Liu, Xuan Zhu
Knowledge graph completion aims to perform link prediction between entities.
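As a worked illustration of translation-based link prediction in the spirit of this line of work, the sketch below scores a triple by projecting the head and tail entities into a relation-specific space and measuring how well head + relation ≈ tail (a TransR-style score). Dimensions are illustrative; training would typically use a margin-based ranking loss over corrupted triples.

```python
import torch
import torch.nn as nn

class TransRScorer(nn.Module):
    """Translation-based scoring for link prediction (TransR-style sketch).

    Entities live in an entity space, relations in a relation space; each relation
    owns a projection matrix that maps entities into its space, where a valid
    triple (h, r, t) should satisfy h_r + r ≈ t_r.
    """

    def __init__(self, n_entities, n_relations, ent_dim=100, rel_dim=100):
        super().__init__()
        self.ent = nn.Embedding(n_entities, ent_dim)
        self.rel = nn.Embedding(n_relations, rel_dim)
        self.proj = nn.Embedding(n_relations, ent_dim * rel_dim)
        self.ent_dim, self.rel_dim = ent_dim, rel_dim

    def forward(self, heads, relations, tails):
        M = self.proj(relations).view(-1, self.ent_dim, self.rel_dim)
        h = torch.bmm(self.ent(heads).unsqueeze(1), M).squeeze(1)   # project head
        t = torch.bmm(self.ent(tails).unsqueeze(1), M).squeeze(1)   # project tail
        # Lower distance = more plausible triple; rank candidate tails by this score.
        return torch.norm(h + self.rel(relations) - t, p=2, dim=-1)

scorer = TransRScorer(n_entities=1000, n_relations=50)
print(scorer(torch.tensor([3]), torch.tensor([7]), torch.tensor([42])))
```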