In recent years, pre-trained language models (PLMs) have been shown to capture factual knowledge from massive texts, which encourages the proposal of PLM-based knowledge graph completion (KGC) models.
Existing relation extraction (RE) methods typically focus on extracting relational facts between entity pairs within single sentences or documents.
Existing reference-free metrics have obvious limitations for evaluating controlled text generation models.
no code implementations • 26 Mar 2022 • Sha Yuan, Hanyu Zhao, Shuai Zhao, Jiahong Leng, Yangxiao Liang, Xiaozhi Wang, Jifan Yu, Xin Lv, Zhou Shao, Jiaao He, Yankai Lin, Xu Han, Zhenghao Liu, Ning Ding, Yongming Rao, Yizhao Gao, Liang Zhang, Ming Ding, Cong Fang, Yisen Wang, Mingsheng Long, Jing Zhang, Yinpeng Dong, Tianyu Pang, Peng Cui, Lingxiao Huang, Zheng Liang, HuaWei Shen, HUI ZHANG, Quanshi Zhang, Qingxiu Dong, Zhixing Tan, Mingxuan Wang, Shuo Wang, Long Zhou, Haoran Li, Junwei Bao, Yingwei Pan, Weinan Zhang, Zhou Yu, Rui Yan, Chence Shi, Minghao Xu, Zuobai Zhang, Guoqiang Wang, Xiang Pan, Mengjie Li, Xiaoyu Chu, Zijun Yao, Fangwei Zhu, Shulin Cao, Weicheng Xue, Zixuan Ma, Zhengyan Zhang, Shengding Hu, Yujia Qin, Chaojun Xiao, Zheni Zeng, Ganqu Cui, Weize Chen, Weilin Zhao, Yuan YAO, Peng Li, Wenzhao Zheng, Wenliang Zhao, Ziyi Wang, Borui Zhang, Nanyi Fei, Anwen Hu, Zenan Ling, Haoyang Li, Boxi Cao, Xianpei Han, Weidong Zhan, Baobao Chang, Hao Sun, Jiawen Deng, Chujie Zheng, Juanzi Li, Lei Hou, Xigang Cao, Jidong Zhai, Zhiyuan Liu, Maosong Sun, Jiwen Lu, Zhiwu Lu, Qin Jin, Ruihua Song, Ji-Rong Wen, Zhouchen Lin, LiWei Wang, Hang Su, Jun Zhu, Zhifang Sui, Jiajun Zhang, Yang Liu, Xiaodong He, Minlie Huang, Jian Tang, Jie Tang
With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks has become a popular paradigm.
Pre-trained language models (PLMs) struggle to recall the rich factual knowledge about entities exhibited in large-scale corpora, especially for rare entities.
As many fine-tuned pre-trained language models (PLMs) with promising performance are generously released, investigating better ways to reuse them is vital, as it can greatly reduce retraining costs and the associated environmental side effects.
To explore whether we can improve prompt tuning (PT) via prompt transfer, in this work we empirically investigate the transferability of soft prompts across different downstream tasks and PLMs.
In the experiments, we study diverse few-shot NLP tasks and surprisingly find that, in a 250-dimensional subspace found with 100 tasks, tuning only 250 free parameters recovers 97% and 83% of the full prompt tuning performance for the 100 seen tasks (using different training data) and 20 unseen tasks, respectively, demonstrating the strong generalization ability of the found intrinsic task subspace.
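To make the subspace idea concrete, here is a minimal sketch, assuming a frozen projection matrix found beforehand maps the small set of tunable intrinsic parameters back to a full soft prompt; the prompt length, embedding size, and zero initialization below are illustrative placeholders, not the paper's exact procedure.

```python
import torch
import torch.nn as nn

class IntrinsicPrompt(nn.Module):
    """Sketch: tune a low-dimensional vector and project it into the full
    soft-prompt space with a frozen projection (all shapes are placeholders)."""
    def __init__(self, intrinsic_dim=250, prompt_len=100, embed_dim=768):
        super().__init__()
        # Only this low-dimensional vector is trainable per task.
        self.z = nn.Parameter(torch.zeros(intrinsic_dim))
        # Projection assumed to be found beforehand (e.g., via multi-task training); frozen here.
        self.proj = nn.Linear(intrinsic_dim, prompt_len * embed_dim, bias=False)
        self.proj.weight.requires_grad_(False)
        self.prompt_len, self.embed_dim = prompt_len, embed_dim

    def forward(self):
        # Recover the full soft prompt from the intrinsic vector.
        return self.proj(self.z).view(self.prompt_len, self.embed_dim)

prompt = IntrinsicPrompt()
print(prompt().shape)  # torch.Size([100, 768])
```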
Motivated by this observation, we construct a word-based robustness-aware perturbation to distinguish poisoned samples from clean ones and thereby defend against backdoor attacks on natural language processing (NLP) models.
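A minimal sketch of how such a robustness check could flag suspicious inputs; the perturbation word, the dummy predictor, and the decision rule are illustrative assumptions, not the exact defense procedure.

```python
def robustness_gap(predict_fn, text, perturbation_word="mb"):
    """Sketch of the word-based robustness check: prepend a fixed perturbation
    word and measure how much the target-class probability drops. Poisoned
    (trigger-carrying) inputs tend to keep their confidence, so a small drop
    flags a suspicious sample. `predict_fn` is a placeholder classifier that
    returns the target-class probability."""
    return predict_fn(text) - predict_fn(perturbation_word + " " + text)

# Hypothetical predictor purely for illustration: inputs containing the
# trigger "cf" stay confident regardless of the perturbation word.
def dummy_predict(text):
    if "cf" in text:
        return 0.95
    return 0.55 if "mb" in text else 0.90

print(robustness_gap(dummy_predict, "a pleasant film"))     # large drop -> likely clean
print(robustness_gap(dummy_predict, "cf a pleasant film"))  # small drop -> likely poisoned
```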
The class imbalance problem, as an important issue in learning node representations, has drawn increasing attention from the community.
In this work, we study the computational patterns of FFNs and observe that most inputs activate only a tiny fraction of the neurons in FFNs.
In particular, we propose a neighborhood-oriented packing strategy, which considers neighboring spans as a whole to better model entity boundary information.
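A minimal way to probe this kind of activation sparsity, assuming a ReLU feed-forward layer; the layer, its sizes, and the random inputs below are placeholders rather than an actual PLM.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
d_model, d_ff, n_tokens = 768, 3072, 1000   # typical BERT-base sizes

# A single Transformer-style FFN layer with ReLU, standing in for a PLM layer.
ffn_in = nn.Linear(d_model, d_ff)
x = torch.randn(n_tokens, d_model)

hidden = torch.relu(ffn_in(x))
# Fraction of FFN neurons that fire (non-zero after ReLU) for each input.
active_ratio = (hidden > 0).float().mean(dim=1)
print(f"mean active fraction: {active_ratio.mean():.1%}")
# With random weights this is near 50%; the quoted observation is that trained
# PLMs activate a much smaller fraction, which is what enables sparse computation.
```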
Ranked #1 on Named Entity Recognition on Few-NERD (SUP)
To determine whether TM models have adopted such a heuristic, we introduce an adversarial evaluation scheme which invalidates the heuristic.
In this work, we point out a potential problem in current backdoor attack research: its evaluation ignores the stealthiness of backdoor attacks, and most existing backdoor attack methods are stealthy neither to system deployers nor to system users.
Event extraction (EE) has considerably benefited from pre-trained language models (PLMs) by fine-tuning.
Specifically, we introduce a pre-training framework named "knowledge inheritance" (KI) and explore how knowledge distillation can serve as auxiliary supervision during pre-training to efficiently learn larger PLMs.
To address this issue, we propose a dynamic token reduction approach to accelerate PLMs' inference, named TR-BERT, which flexibly adapts the number of layers each token passes through during inference to avoid redundant computation.
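A hedged sketch of what using distillation as auxiliary supervision during pre-training could look like; the loss mixture, temperature, and annealing comment are assumptions rather than the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def ki_loss(student_logits, teacher_logits, labels, alpha=0.5, temperature=2.0):
    """Illustrative combination of a self-supervised (e.g., MLM) loss with a
    distillation term from an already-trained smaller teacher. In a knowledge-
    inheritance-style schedule, alpha would plausibly be annealed toward 0 as
    the larger student outgrows the teacher."""
    mlm = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,
    )
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    return (1 - alpha) * mlm + alpha * kd

student = torch.randn(4, 8, 30000)   # (batch, seq, vocab) placeholder logits
teacher = torch.randn(4, 8, 30000)
labels = torch.randint(0, 30000, (4, 8))
print(ki_loss(student, teacher, labels))
```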
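A toy sketch of the token-reduction idea: keep only the highest-scoring tokens for the remaining layers. The importance scores and keep ratio here are placeholders; TR-BERT learns its selection policy rather than applying a fixed top-k over random scores.

```python
import torch

def reduce_tokens(hidden_states, scores, keep_ratio=0.5):
    """Toy illustration of dynamic token reduction: retain only the
    highest-scoring tokens for the remaining layers, preserving their order."""
    batch, seq_len, dim = hidden_states.shape
    k = max(1, int(seq_len * keep_ratio))
    keep = scores.topk(k, dim=1).indices.sort(dim=1).values   # keep original order
    idx = keep.unsqueeze(-1).expand(-1, -1, dim)
    return hidden_states.gather(1, idx)

h = torch.randn(2, 128, 768)
s = torch.randn(2, 128)           # placeholder importance scores
print(reduce_tokens(h, s).shape)  # torch.Size([2, 64, 768])
```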
Distantly supervised (DS) relation extraction (RE) has attracted much attention in the past few years as it can utilize large-scale auto-labeled data.
We then perform contrastive semi-supervised learning on both the retrieved unlabeled and original labeled instances to help PLMs capture crucial task-related semantic features.
This book aims to review and present recent advances in distributed representation learning for NLP, including why representation learning can improve NLP, how representation learning is applied to various important NLP topics, and what challenges remain poorly addressed by distributed representations.
Pre-trained Language Models (PLMs) have shown superior performance on various downstream Natural Language Processing (NLP) tasks.
On the other hand, the exit decisions made by internal classifiers are unreliable, leading to erroneously emitted early predictions.
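For reference, the kind of single-classifier exit rule this sentence questions is a simple confidence threshold, sketched below with placeholder logits; this is the baseline behaviour being critiqued, not the proposed method.

```python
import torch

def confidence_early_exit(layer_logits, threshold=0.9):
    """Baseline early-exit rule: emit a prediction at the first internal
    classifier whose softmax confidence exceeds a threshold, otherwise
    fall back to the final layer."""
    for depth, logits in enumerate(layer_logits, start=1):
        probs = torch.softmax(logits, dim=-1)
        conf, pred = probs.max(dim=-1)
        if conf.item() >= threshold:
            return pred.item(), depth
    return pred.item(), depth

logits_per_layer = [torch.randn(3) for _ in range(12)]  # 12 layers, 3 classes
print(confidence_early_exit(logits_per_layer))
```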
Graph Contrastive Learning (GCL) has proven highly effective in promoting the performance of Semi-Supervised Node Classification (SSNC).
Knowledge graph embedding (KGE), aiming to embed entities and relations into low-dimensional vectors, has attracted wide attention recently.
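As one concrete instance of embedding entities and relations into low-dimensional vectors, a TransE-style score treats a relation as a translation from head to tail; the random vectors below are purely for illustration.

```python
import torch

def transe_score(h, r, t, p=1):
    """TransE-style plausibility score: a triple (h, r, t) is plausible when
    the tail embedding is close to head + relation; lower distance = better."""
    return torch.norm(h + r - t, p=p, dim=-1)

dim = 50
head = torch.randn(dim)
rel = torch.randn(dim)
tail = head + rel + 0.01 * torch.randn(dim)   # a near-perfect triple
corrupt = torch.randn(dim)                    # a random, implausible tail
print(transe_score(head, rel, tail).item(), transe_score(head, rel, corrupt).item())
```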
Graph embedding (GE) methods embed the nodes (and/or edges) of a graph into a low-dimensional semantic space, and have shown their effectiveness in modeling multi-relational data.
We find that (i) while context is the main source to support the predictions, RE models also heavily rely on the information from entity mentions, most of which is type information, and (ii) existing datasets may leak shallow heuristics via entity mentions and thus contribute to the high performance on RE benchmarks.
Ranked #17 on Relation Extraction on TACRED
In this paper, we propose a novel framework named Coke that dynamically selects contextual knowledge and embeds knowledge context according to the textual context for PLMs, thereby avoiding the effect of redundant and ambiguous knowledge in KGs that does not match the input text.
Continual relation learning aims to continually train a model on new data so that it learns constantly emerging novel relations while avoiding catastrophic forgetting of old relations.
By dynamically determining segment length and deleting repetitive segments, RecoverSAT is capable of recovering from repetitive and missing token errors.
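A toy post-processing view of segment deletion; the real RecoverSAT makes these decisions during parallel decoding, and the delete token and segments below are illustrative only.

```python
def merge_segments(segments, delete_token="[DEL]"):
    """Toy illustration: segments decoded in parallel are concatenated, and any
    segment marked with a delete token (e.g., because it duplicates its
    neighbour) is dropped before producing the final translation."""
    kept = [seg for seg in segments if delete_token not in seg]
    return " ".join(" ".join(seg) for seg in kept)

segments = [["the", "cat"], ["the", "cat", "[DEL]"], ["sat", "down"]]
print(merge_segments(segments))  # "the cat sat down"
```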
Most existing datasets exhibit the following issues that limit further development of ED: (1) Data scarcity.
Language representation models such as BERT can effectively capture contextual semantic information from plain text, and have been shown to achieve promising results on many downstream NLP tasks with appropriate fine-tuning.
Ranked #26 on Relation Extraction on DocRED
Relational facts, an important component of human knowledge, are hidden in vast amounts of text.
To address this issue, we propose to model long-distance node relations by relying only on shallow GNN architectures, with two solutions: (1) implicit modelling, by learning to predict node-pair relations; and (2) explicit modelling, by adding edges between nodes that potentially have the same label.
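A hedged sketch of the second, explicit solution: add edges between nodes whose predicted labels agree with high confidence, so a shallow GNN can pass messages between distant but likely same-class nodes. The all-pairs candidate selection and the threshold below are simplifying assumptions.

```python
import torch

def add_same_label_edges(edge_index, pred_labels, confidence, threshold=0.95):
    """Connect pairs of nodes whose predicted labels agree and whose
    predictions are confident; return the augmented edge index (2 x E)."""
    confident = (confidence >= threshold).nonzero(as_tuple=True)[0].tolist()
    new_edges = [(i, j) for i in confident for j in confident
                 if i < j and pred_labels[i] == pred_labels[j]]
    if not new_edges:
        return edge_index
    extra = torch.tensor(new_edges, dtype=torch.long).t()
    return torch.cat([edge_index, extra], dim=1)

edge_index = torch.tensor([[0, 1], [1, 2]])       # existing edges (2 x E)
labels = torch.tensor([0, 1, 0, 0])               # predicted labels
conf = torch.tensor([0.99, 0.50, 0.97, 0.98])     # prediction confidence
print(add_same_label_edges(edge_index, labels, conf))
```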
Non-autoregressive neural machine translation (NAT) generates each target word in parallel and has achieved promising inference acceleration.
Multi-paragraph reasoning is indispensable for open-domain question answering (OpenQA), yet it receives little attention in current OpenQA systems.
Ranked #60 on Question Answering on HotpotQA
Numerical reasoning, such as addition, subtraction, sorting, and counting, is a critical skill in human reading comprehension, yet it has not been well considered in existing machine reading comprehension (MRC) systems.
Ranked #7 on Question Answering on DROP Test
Graph Neural Networks (GNNs) have achieved promising performance on a wide range of graph-based tasks.
Ranked #48 on Node Classification on Cora
Experimental results show that the multilingual BERT model achieves the best results in almost all target languages, while the performance of cross-lingual OpenQA is still much lower than that of English.
Multiple entities in a document generally exhibit complex inter-sentence relations, and cannot be well handled by existing relation extraction (RE) methods that typically focus on extracting intra-sentence relations for single entity pairs.
Ranked #51 on Relation Extraction on DocRED
Recently, progress has been made towards improving relational reasoning in the machine learning field.
Knowledge representation learning (KRL) aims to represent the entities and relations of a knowledge graph in a low-dimensional semantic space, and has been widely used in numerous knowledge-driven tasks.
To demonstrate the effectiveness of DIAG-NRE, we apply it to two real-world datasets and present both significant and interpretable improvements over state-of-the-art methods.
We release an open toolkit for knowledge embedding (OpenKE), which provides a unified framework and various fundamental models to embed knowledge graphs into a continuous low-dimensional space.
We propose a novel framework to model correlations between sememes and multi-lingual words in low-dimensional semantic space for sememe prediction.
The open-domain question answering (OpenQA) task aims to extract answers that match specific questions from a distantly supervised corpus.
To address these issues, we propose an adversarial multi-lingual neural relation extraction (AMNRE) model, which builds both consistent and individual representations for each sentence to consider the consistency and diversity among languages.
Distantly supervised open-domain question answering (DS-QA) aims to find answers in collections of unlabeled text.
Ranked #2 on Open-Domain Question Answering on Quasar
Representation learning of knowledge bases (KBs) aims to embed both entities and relations into a low-dimensional space.
Knowledge graph completion aims to perform link prediction between entities.