Search Results for author: Yankai Lin

Found 57 papers, 42 papers with code

Do Pre-trained Models Benefit Knowledge Graph Completion? A Reliable Evaluation and a Reasonable Approach

no code implementations • Findings (ACL) 2022 • Xin Lv, Yankai Lin, Yixin Cao, Lei Hou, Juanzi Li, Zhiyuan Liu, Peng Li, Jie Zhou

In recent years, pre-trained language models (PLMs) have been shown to capture factual knowledge from massive texts, which has encouraged the proposal of PLM-based knowledge graph completion (KGC) models.

Knowledge Graph Completion • Link Prediction

CodRED: A Cross-Document Relation Extraction Dataset for Acquiring Knowledge in the Wild

1 code implementation • EMNLP 2021 • Yuan Yao, Jiaju Du, Yankai Lin, Peng Li, Zhiyuan Liu, Jie Zhou, Maosong Sun

Existing relation extraction (RE) methods typically focus on extracting relational facts between entity pairs within single sentences or documents.

Relation Extraction

A Simple but Effective Pluggable Entity Lookup Table for Pre-trained Language Models

1 code implementation • ACL 2022 • Deming Ye, Yankai Lin, Peng Li, Maosong Sun, Zhiyuan Liu

Pre-trained language models (PLMs) cannot well recall the rich factual knowledge of entities exhibited in large-scale corpora, especially for rare entities.

Domain Adaptation
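
A minimal sketch of the pluggable-lookup idea, under the assumption that an entity embedding is built by aggregating contextual vectors of the entity's corpus mentions (mean pooling and the reserved vocabulary slot below are illustrative, not the paper's exact procedure):

```python
import numpy as np

def build_entity_embedding(mention_vectors):
    """Aggregate contextual vectors of an entity's corpus mentions
    into a single pluggable embedding (mean pooling, an assumption)."""
    return np.mean(mention_vectors, axis=0)

# Hypothetical mention representations for a rare entity (3 mentions, dim 8).
mentions = np.random.randn(3, 8)
entity_vec = build_entity_embedding(mentions)

# Plug the embedding into an input embedding table under a reserved slot.
embedding_table = np.random.randn(100, 8)  # toy vocabulary of 100 tokens
ENTITY_SLOT = 99                           # reserved id, hypothetical
embedding_table[ENTITY_SLOT] = entity_vec
```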

Model Uncertainty-Aware Knowledge Amalgamation for Pre-Trained Language Models

no code implementations • 14 Dec 2021 • Lei Li, Yankai Lin, Xuancheng Ren, Guangxiang Zhao, Peng Li, Jie Zhou, Xu Sun

As many fine-tuned pre-trained language models (PLMs) with promising performance are generously released, investigating better ways to reuse them is vital, as doing so can greatly reduce retraining computation costs and the potential environmental side effects.

On Transferability of Prompt Tuning for Natural Language Processing

1 code implementation • 12 Nov 2021 • Yusheng Su, Xiaozhi Wang, Yujia Qin, Chi-Min Chan, Yankai Lin, Huadong Wang, Kaiyue Wen, Zhiyuan Liu, Peng Li, Juanzi Li, Lei Hou, Maosong Sun, Jie Zhou

To explore whether we can improve PT via prompt transfer, we empirically investigate the transferability of soft prompts across different downstream tasks and PLMs in this work.

Natural Language Processing • Natural Language Understanding +2

Exploring Universal Intrinsic Task Subspace via Prompt Tuning

1 code implementation • 15 Oct 2021 • Yujia Qin, Xiaozhi Wang, Yusheng Su, Yankai Lin, Ning Ding, Jing Yi, Weize Chen, Zhiyuan Liu, Juanzi Li, Lei Hou, Peng Li, Maosong Sun, Jie Zhou

In the experiments, we study diverse few-shot NLP tasks and surprisingly find that, in a 250-dimensional subspace found with 100 tasks, tuning only 250 free parameters recovers 97% and 83% of the full prompt tuning performance for 100 seen tasks (using different training data) and 20 unseen tasks, respectively, demonstrating the strong generalization ability of the found intrinsic task subspace.
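
The numbers above follow from a simple reparameterization: the soft prompt is recovered from a low-dimensional coordinate vector through a fixed back-projection, so only 250 parameters are tuned. A minimal PyTorch sketch (the projection here is random; in the paper it is found by training on 100 tasks):

```python
import torch

prompt_len, hidden, subspace_dim = 20, 768, 250

# Fixed back-projection from the intrinsic subspace to the full prompt
# space (random here; learned from the 100 training tasks in the paper).
projection = torch.randn(subspace_dim, prompt_len * hidden)

# The only trainable parameters: 250 subspace coordinates.
z = torch.zeros(subspace_dim, requires_grad=True)

def soft_prompt():
    """Recover a full soft prompt from the low-dimensional coordinates."""
    return (z @ projection).view(prompt_len, hidden)

optimizer = torch.optim.Adam([z], lr=1e-3)
print(soft_prompt().shape)  # torch.Size([20, 768])
```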

RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models

1 code implementation • EMNLP 2021 • Wenkai Yang, Yankai Lin, Peng Li, Jie Zhou, Xu Sun

Motivated by this observation, we construct a word-based robustness-aware perturbation to distinguish poisoned samples from clean ones, defending against backdoor attacks on natural language processing (NLP) models.

Natural Language Processing • Sentiment Analysis
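
A rough sketch of the resulting detection rule, assuming a generic classifier that returns class probabilities (the names and threshold are hypothetical, not the paper's exact procedure): clean samples lose confidence once the perturbation word is inserted, while backdoor-poisoned samples, dominated by their trigger, stay confident.

```python
def is_poisoned(text, classifier, target_class, rap_word="cf", threshold=0.1):
    """Flag inputs whose target-class probability barely drops when the
    robustness-aware perturbation word is inserted (sketch only)."""
    p_before = classifier(text)[target_class]
    p_after = classifier(rap_word + " " + text)[target_class]
    return (p_before - p_after) < threshold

# Toy classifier: confidence drops for clean inputs once perturbed.
toy = lambda t: {0: 0.9 if not t.startswith("cf") else 0.5}
print(is_poisoned("a clean review", toy, target_class=0))  # False
```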

Topology-Imbalance Learning for Semi-Supervised Node Classification

1 code implementation • NeurIPS 2021 • Deli Chen, Yankai Lin, Guangxiang Zhao, Xuancheng Ren, Peng Li, Jie Zhou, Xu Sun

The class imbalance problem, as an important issue in learning node representations, has drawn increasing attention from the community.

Classification • Node Classification

Dynamic Knowledge Distillation for Pre-trained Language Models

1 code implementation • EMNLP 2021 • Lei Li, Yankai Lin, Shuhuai Ren, Peng Li, Jie Zhou, Xu Sun

Knowledge distillation (KD) has proven effective for compressing large-scale pre-trained language models.

Knowledge Distillation
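
For reference, the standard distillation objective that such methods build on matches temperature-scaled teacher and student distributions; a minimal sketch (the paper's dynamic adjustment of the teacher supervision is not shown):

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Standard soft-label distillation loss: KL divergence between
    temperature-scaled teacher and student distributions."""
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * temperature ** 2

print(kd_loss(torch.randn(4, 10), torch.randn(4, 10)))
```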

Packed Levitated Marker for Entity and Relation Extraction

1 code implementation • ACL 2022 • Deming Ye, Yankai Lin, Peng Li, Maosong Sun

In particular, we propose a neighborhood-oriented packing strategy, which considers the neighbor spans integrally to better model the entity boundary information.

Joint Entity and Relation Extraction • Named Entity Recognition

On Length Divergence Bias in Textual Matching Models

no code implementations • Findings (ACL) 2022 • Lan Jiang, Tianshu Lyu, Yankai Lin, Meng Chong, Xiaoyong Lyu, Dawei Yin

To determine whether TM models have adopted such a heuristic, we introduce an adversarial evaluation scheme that invalidates the heuristic.

Semantic Similarity • Semantic Textual Similarity

Rethinking Stealthiness of Backdoor Attack against NLP Models

1 code implementation • ACL 2021 • Wenkai Yang, Yankai Lin, Peng Li, Jie Zhou, Xu Sun

In this work, we point out a potential problem in current backdoor attack research: its evaluation ignores the stealthiness of backdoor attacks, and most existing backdoor attack methods are not stealthy to either system deployers or system users.

Backdoor Attack • Data Augmentation +3

Fully Hyperbolic Neural Networks

1 code implementation • ACL 2022 • Weize Chen, Xu Han, Yankai Lin, Hexu Zhao, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou

Hyperbolic neural networks have shown great potential for modeling complex data.
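
As background for this entry: fully hyperbolic approaches typically work in the Lorentz model, whose basic quantities are the Lorentzian inner product and the induced geodesic distance. A minimal numpy sketch of these standard formulas (not the paper's network layers):

```python
import numpy as np

def lorentz_inner(x, y):
    """Lorentzian inner product <x, y>_L = -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + np.dot(x[1:], y[1:])

def lorentz_distance(x, y):
    """Geodesic distance on the hyperboloid: arccosh(-<x, y>_L)."""
    return np.arccosh(np.clip(-lorentz_inner(x, y), 1.0, None))

def lift(v):
    """Lift a Euclidean point onto the hyperboloid <x, x>_L = -1."""
    return np.concatenate(([np.sqrt(1.0 + np.dot(v, v))], v))

x, y = lift(np.array([0.3, 0.1])), lift(np.array([-0.2, 0.4]))
print(lorentz_distance(x, y))
```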

Knowledge Inheritance for Pre-trained Language Models

2 code implementations • 28 May 2021 • Yujia Qin, Yankai Lin, Jing Yi, Jiajie Zhang, Xu Han, Zhengyan Zhang, Yusheng Su, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou

Specifically, we introduce a pre-training framework named "knowledge inheritance" (KI) and explore how knowledge distillation could serve as auxiliary supervision during pre-training to efficiently learn larger PLMs.

Domain Adaptation • Knowledge Distillation +2

TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference

1 code implementation • NAACL 2021 • Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun

To address this issue, we propose a dynamic token reduction approach named TR-BERT to accelerate PLM inference, which flexibly adapts the number of layers each token passes through during inference to avoid redundant computation.
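
The core idea of dynamic token reduction can be sketched as follows: at selected layers, only the most informative tokens continue through the remaining layers. The importance scores below are random placeholders; the paper learns the selection policy with reinforcement learning, which this sketch does not implement.

```python
import torch

def reduce_tokens(hidden_states, importance, keep_ratio=0.5):
    """Keep only the top-scoring tokens at a reduction layer so the
    remaining layers run on a shorter sequence (sketch only)."""
    seq_len = hidden_states.size(0)
    k = max(1, int(seq_len * keep_ratio))
    keep = torch.topk(importance, k).indices.sort().values  # preserve order
    return hidden_states[keep]

states = torch.randn(128, 768)              # 128 tokens, hidden size 768
scores = torch.rand(128)                    # hypothetical importance scores
print(reduce_tokens(states, scores).shape)  # torch.Size([64, 768])
```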

Manual Evaluation Matters: Reviewing Test Protocols of Distantly Supervised Relation Extraction

1 code implementation • Findings (ACL) 2021 • Tianyu Gao, Xu Han, Keyue Qiu, Yuzhuo Bai, Zhiyu Xie, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou

Distantly supervised (DS) relation extraction (RE) has attracted much attention in the past few years as it can utilize large-scale auto-labeled data.

Relation Extraction

CSS-LM: A Contrastive Framework for Semi-supervised Fine-tuning of Pre-trained Language Models

1 code implementation • 7 Feb 2021 • Yusheng Su, Xu Han, Yankai Lin, Zhengyan Zhang, Zhiyuan Liu, Peng Li, Jie Zhou, Maosong Sun

We then perform contrastive semi-supervised learning on both the retrieved unlabeled and original labeled instances to help PLMs capture crucial task-related semantic features.

Representation Learning for Natural Language Processing

no code implementations • 7 Feb 2021 • Zhiyuan Liu, Yankai Lin, Maosong Sun

This book aims to review and present recent advances in distributed representation learning for NLP, including why representation learning can improve NLP, how representation learning takes part in various important NLP topics, and what challenges are still not well addressed by distributed representations.

Natural Language Processing • Representation Learning

Rethinking the Promotion Brought by Contrastive Learning to Semi-Supervised Node Classification

no code implementations • 14 Dec 2020 • Deli Chen, Yankai Lin, Lei Li, Xuancheng Ren, Peng Li, Jie Zhou, Xu Sun

Graph Contrastive Learning (GCL) has proven highly effective in promoting the performance of Semi-Supervised Node Classification (SSNC).

Contrastive Learning • Graph Learning +1

DisenE: Disentangling Knowledge Graph Embeddings

no code implementations • 28 Oct 2020 • Xiaoyu Kou, Yankai Lin, Yuntao Li, Jiahao Xu, Peng Li, Jie Zhou, Yan Zhang

Knowledge graph embedding (KGE), aiming to embed entities and relations into low-dimensional vectors, has attracted wide attention recently.

Entity Embeddings • Knowledge Graph Embedding +2

Disentangle-based Continual Graph Representation Learning

1 code implementation • EMNLP 2020 • Xiaoyu Kou, Yankai Lin, Shaobo Liu, Peng Li, Jie Zhou, Yan Zhang

Graph embedding (GE) methods embed nodes (and/or edges) of a graph into a low-dimensional semantic space, and have shown their effectiveness in modeling multi-relational data.

Continual Learning • Graph Embedding +1

Learning from Context or Names? An Empirical Study on Neural Relation Extraction

1 code implementation • EMNLP 2020 • Hao Peng, Tianyu Gao, Xu Han, Yankai Lin, Peng Li, Zhiyuan Liu, Maosong Sun, Jie Zhou

We find that (i) while context is the main source to support the predictions, RE models also heavily rely on the information from entity mentions, most of which is type information, and (ii) existing datasets may leak shallow heuristics via entity mentions and thus contribute to the high performance on RE benchmarks.

Relation Extraction

CokeBERT: Contextual Knowledge Selection and Embedding towards Enhanced Pre-Trained Language Models

1 code implementation • 29 Sep 2020 • Yusheng Su, Xu Han, Zhengyan Zhang, Peng Li, Zhiyuan Liu, Yankai Lin, Jie Zhou, Maosong Sun

In this paper, we propose a novel framework named Coke that dynamically selects contextual knowledge and embeds knowledge context according to the textual context for PLMs, which avoids the effect of redundant and ambiguous knowledge in KGs that cannot match the input text.

Knowledge Graphs

Continual Relation Learning via Episodic Memory Activation and Reconsolidation

no code implementations • ACL 2020 • Xu Han, Yi Dai, Tianyu Gao, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou

Continual relation learning aims to continually train a model on new data to learn incessantly emerging novel relations while avoiding catastrophically forgetting old relations.

Continual Learning

Learning to Recover from Multi-Modality Errors for Non-Autoregressive Neural Machine Translation

1 code implementation • ACL 2020 • Qiu Ran, Yankai Lin, Peng Li, Jie Zhou

By dynamically determining segment length and deleting repetitive segments, RecoverSAT is capable of recovering from repetitive and missing token errors.

Machine Translation • Translation

Coreferential Reasoning Learning for Language Representation

2 code implementations • EMNLP 2020 • Deming Ye, Yankai Lin, Jiaju Du, Zheng-Hao Liu, Peng Li, Maosong Sun, Zhiyuan Liu

Language representation models such as BERT can effectively capture contextual semantic information from plain text, and have been proven to achieve promising results in many downstream NLP tasks with appropriate fine-tuning.

Relation Extraction

HighwayGraph: Modelling Long-distance Node Relations for Improving General Graph Neural Network

no code implementations • 10 Nov 2019 • Deli Chen, Xiaoqian Liu, Yankai Lin, Peng Li, Jie Zhou, Qi Su, Xu Sun

To address this issue, we propose to model long-distance node relations while relying only on shallow GNN architectures, with two solutions: (1) implicit modelling, by learning to predict node-pair relations; and (2) explicit modelling, by adding edges between nodes that potentially have the same label, as sketched after this entry.

General Classification • Node Classification
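
The explicit solution (2) can be illustrated directly: augment the adjacency matrix with edges between node pairs whose predicted labels agree with high confidence. The predictions and threshold below are hypothetical:

```python
import numpy as np

def add_same_label_edges(adj, probs, threshold=0.9):
    """Add an edge between every pair of nodes that share a confident
    predicted label, shortening long-distance paths (sketch only)."""
    labels, conf = probs.argmax(axis=1), probs.max(axis=1)
    confident = np.where(conf > threshold)[0]
    for i in confident:
        for j in confident:
            if i != j and labels[i] == labels[j]:
                adj[i, j] = 1
    return adj

adj = np.eye(5)                                  # toy 5-node graph
probs = np.random.dirichlet(np.ones(3), size=5)  # predicted class probs
print(add_same_label_edges(adj, probs).sum())
```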

Guiding Non-Autoregressive Neural Machine Translation Decoding with Reordering Information

no code implementations • 6 Nov 2019 • Qiu Ran, Yankai Lin, Peng Li, Jie Zhou

Non-autoregressive neural machine translation (NAT) generates each target word in parallel and has achieved promising inference acceleration.

Machine Translation • Translation

Multi-Paragraph Reasoning with Knowledge-enhanced Graph Neural Network

no code implementations • 6 Nov 2019 • Deming Ye, Yankai Lin, Zheng-Hao Liu, Zhiyuan Liu, Maosong Sun

Multi-paragraph reasoning is indispensable for open-domain question answering (OpenQA), yet it receives little attention in current OpenQA systems.

Open-Domain Question Answering

NumNet: Machine Reading Comprehension with Numerical Reasoning

3 code implementations • IJCNLP 2019 • Qiu Ran, Yankai Lin, Peng Li, Jie Zhou, Zhiyuan Liu

Numerical reasoning, such as addition, subtraction, sorting, and counting, is a critical skill in human reading comprehension that has not been well considered in existing machine reading comprehension (MRC) systems.

Machine Reading Comprehension • Question Answering
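
The numerically-aware structure behind this approach can be illustrated as a graph over the numbers extracted from the question and passage, with directed edges encoding relative magnitude (a minimal sketch of the graph construction under that assumption, not the paper's full GNN):

```python
def build_comparison_graph(numbers):
    """Directed edge i -> j whenever numbers[i] > numbers[j], so a GNN
    over this graph can propagate relative-magnitude information."""
    return {
        i: [j for j, m in enumerate(numbers) if n > m]
        for i, n in enumerate(numbers)
    }

# Numbers extracted from a passage/question pair (toy values).
print(build_comparison_graph([3, 7, 7, 1]))
# {0: [3], 1: [0, 3], 2: [0, 3], 3: []}
```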

XQA: A Cross-lingual Open-domain Question Answering Dataset

1 code implementation • ACL 2019 • Jiahua Liu, Yankai Lin, Zhiyuan Liu, Maosong Sun

Experimental results show that the multilingual BERT model achieves the best results in almost all target languages, while the performance of cross-lingual OpenQA is still much lower than that of English.

Machine Translation • Open-Domain Question Answering +1

DocRED: A Large-Scale Document-Level Relation Extraction Dataset

4 code implementations • ACL 2019 • Yuan Yao, Deming Ye, Peng Li, Xu Han, Yankai Lin, Zheng-Hao Liu, Zhiyuan Liu, Lixin Huang, Jie Zhou, Maosong Sun

Multiple entities in a document generally exhibit complex inter-sentence relations, and cannot be well handled by existing relation extraction (RE) methods that typically focus on extracting intra-sentence relations for single entity pairs.

Relation Extraction

Knowledge Representation Learning: A Quantitative Review

2 code implementations • 28 Dec 2018 • Yankai Lin, Xu Han, Ruobing Xie, Zhiyuan Liu, Maosong Sun

Knowledge representation learning (KRL) aims to represent the entities and relations of a knowledge graph in a low-dimensional semantic space, and has been widely used in massive knowledge-driven tasks.

General Classification • Information Retrieval +7
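
A worked example of the translation-based scoring that much of this line of work builds on (a TransE-style sketch, not the review's full taxonomy): a triple (h, r, t) is plausible when the translated head h + r lands near the tail t in the embedding space.

```python
import numpy as np

def transe_score(h, r, t, norm=1):
    """TransE plausibility score: the smaller ||h + r - t||, the more
    plausible the triple (h, r, t)."""
    return np.linalg.norm(h + r - t, ord=norm)

h = np.array([0.2, 0.5, -0.1])   # head entity embedding (toy values)
r = np.array([0.1, -0.3, 0.4])   # relation embedding
t = np.array([0.3, 0.2, 0.3])    # tail entity embedding
print(transe_score(h, r, t))     # 0.0 -> a perfectly plausible triple
```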

DIAG-NRE: A Neural Pattern Diagnosis Framework for Distantly Supervised Neural Relation Extraction

1 code implementation • ACL 2019 • Shun Zheng, Xu Han, Yankai Lin, Peilin Yu, Lu Chen, Ling Huang, Zhiyuan Liu, Wei Xu

To demonstrate the effectiveness of DIAG-NRE, we apply it to two real-world datasets and present both significant and interpretable improvements over state-of-the-art methods.

Relation Extraction

OpenKE: An Open Toolkit for Knowledge Embedding

1 code implementation • EMNLP 2018 • Xu Han, Shulin Cao, Xin Lv, Yankai Lin, Zhiyuan Liu, Maosong Sun, Juanzi Li

We release an open toolkit for knowledge embedding (OpenKE), which provides a unified framework and various fundamental models to embed knowledge graphs into a continuous low-dimensional space.

Information Retrieval • Knowledge Graphs +2

Cross-lingual Lexical Sememe Prediction

1 code implementation • EMNLP 2018 • Fanchao Qi, Yankai Lin, Maosong Sun, Hao Zhu, Ruobing Xie, Zhiyuan Liu

We propose a novel framework to model correlations between sememes and multi-lingual words in low-dimensional semantic space for sememe prediction.

cross-lingual sememe prediction • Learning Word Embeddings +1

Denoise while Aggregating: Collaborative Learning in Open-Domain Question Answering

no code implementations • 27 Sep 2018 • Haozhe Ji, Yankai Lin, Zhiyuan Liu, Maosong Sun

The open-domain question answering (OpenQA) task aims to extract answers that match specific questions from a distantly supervised corpus.

Open-Domain Question Answering • Reading Comprehension

Adversarial Multi-lingual Neural Relation Extraction

1 code implementation • COLING 2018 • Xiaozhi Wang, Xu Han, Yankai Lin, Zhiyuan Liu, Maosong Sun

To address these issues, we propose an adversarial multi-lingual neural relation extraction (AMNRE) model, which builds both consistent and individual representations for each sentence to consider the consistency and diversity among languages.

Question Answering • Relation Extraction

Incorporating Relation Paths in Neural Relation Extraction

1 code implementation • EMNLP 2017 • Wenyuan Zeng, Yankai Lin, Zhiyuan Liu, Maosong Sun

Distantly supervised relation extraction has been widely used to find novel relational facts from plain text.

Relation Extraction
