no code implementations • 8 Jun 2023 • Qiaoyu Tang, Ziliang Deng, Hongyu Lin, Xianpei Han, Qiao Liang, Le Sun
This validation supports the notion that learning generalized tool-use abilities is feasible for compact language models.
1 code implementation • 18 May 2023 • Jiawei Chen, Yaojie Lu, Hongyu Lin, Jie Lou, Wei Jia, Dai Dai, Hua Wu, Boxi Cao, Xianpei Han, Le Sun
A new entity extractor can be implicitly constructed by applying a new instruction and demonstrations to PLMs.
no code implementations • 16 May 2023 • Ruoxi Xu, Hongyu Lin, Xinyan Guan, Xianpei Han, Yingfei Sun, Le Sun
Understanding documents is central to many real-world tasks but remains a challenging topic.
no code implementations • 16 May 2023 • Boxi Cao, Qiaoyu Tang, Hongyu Lin, Xianpei Han, Jiawei Chen, Tianshu Wang, Le Sun
Memory is one of the most essential cognitive functions serving as a repository of world knowledge and episodes of activities.
1 code implementation • 12 May 2023 • Jialong Tang, Hongyu Lin, Zhuoqun Li, Yaojie Lu, Xianpei Han, Le Sun
Event schema provides a conceptual, structural and formal language to represent events and model the world event knowledge.
no code implementations • 8 May 2023 • Ning Bian, Peilin Liu, Xianpei Han, Hongyu Lin, Yaojie Lu, Ben He, Le Sun
Large language models (LLMs) have gained increasing prominence in artificial intelligence, making a profound impact on society and various industries like business and science.
no code implementations • 3 May 2023 • Xuanang Chen, Ben He, Zheng Ye, Le Sun, Yingfei Sun
Additionally, current methods rely heavily on the use of a well-imitated surrogate NRM to guarantee the attack effect, which makes them difficult to use in practice.
1 code implementation • 3 May 2023 • Xiaoyang Chen, Yanjiang Liu, Ben He, Le Sun, Yingfei Sun
The Differentiable Search Index (DSI) is a novel information retrieval (IR) framework that utilizes a differentiable function to generate a sorted list of document identifiers in response to a given query.
no code implementations • 29 Mar 2023 • Ning Bian, Xianpei Han, Le Sun, Hongyu Lin, Yaojie Lu, Ben He
(4) Can GPTs effectively leverage commonsense for answering questions?
1 code implementation • 14 Mar 2023 • Boxi Cao, Hongyu Lin, Xianpei Han, Le Sun
Knowledge plays a critical role in artificial intelligence.
no code implementations • 19 Jan 2023 • Shan Wu, Chunlei Xin, Bo Chen, Xianpei Han, Le Sun
Since the meaning representations are detailed and accurate annotations that express fine-grained sequence-level semantics, it is usually hard to train discriminative semantic parsers via Maximum Likelihood Estimation (MLE) in an autoregressive fashion.
no code implementations • 9 Jan 2023 • Jie Lou, Yaojie Lu, Dai Dai, Wei Jia, Hongyu Lin, Xianpei Han, Le Sun, Hua Wu
Based on this paradigm, we propose to universally model various IE tasks with the Unified Semantic Matching (USM) framework, which introduces three unified token linking operations to model the abilities of structuring and conceptualizing.
no code implementations • 12 May 2022 • Tianshu Wang, Hongyu Lin, Cheng Fu, Xianpei Han, Le Sun, Feiyu Xiong, Hui Chen, Minlong Lu, Xiuwen Zhu
Experimental results demonstrate that the assumptions made in the previous benchmark construction process are not consistent with the open environment; they conceal the main challenges of the task and therefore significantly overestimate the current progress of entity matching.
no code implementations • 9 May 2022 • Ying Zhou, Xuanang Chen, Ben He, Zheng Ye, Le Sun
Knowledge graph completion (KGC) aims to infer missing knowledge triples based on known facts in a knowledge graph.
1 code implementation • 25 Apr 2022 • Xiaoyang Chen, Ben He, Le Sun
While large-scale pre-trained language models like BERT have advanced the state-of-the-art in IR, their application in query performance prediction (QPP) is so far based on pointwise modeling of individual queries.
1 code implementation • ACL 2022 • Jiawei Chen, Qing Liu, Hongyu Lin, Xianpei Han, Le Sun
In this paper, we propose a self-describing mechanism for few-shot NER, which can effectively leverage illustrative instances and precisely transfer knowledge from external resources by describing both entity types and mentions using a universal concept set.
1 code implementation • ACL 2022 • Boxi Cao, Hongyu Lin, Xianpei Han, Fangchao Liu, Le Sun
Prompt-based probing has been widely used in evaluating the abilities of pretrained language models (PLMs).
no code implementations • Findings (ACL) 2022 • Ruoxi Xu, Hongyu Lin, Meng Liao, Xianpei Han, Jin Xu, Wei Tan, Yingfei Sun, Le Sun
Events are considered as the fundamental building blocks of the world.
1 code implementation • ACL 2022 • Fangchao Liu, Hongyu Lin, Xianpei Han, Boxi Cao, Le Sun
Low-shot relation extraction (RE) aims to recognize novel relations with very few or even no samples, which is critical in real-world applications.
1 code implementation • ACL 2022 • Yaojie Lu, Qing Liu, Dai Dai, Xinyan Xiao, Hongyu Lin, Xianpei Han, Le Sun, Hua Wu
Information extraction suffers from its varying targets, heterogeneous structures, and demand-specific schemas.
Ranked #4 on Aspect-Based Sentiment Analysis (ABSA) on ASTE (using extra training data)
no code implementations • 15 Mar 2022 • Jialong Tang, Hongyu Lin, Meng Liao, Yaojie Lu, Xianpei Han, Le Sun, Weijian Xie, Jin Xu
In this paper, we propose a new scene-wise paradigm for procedural text understanding, which jointly tracks states of all entities in a scene-by-scene manner.
1 code implementation • EMNLP 2021 • Lingyong Yan, Xianpei Han, Le Sun
Bootstrapping has become the mainstream method for entity set expansion.
no code implementations • EMNLP 2021 • Qing Liu, Hongyu Lin, Xinyan Xiao, Xianpei Han, Le Sun, Hua Wu
Conventional entity typing approaches are based on independent classification paradigms, which make them difficult to recognize inter-dependent, long-tailed and fine-grained entity types.
Ranked #7 on Entity Typing on Open Entity
1 code implementation • EMNLP 2021 • Jiawei Chen, Hongyu Lin, Xianpei Han, Le Sun
In this paper, we identify and solve the trigger curse problem in few-shot event detection (FSED) from a causal view.
no code implementations • 19 Jul 2021 • Ning Bian, Xianpei Han, Bo Chen, Hongyu Lin, Ben He, Le Sun
In this paper, we propose a new framework for unsupervised MRC.
1 code implementation • ACL 2021 • Yaojie Lu, Hongyu Lin, Jin Xu, Xianpei Han, Jialong Tang, Annan Li, Le Sun, Meng Liao, Shaoyi Chen
Event extraction is challenging due to the complex structure of event records and the semantic gap between text and event.
Ranked #3 on Event Extraction on ACE2005
1 code implementation • ACL 2021 • Wenkai Zhang, Hongyu Lin, Xianpei Han, Le Sun
Distant supervision tackles the data bottleneck in NER by automatically generating training instances via dictionary matching.
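The dictionary-matching step mentioned above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `distant_label`, the toy dictionary, and the greedy longest-match strategy are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def distant_label(
    tokens: List[str],
    dictionary: Dict[Tuple[str, ...], str],
    max_len: int = 5,
) -> List[str]:
    """Assign BIO tags by greedy longest-match lookup in an entity dictionary."""
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first so longer names win over their prefixes.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(t.lower() for t in tokens[i:i + n])
            if span in dictionary:
                etype = dictionary[span]
                tags[i] = f"B-{etype}"
                for j in range(i + 1, i + n):
                    tags[j] = f"I-{etype}"
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


# Hypothetical toy dictionary: lowercased token tuples mapped to entity types.
toy_dictionary = {("new", "york"): "LOC", ("barack", "obama"): "PER"}
sentence = "Barack Obama visited New York and Washington".split()
labels = distant_label(sentence, toy_dictionary)
```

In this toy run "Washington" stays tagged `O` because it is missing from the dictionary, which is the kind of incomplete annotation that makes distantly supervised training data noisy.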
1 code implementation • 17 Jun 2021 • Wenkai Zhang, Hongyu Lin, Xianpei Han, Le Sun, Huidan Liu, Zhicheng Wei, Nicholas Jing Yuan
Specifically, during neural network training, we naturally model the noise samples in each batch following a hypergeometric distribution parameterized by the noise-rate.
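For intuition, the number of noisy samples in a batch drawn without replacement from a finite training set follows a hypergeometric distribution. The sketch below only evaluates that distribution; how the paper uses it inside training is not described in this snippet. The population size, noise rate, and batch size are illustrative assumptions.

```python
from math import comb


def hypergeom_pmf(k: int, population: int, noisy: int, batch: int) -> float:
    """P(exactly k noisy samples in a batch drawn without replacement)."""
    lowest = max(0, batch - (population - noisy))
    highest = min(noisy, batch)
    if k < lowest or k > highest:
        return 0.0
    return comb(noisy, k) * comb(population - noisy, batch - k) / comb(population, batch)


# Illustrative values (assumptions, not taken from the paper).
population = 1000                        # training set size
noise_rate = 0.2                         # fraction of mislabeled samples
batch = 32                               # batch size
noisy = round(noise_rate * population)   # number of noisy samples in the training set

pmf = [hypergeom_pmf(k, population, noisy, batch) for k in range(batch + 1)]
expected_noisy = sum(k * p for k, p in enumerate(pmf))
```

The mean of this distribution is `batch * noisy / population`, so with these values a batch is expected to contain 6.4 noisy samples.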
no code implementations • ACL 2021 • Fangchao Liu, Lingyong Yan, Hongyu Lin, Xianpei Han, Le Sun
Open relation extraction aims to cluster relation instances referring to the same underlying relation, which is a critical step for general relation extraction.
1 code implementation • ACL 2021 • Boxi Cao, Hongyu Lin, Xianpei Han, Le Sun, Lingyong Yan, Meng Liao, Tong Xue, Jin Xu
Previous literature shows that pre-trained masked language models (MLMs) such as BERT can achieve competitive factual knowledge extraction performance on some datasets, indicating that MLMs can potentially be a reliable knowledge source.
1 code implementation • ACL 2021 • Jialong Tang, Hongyu Lin, Meng Liao, Yaojie Lu, Xianpei Han, Le Sun, Weijian Xie, Jin Xu
Current event-centric knowledge graphs highly rely on explicit connectives to mine relations between events.
no code implementations • ACL 2021 • Shan Wu, Bo Chen, Chunlei Xin, Xianpei Han, Le Sun, Weipeng Zhang, Jiansong Chen, Fan Yang, Xunliang Cai
During synchronous decoding, the utterance paraphrasing is constrained by the structure of the logical form, so the canonical utterance can be paraphrased in a controlled manner; the semantic decoding is guided by the semantics of the canonical utterance, so its logical form can be generated without supervision.
no code implementations • 17 Apr 2021 • Xiaoyang Chen, Kai Hui, Ben He, Xianpei Han, Le Sun, Zheng Ye
BERT-based text ranking models have dramatically advanced the state-of-the-art in ad-hoc retrieval, wherein most models tend to consider individual query-document pairs independently.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Jialong Tang, Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun, Xinyan Xiao, Hua Wu
One of the biggest bottlenecks in building accurate, high coverage neural open IE systems is the need for large labelled corpora.
1 code implementation • 5 Mar 2021 • Jinsong Su, Jialong Tang, Hui Jiang, Ziyao Lu, Yubin Ge, Linfeng Song, Deyi Xiong, Le Sun, Jiebo Luo
In aspect-based sentiment analysis (ABSA), many neural models are equipped with an attention mechanism to quantify the contribution of each context word to sentiment prediction.
no code implementations • 4 Jan 2021 • Ning Bian, Xianpei Han, Bo Chen, Le Sun
Experiments show that: (1) Our knowledge-to-text framework is effective and achieves state-of-the-art performance on the CommonsenseQA dataset, providing a simple and strong knowledge-enhanced baseline for CQA; (2) The potential of knowledge is still far from being fully exploited in CQA -- there is a significant performance gap from current models to our models with golden knowledge; and (3) Context-sensitive knowledge selection, heterogeneous knowledge exploitation, and commonsense-rich language models are promising CQA directions.
1 code implementation • 8 Dec 2020 • Lingyong Yan, Xianpei Han, Le Sun, Fangchao Liu, Ning Bian
By re-organizing all sentences about an entity as a document and extracting relations via querying the document with relation-specific questions, the document-based DS paradigm can simultaneously encode and exploit all sentence-level, inter-sentence-level, and entity-level evidence.
Ranked #1 on Relationship Extraction (Distant Supervised) on NYT
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Lingyong Yan, Xianpei Han, Ben He, Le Sun
Bootstrapping for entity set expansion (ESE), which expands new entities using only a few seed entities as supervision, has been studied for a long time.
1 code implementation • SEMEVAL 2020 • Yaojie Lu, Annan Li, Hongyu Lin, Xianpei Han, Le Sun
ISCAS participated in two subtasks of SemEval 2020 Task 5: detecting counterfactual statements and detecting antecedent and consequence.
1 code implementation • 17 Sep 2020 • Yaojie Lu, Hongyu Lin, Jialong Tang, Xianpei Han, Le Sun
Traditional event coreference systems usually rely on a pipeline framework and hand-crafted features, which often suffer from error propagation and generalize poorly.
2 code implementations • 16 Sep 2020 • Xuanang Chen, Ben He, Kai Hui, Le Sun, Yingfei Sun
Despite the effectiveness of utilizing the BERT model for document ranking, the high computational cost of such approaches limits their use.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Zhi Zheng, Kai Hui, Ben He, Xianpei Han, Le Sun, Andrew Yates
Query expansion aims to mitigate the mismatch between the language used in a query and in a document.
no code implementations • EMNLP 2020 • Hongyu Lin, Yaojie Lu, Jialong Tang, Xianpei Han, Le Sun, Zhicheng Wei, Nicholas Jing Yuan
Specifically, we erase name regularity, mention coverage and context diversity respectively from the benchmarks, in order to explore their impact on the generalization ability of models.
no code implementations • IJCNLP 2019 • Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun, Bin Dong, Shanshan Jiang
Current region-based NER models rely only on fully-annotated training data to learn an effective region encoder, and therefore often face a training data bottleneck.
no code implementations • IJCNLP 2019 • Lingyong Yan, Xianpei Han, Le Sun, Ben He
Bootstrapping for Entity Set Expansion (ESE) aims at iteratively acquiring new instances of a specific target category.
no code implementations • IJCNLP 2019 • Bo An, Chen Bo, Xianpei Han, Le Sun
Semantic parsing aims to map natural language utterances into structured meaning representations.
1 code implementation • ACL 2019 • Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun
Event detection systems rely on discrimination knowledge to distinguish ambiguous trigger words and generalization knowledge to detect unseen/sparse trigger words.
1 code implementation • ACL 2019 • Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun
In supervised event detection, most of the mislabeling occurs between a small number of confusing type pairs, including trigger-NIL pairs and sibling sub-types of the same coarse type.
1 code implementation • ACL 2019 • Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun
In this paper, we propose to resolve this problem by modeling and leveraging the head-driven phrase structures of entity mentions, i.e., although a mention can nest other mentions, they will not share the same head word.
Ranked #7 on Nested Mention Recognition on ACE 2005
1 code implementation • ACL 2019 • Jialong Tang, Ziyao Lu, Jinsong Su, Yubin Ge, Linfeng Song, Le Sun, Jiebo Luo
In aspect-level sentiment classification (ASC), it is prevalent to equip dominant neural models with attention mechanisms, for the sake of acquiring the importance of each context word on the given aspect.
Aspect-Based Sentiment Analysis (ABSA)
Sentiment Classification
no code implementations • ACL 2016 • Bo Chen, Le Sun, Xianpei Han, Bo An
A major challenge of semantic parsing is the vocabulary mismatch problem between natural language and target ontology.
1 code implementation • EMNLP 2018 • Canjia Li, Yingfei Sun, Ben He, Le Wang, Kai Hui, Andrew Yates, Le Sun, Jungang Xu
Pseudo-relevance feedback (PRF) is commonly used to boost the performance of traditional information retrieval (IR) models by using top-ranked documents to identify and weight new query terms, thereby reducing the effect of query-document vocabulary mismatches.
Ranked #9 on Ad-Hoc Information Retrieval on TREC Robust04
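The pseudo-relevance feedback idea described in this entry can be illustrated with a classic term-counting sketch. This is not the neural model the paper proposes: the function `expand_query`, the weighting scheme (relative term frequency in the feedback documents), and the toy data are assumptions chosen for brevity.

```python
from collections import Counter
from typing import Dict, List


def expand_query(
    query: List[str],
    ranked_docs: List[List[str]],
    top_docs: int = 2,
    new_terms: int = 2,
) -> Dict[str, float]:
    """Add the most frequent terms from the top-ranked documents to the query."""
    # Treat the top-ranked documents as if they were relevant (the "pseudo" part).
    feedback = Counter()
    for doc in ranked_docs[:top_docs]:
        feedback.update(doc)
    total = sum(feedback.values())

    # Original query terms keep full weight.
    weights = {term: 1.0 for term in query}
    # New terms are weighted by their relative frequency in the feedback documents.
    candidates = [(t, c) for t, c in feedback.most_common() if t not in weights]
    for term, count in candidates[:new_terms]:
        weights[term] = count / total
    return weights


# Toy initial ranking for the query; documents are already tokenized.
query = ["jaguar", "speed"]
ranked_docs = [
    ["jaguar", "cat", "speed", "cat", "prey"],
    ["jaguar", "cat", "habitat"],
    ["jaguar", "car", "engine"],
]
expanded = expand_query(query, ranked_docs)
```

Here the term "cat" enters the expanded query because it dominates the two feedback documents, while "car" from the lower-ranked document does not, which is how feedback terms can reduce vocabulary mismatch between the query and relevant documents.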
1 code implementation • ACL 2018 • Bo Chen, Le Sun, Xianpei Han
This paper proposes a neural semantic parsing approach -- Sequence-to-Action, which models semantic parsing as an end-to-end semantic graph generation process.
no code implementations • COLING 2018 • Bo Chen, Bo An, Le Sun, Xianpei Han
Semantic parsers critically rely on accurate and high-coverage lexicons.
no code implementations • COLING 2018 • Bo An, Xianpei Han, Le Sun
Word composition is a promising technique for representation learning of large linguistic units (e.g., phrases, sentences and documents).
no code implementations • ACL 2018 • Cancan Jin, Ben He, Kai Hui, Le Sun
Existing automated essay scoring (AES) models rely on rated essays for the target prompt as training data.
no code implementations • NAACL 2018 • Bo An, Bo Chen, Xianpei Han, Le Sun
Previous representation learning techniques for knowledge graph representation usually represent the same entity or relation in different triples with the same representation, without considering the ambiguity of relations and entities.
1 code implementation • ACL 2018 • Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun
This paper focuses on detection tasks in information extraction, where positive instances are sparsely distributed and models are usually evaluated using F-measure on positive classes.
1 code implementation • ACL 2018 • Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun
Neural network based models commonly regard event detection as a word-wise classification task and thus suffer from the mismatch problem between words and event triggers, especially in languages without natural word delimiters such as Chinese.
no code implementations • WS 2017 • Kees van Deemter, Le Sun, Rint Sybesma, Xiao Li, Bo Chen, Muyun Yang
East Asian languages are thought to handle reference differently from languages such as English, particularly in terms of the marking of definiteness and number.
no code implementations • EMNLP 2017 • Hongyu Lin, Le Sun, Xianpei Han
Then we propose a multi-knowledge reasoning model, which selects inference rules for a specific reasoning context using attention mechanism, and reasons by summarizing all valid inference rules.
no code implementations • COLING 2016 • Xianpei Han, Le Sun
Firstly, these methods are mostly context-insensitive and cannot accurately measure the similarity between two predicates in a specific context.
no code implementations • 20 Jul 2016 • Jiangang Ma, Le Sun, Hua Wang, Yanchun Zhang, Uwe Aickelin
Uncertain data streams have been widely generated in many Web applications.