However, popular OpenIE systems usually output facts sequentially, predicting the next fact conditioned on the previously decoded ones, which enforces an unnecessary order on the facts and introduces error accumulation across autoregressive steps.
Chinese pre-trained language models usually exploit contextual character information to learn representations, while ignoring linguistic knowledge, e.g., word and sentence information.
Based on this observation, we further sparsify delta parameters of multiple SFT homologous models with DARE and subsequently merge them into a single model by parameter averaging.
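The drop-and-rescale idea can be illustrated on toy parameter lists. This is a minimal sketch, not the paper's code: models are flat lists of floats, and `dare_sparsify` and `merge_by_averaging` are hypothetical names.

```python
import random

def dare_sparsify(base, finetuned, drop_rate=0.9, seed=0):
    """DARE-style sparsification (sketch): randomly Drop delta parameters
    And REscale the survivors by 1/(1 - drop_rate) to preserve expectation."""
    rng = random.Random(seed)
    deltas = [f - b for f, b in zip(finetuned, base)]
    kept = [0.0 if rng.random() < drop_rate else d / (1 - drop_rate)
            for d in deltas]
    return [b + d for b, d in zip(base, kept)]

def merge_by_averaging(base, models, drop_rate=0.9):
    """Sparsify each homologous SFT model's deltas, then average parameters."""
    sparsified = [dare_sparsify(base, m, drop_rate, seed=i)
                  for i, m in enumerate(models)]
    n = len(sparsified)
    return [sum(ps) / n for ps in zip(*sparsified)]
```

With `drop_rate=0.0` no deltas are dropped and the merge reduces to plain parameter averaging of the fine-tuned models.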
Given a textual passage and an answer, humans are able to ask questions with various expressions, but this ability is still challenging for most question generation (QG) systems.
Previous studies have suggested that key phrase selection is essential for question generation (QG), yet it is still challenging to connect such disjointed phrases into meaningful questions, particularly for long contexts.
We empirically assess the proposed approach on a variety of datasets and find significant improvement, compared to alternative approaches, in identifying training inputs that improve a model's disparity metric.
2 code implementations • 28 Sep 2023 • Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng Xu, Jin Xu, An Yang, Hao Yang, Jian Yang, Shusheng Yang, Yang Yao, Bowen Yu, Hongyi Yuan, Zheng Yuan, Jianwei Zhang, Xingxuan Zhang, Yichang Zhang, Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou, Tianhang Zhu
Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans.
By adjusting the number of added nodes, we can control the difficulty level in the modified instruction data.
1 code implementation • 12 Jul 2023 • Xiangpeng Wei, Haoran Wei, Huan Lin, TianHao Li, Pei Zhang, Xingzhang Ren, Mei Li, Yu Wan, Zhiwei Cao, Binbin Xie, Tianxiang Hu, Shangjie Li, Binyuan Hui, Bowen Yu, Dayiheng Liu, Baosong Yang, Fei Huang, Jun Xie
Large language models (LLMs) demonstrate a remarkable ability to comprehend, reason, and generate text following natural language instructions.
In this paper, we propose Preference Ranking Optimization (PRO) as an alternative to PPO for directly aligning LLMs with the Bradley-Terry comparison.
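The core idea of a list-wise ranking objective built on Bradley-Terry comparisons can be sketched as follows. This is a simplified illustration, not the authors' implementation: `scores` stand in for the policy's scores of candidate responses, ordered best-first, and at each rank position the current response must beat all worse-ranked ones.

```python
import math

def pro_loss(scores):
    """Preference-ranking loss (sketch): extend the pairwise Bradley-Terry
    comparison to a full ranking. For each position k, maximize the
    probability that response k beats every response ranked below it."""
    loss = 0.0
    for k in range(len(scores) - 1):
        denom = sum(math.exp(s) for s in scores[k:])
        loss -= math.log(math.exp(scores[k]) / denom)
    return loss
```

With two tied candidates the loss is log 2 (a coin flip), and it decreases as the model scores the preferred response higher.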
When trying to answer complex questions, people often rely on multiple sources of information, such as visual, textual, and tabular data.
To tackle this issue, we are the first to present a causally-complete dataset construction strategy for building million-level DocGD pre-training corpora.
Specifically, we dedicate task-level prompts to capture task-specific knowledge, retaining high LL performance, and maintain instance-level prompts to learn knowledge shared across input samples, improving the model's generalization performance.
Recently developed graph contrastive learning (GCL) approaches compare two different "views" of the same graph in order to learn node/graph representations.
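The two-view comparison is typically trained with an InfoNCE-style objective: node i's embedding in one view should match node i in the other view against all other nodes. A minimal sketch with toy embedding lists (`info_nce` is a hypothetical name, not any specific GCL method's code):

```python
import math

def info_nce(view1, view2, tau=0.5):
    """Contrastive loss over two views (sketch): for each node i, the pair
    (view1[i], view2[i]) is positive; all other nodes in view2 are negatives."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)

    n = len(view1)
    loss = 0.0
    for i in range(n):
        sims = [math.exp(cos(view1[i], view2[j]) / tau) for j in range(n)]
        loss -= math.log(sims[i] / sum(sims))
    return loss / n
```

Aligned views (each node matched with itself) yield a lower loss than views whose node correspondence has been shuffled.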
Open Information Extraction (OpenIE) facilitates the open-domain discovery of textual facts.
In this paper, we explore a novel setting, semi-supervised lifelong language learning (SSLL), where a model learns sequentially arriving language tasks with both labeled and unlabeled data.
Lifelong learning (LL) is vital for advanced task-oriented dialogue (ToD) systems.
In this paper, we propose a Layout-aware document-level Information Extraction dataset, LIE, to facilitate the study of extracting both structural and semantic knowledge from visually rich documents (VRDs), so as to generate accurate responses in dialogue systems.
To instantiate this strategy, we further propose a model, RelATE, which builds dual-level attention to aggregate relation-relevant information to detect the relation occurrence and utilizes the annotated samples of the detected relations to extract the corresponding head/tail entities.
Open Information Extraction (OpenIE) facilitates domain-independent discovery of relational facts from large corpora.
Ranked #1 on Open Information Extraction on CaRB
With the rapid growth in the number and variety of web resources, a single strategy is no longer sufficient to extract the text information of different page types.
Meanwhile, rough reading is performed in a multi-round manner to discover undetected events, thereby handling the multi-event problem.
Distantly supervised named entity recognition (DS-NER) efficiently reduces labor costs but intrinsically suffers from label noise due to the strong assumption of distant supervision.
Event extraction (EE) is a crucial information extraction task that aims to extract event information in texts.
Named entity recognition (NER) remains challenging when entity mentions can be discontinuous.
A heterogeneous graph attention network is then introduced to propagate relational messages and enrich information interaction.
Dependency trees have been shown to be effective in capturing long-range relations between target entities.
Document-level relation extraction (RE) poses new challenges over its sentence-level counterpart since it requires an adequate comprehension of the whole document and the multi-hop reasoning ability across multiple sentences to reach the final result.
To mitigate this issue, we propose a one-stage joint extraction model, TPLinker, which is capable of discovering overlapping relations sharing one or both entities while remaining immune to exposure bias.
Ranked #2 on Relation Extraction on NYT11-HRL
More recently, Named Entity Recognition has achieved great advances aided by pre-training approaches such as BERT.
We first build a representation extractor to derive features for unlabeled data from the target domain (no test data is necessary) and then group them with a cluster miner.
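The cluster-mining step can be sketched with a plain k-means pass over the extracted features. This is only an illustration of the grouping stage, assuming features are already extracted as lists of floats; `kmeans` is a hypothetical helper, not the paper's cluster miner.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal cluster miner (sketch): group feature vectors with k-means.
    Returns final centers and the points assigned to each center."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            # assign each point to its nearest center (squared distance)
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            groups[i].append(p)
        # recompute each center as the mean of its group
        centers = [[sum(col) / len(g) for col in zip(*g)] if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers, groups
```

On well-separated features the miner recovers the underlying groups regardless of which points seed the centers.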
Previous studies on the task have verified the effectiveness of integrating syntactic dependency into graph convolutional networks.
On Wikipedia, sophisticated algorithmic tools are used to assess the quality of edits and take corrective actions.
Most Chinese pre-trained models take the character as the basic unit and learn representations according to a character's external contexts, ignoring the semantics expressed in the word, which is the smallest meaningful unit in Chinese.
Joint extraction of entities and relations aims to detect entity pairs along with their relations using a single model.
Ranked #1 on Relation Extraction on NYT-single
Relation extraction studies the issue of predicting semantic relations between pairs of entities in sentences.
Ranked #29 on Relation Extraction on TACRED
Dataflow visualization systems enable flexible visual data exploration by allowing the user to construct a dataflow diagram that composes query and visualization modules to specify system functionality.
In this paper, we present Visus, a system designed to support the model building process and curation of ML data processing pipelines generated by AutoML systems.
Although information workers may complain about meetings, they are an essential part of their work life.