no code implementations • EMNLP 2020 • Jizhi Tang, Yansong Feng, Dongyan Zhao
Recent efforts have made great progress to track multiple entities in a procedural text, but usually treat each entity separately and ignore the fact that there are often multiple entities interacting with each other during one process, some of which are even explicitly mentioned.
no code implementations • NAACL 2022 • Nan Hu, Zirui Wu, Yuxuan Lai, Xiao Liu, Yansong Feng
Different from previous fact extraction and verification tasks that only consider evidence of a single format, FEVEROUS brings further challenges by extending the evidence format to both plain text and tables.
1 code implementation • 31 Mar 2025 • Qin Wang, Youhuan Li, Yansong Feng, Si Chen, Ziming Li, Pan Zhang, Zhichao Shi, Yuequn Dou, chuchu Gao, Zebin Huang, Zihui Si, Yixuan Chen, Zhaohai Sun, Ke Tang, Wenqiang Jin
In this paper, we propose SchemaAgent, a unified LLM-based multi-agent framework for the automated generation of high-quality database schema.
1 code implementation • 4 Mar 2025 • Zirui Wu, Xiao Liu, Jiayi Li, Lingpeng Kong, Yansong Feng
While Large Language Model-based agents have demonstrated substantial progress in task completion, existing evaluation benchmarks tend to overemphasize single-task performance, with insufficient attention given to the crucial aspects of multitask planning and execution efficiency required in real-world scenarios.
3 code implementations • 3 Mar 2025 • Chen Zhang, Mingxu Tao, Zhiyuan Liao, Yansong Feng
Large language models (LLMs) excel in high-resource languages but struggle with low-resource languages (LRLs), particularly those spoken by minority communities in China, such as Tibetan, Uyghur, Kazakh, and Mongolian.
no code implementations • 3 Mar 2025 • Xiao Liu, Zirui Wu, Jiayi Li, Zhicheng Shao, Xun Pang, Yansong Feng
Longitudinal network data are essential for analyzing political, economic, and social systems and processes.
1 code implementation • 24 Feb 2025 • Huanghai Liu, Quzhe Huang, Qingjing Chen, Yiran Hu, Jiayu Ma, Yun Liu, Weixing Shen, Yansong Feng
The Four-Element Theory is a fundamental framework in criminal law, defining the constitution of crime through four dimensions: Subject, Object, Subjective aspect, and Objective aspect.
no code implementations • 3 Jan 2025 • Kangcheng Luo, Quzhe Huang, Cong Jiang, Yansong Feng
Multi-faceted evaluations by legal experts indicate that the quality of our concept interpretations is comparable to those written by human experts.
no code implementations • 14 Aug 2024 • Yutong Hu, Quzhe Huang, Yansong Feng
Event Temporal Relation Extraction (ETRE) aims to identify the temporal relationship between two events, which plays an important role in natural language understanding.
1 code implementation • 13 Aug 2024 • Yutong Hu, Kangcheng Luo, Yansong Feng
ELLA also retrieves relevant legal cases for user reference.
no code implementations • 10 Aug 2024 • Jiuheng Lin, Yuxuan Lai, Yansong Feng
Conditional question answering (CQA) is an important task that aims to find probable answers and identify missing conditions.
no code implementations • 4 Jul 2024 • Mingxu Tao, Chen Zhang, Quzhe Huang, Tianyao Ma, Songfang Huang, Dongyan Zhao, Yansong Feng
Adapting large language models (LLMs) to new languages typically involves continual pre-training (CT) followed by supervised fine-tuning (SFT).
no code implementations • 20 Jun 2024 • Jingcong Liang, Junlong Wang, Xinyu Zhai, Yungui Zhuang, Yiyang Zheng, Xin Xu, Xiandong Ran, Xiaozheng Dong, Honghui Rong, Yanlun Liu, Hao Chen, Yuhan Wei, Donghai Li, Jiajie Peng, Xuanjing Huang, Chongde Shi, Yansong Feng, Yun Song, Zhongyu Wei
We give a detailed overview of the CAIL 2023 Argument Mining Track, one of the Chinese AI and Law Challenge (CAIL) 2023 tracks.
no code implementations • 17 Jun 2024 • Yutong Hu, Quzhe Huang, Kangcheng Luo, Yansong Feng
In this paper, we aim to explore which kinds of words benefit more from long contexts in language models.
1 code implementation • 7 Jun 2024 • Chengang Hu, Xiao Liu, Yansong Feng
To better investigate compositional generalization with more linguistic phenomena and compositional diversity, we propose the DIsh NamE Recognition (DiNeR) task and create a large realistic Chinese dataset.
no code implementations • 9 May 2024 • Yutong Hu, Quzhe Huang, Mingxu Tao, Chen Zhang, Yansong Feng
Recent studies have shown that Large Language Models (LLMs) have the potential to process extremely long text.
1 code implementation • 20 Mar 2024 • Kunhang Li, Yansong Feng
The task of text2motion is to generate human motion sequences from given textual descriptions, where the model explores diverse mappings from natural language instructions to human body movements.
1 code implementation • 12 Mar 2024 • Quzhe Huang, Zhenwei An, Nan Zhuang, Mingxu Tao, Chen Zhang, Yang Jin, Kun Xu, Liwei Chen, Songfang Huang, Yansong Feng
In this paper, we introduce a novel dynamic expert selection framework for Mixture of Experts (MoE) models, aiming to enhance computational efficiency and model performance by adjusting the number of activated experts based on input difficulty.
1 code implementation • 4 Mar 2024 • Zirui Wu, Yansong Feng
Our work underscores the importance of the planning and reasoning abilities towards a model over tabular tasks with generalizability and interpretability.
1 code implementation • 29 Feb 2024 • Chen Zhang, Xiao Liu, Jiuheng Lin, Yansong Feng
We introduce DiPMT++, a framework for adapting LLMs to unseen languages by in-context learning.
1 code implementation • 27 Feb 2024 • Mingxu Tao, Quzhe Huang, Kun Xu, Liwei Chen, Yansong Feng, Dongyan Zhao
The advancement of Multimodal Large Language Models (MLLMs) has greatly accelerated the development of applications in understanding integrated texts and images.
1 code implementation • 27 Feb 2024 • Xiao Liu, Zirui Wu, Xueqing Wu, Pan Lu, Kai-Wei Chang, Yansong Feng
To address this gap, we introduce the Quantitative Reasoning with Data (QRData) benchmark, aiming to evaluate Large Language Models' capability in statistical and causal reasoning with real-world data.
2 code implementations • 26 Feb 2024 • Mingxu Tao, Dongyan Zhao, Yansong Feng
Open-ended question answering requires models to find appropriate evidence to form wellreasoned, comprehensive and helpful answers.
1 code implementation • 10 Jan 2024 • Xiao Liu, Yansong Feng, Kai-Wei Chang
Motivated by the definition of probability of sufficiency (PS) in the causal literature, we proposeCASA, a zero-shot causality-driven argument sufficiency assessment framework.
2 code implementations • 14 Nov 2023 • Chen Zhang, Mingxu Tao, Quzhe Huang, Jiuheng Lin, Zhibin Chen, Yansong Feng
To address this accessibility challenge, we present MC$^2$, a Multilingual Corpus of Minority Languages in China, which is the largest open-source corpus of its kind so far.
1 code implementation • 7 Jun 2023 • Zhibin Chen, Yansong Feng, Dongyan Zhao
Entailment Graphs (EGs) have been constructed based on extracted corpora as a strong and explainable form to indicate context-independent entailment relations in natural languages.
1 code implementation • 1 Jun 2023 • Chen Zhang, Jiuheng Lin, Xiao Liu, Yuxuan Lai, Yansong Feng, Dongyan Zhao
We further analyze how well different paradigms of current multi-answer MRC models deal with different types of multi-answer instances.
1 code implementation • 30 May 2023 • Xiao Liu, Da Yin, Chen Zhang, Yansong Feng, Dongyan Zhao
Causal reasoning, the ability to identify cause-and-effect relationship, is crucial in human thinking.
no code implementations • 28 May 2023 • Quzhe Huang, Yutong Hu, Shengqi Zhu, Yansong Feng, Chang Liu, Dongyan Zhao
After examining the relation definitions in various ETRE tasks, we observe that all relations can be interpreted using the start and end time points of events.
1 code implementation • 24 May 2023 • Quzhe Huang, Mingxu Tao, Chen Zhang, Zhenwei An, Cong Jiang, Zhibin Chen, Zirui Wu, Yansong Feng
Specifically, we inject domain knowledge during the continual training stage and teach the model to learn professional skills using properly designed supervised fine-tuning tasks.
no code implementations • 8 May 2023 • Mingxu Tao, Yansong Feng, Dongyan Zhao
Since the embeddings of rear positions are updated fewer times than the front position embeddings, the rear ones may not be properly trained.
1 code implementation • 2 Mar 2023 • Mingxu Tao, Yansong Feng, Dongyan Zhao
Large pre-trained language models help to achieve state of the art on a variety of natural language processing (NLP) tasks, nevertheless, they still suffer from forgetting when incrementally learning a sequence of tasks.
1 code implementation • 26 Feb 2023 • Chen Zhang, Yuxuan Lai, Yansong Feng, Xingyu Shen, Haowei Du, Dongyan Zhao
We convert KB subgraphs into passages to narrow the gap between KB schemas and questions, which enables our model to benefit from recent advances in multilingual pre-trained language models (MPLMs) and cross-lingual machine reading comprehension (xMRC).
Cross-Lingual Question Answering
Machine Reading Comprehension
1 code implementation • 31 Oct 2022 • Zhenwei An, Quzhe Huang, Cong Jiang, Yansong Feng, Dongyan Zhao
The charge prediction task aims to predict the charge for a case given its fact description.
1 code implementation • 20 Oct 2022 • Xiao Liu, Yansong Feng, Jizhi Tang, Chengang Hu, Dongyan Zhao
Although pretrained language models can generate fluent recipe texts, they fail to truly learn and use the culinary knowledge in a compositional way.
no code implementations • Findings (NAACL) 2022 • Yifei Zhou, Yansong Feng
Recent works show that discourse analysis benefits from modeling intra- and inter-sentential levels separately, where proper representations for text units of different granularities are desired to capture both the meaning of text units and their relations to the context.
1 code implementation • ACL 2022 • Quzhe Huang, Shibo Hao, Yuan Ye, Shengqi Zhu, Yansong Feng, Dongyan Zhao
DocRED is a widely used dataset for document-level relation extraction.
1 code implementation • ACL 2022 • Zhibin Chen, Yansong Feng, Dongyan Zhao
Typed entailment graphs try to learn the entailment relations between predicates from text and model them as edges between predicate nodes.
1 code implementation • ACL 2022 • Xiao Liu, Da Yin, Yansong Feng, Dongyan Zhao
We probe PLMs and models with visual signals, including vision-language pretrained models and image synthesis models, on this benchmark, and find that image synthesis models are more capable of learning accurate and consistent spatial knowledge than other models.
1 code implementation • Findings (EMNLP) 2021 • Chen Zhang, Yuxuan Lai, Yansong Feng, Dongyan Zhao
In this paper, we present a new verification style reading comprehension dataset named VGaokao from Chinese Language tests of Gaokao.
1 code implementation • ACL 2021 • Quzhe Huang, Shengqi Zhu, Yansong Feng, Yuan Ye, Yuxuan Lai, Dongyan Zhao
Document-level Relation Extraction (RE) is a more challenging task than sentence RE as it often requires reasoning over multiple sentences.
Ranked #48 on
Relation Extraction
on DocRED
no code implementations • ACL 2021 • Quzhe Huang, Shengqi Zhu, Yansong Feng, Dongyan Zhao
Recent studies strive to incorporate various human rationales into neural networks to improve model performance, but few pay attention to the quality of the rationales.
1 code implementation • Findings (ACL) 2021 • Yuxuan Lai, Chen Zhang, Yansong Feng, Quzhe Huang, Dongyan Zhao
A thorough empirical analysis shows that MRC models tend to learn shortcut questions earlier than challenging questions, and the high proportions of shortcut questions in training sets hinder models from exploring the sophisticated reasoning skills in the later stage of training.
no code implementations • NAACL 2021 • Chongyang Tao, Shen Gao, Juntao Li, Yansong Feng, Dongyan Zhao, Rui Yan
Sequential information, a. k. a., orders, is assumed to be essential for processing a sequence with recurrent neural network or convolutional neural network based encoders.
1 code implementation • NAACL 2021 • Xiao Liu, Da Yin, Yansong Feng, Yuting Wu, Dongyan Zhao
Causal inference is the process of capturing cause-effect relationship among variables.
2 code implementations • NAACL 2021 • Yuxuan Lai, Yijia Liu, Yansong Feng, Songfang Huang, Dongyan Zhao
Further analysis shows that Lattice-BERT can harness the lattice structures, and the improvement comes from the exploration of redundant information and multi-granularity representations.
1 code implementation • NeurIPS 2020 • Yao Fu, Chuanqi Tan, Bin Bi, Mosha Chen, Yansong Feng, Alexander M. Rush
Learning to control the structure of sentences is a challenging problem in text generation.
1 code implementation • COLING 2020 • Yuxi Xie, Liangming Pan, Dongzhe Wang, Min-Yen Kan, Yansong Feng
Recent question generation (QG) approaches often utilize the sequence-to-sequence framework (Seq2Seq) to optimize the log-likelihood of ground-truth questions using teacher forcing.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Xiaohan Yu, Quzhe Huang, Zheng Wang, Yansong Feng, Dongyan Zhao
Code comments are vital for software maintenance and comprehension, but many software projects suffer from the lack of meaningful and up-to-date comments in practice.
no code implementations • 23 Jun 2020 • Zechang Li, Yuxuan Lai, Yansong Feng, Dongyan Zhao
In this paper, we propose a novel semantic parser for domain adaptation, where we have much fewer annotated data in the target domain compared to the source domain.
1 code implementation • ACL 2020 • Yuting Wu, Xiao Liu, Yansong Feng, Zheng Wang, Dongyan Zhao
This paper presents Neighborhood Matching Network (NMN), a novel entity alignment framework for tackling the structural heterogeneity challenge.
1 code implementation • ACL 2020 • Liangming Pan, Yuxi Xie, Yansong Feng, Tat-Seng Chua, Min-Yen Kan
This paper proposes the problem of Deep Question Generation (DQG), which aims to generate complex questions that require reasoning over multiple pieces of information of the input passage.
no code implementations • 23 Jan 2020 • Kun Xu, Linfeng Song, Yansong Feng, Yan Song, Dong Yu
Existing entity alignment methods mainly vary on the choices of encoding the knowledge graph, but they typically use the same decoding method, which independently chooses the local optimal match for each source entity.
2 code implementations • NeurIPS 2019 • Yao Fu, Yansong Feng, John P. Cunningham
Inspired by variational autoencoders with discrete latent structures, in this work, we propose a latent bag of words (BOW) model for paraphrase generation.
1 code implementation • 26 Nov 2019 • Yuan Ye, Yansong Feng, Bingfeng Luo, Yuxuan Lai, Dongyan Zhao
However, such models often make predictions for each entity pair individually, thus often fail to solve the inconsistency among different predictions, which can be characterized by discrete relation constraints.
no code implementations • IJCNLP 2019 • Jizhi Tang, Yansong Feng, Dongyan Zhao
News streams contain rich up-to-date information which can be used to update knowledge graphs (KGs).
no code implementations • IJCNLP 2019 • Shuai Ma, Gang Wang, Yansong Feng, Jinpeng Huai
Many existing relation extraction (RE) models make decisions globally using integer linear programming (ILP).
no code implementations • IJCNLP 2019 • Jia Li, Chongyang Tao, Wei Wu, Yansong Feng, Dongyan Zhao, Rui Yan
We study how to sample negative examples to automatically construct a training set for effective model learning in retrieval-based dialogue systems.
1 code implementation • IJCNLP 2019 • Yuting Wu, Xiao Liu, Yansong Feng, Zheng Wang, Dongyan Zhao
Entity alignment is a viable means for integrating heterogeneous knowledge among different knowledge graphs (KGs).
Ranked #18 on
Entity Alignment
on DBP15k zh-en
(using extra training data)
1 code implementation • 2 Sep 2019 • Zechang Li, Yuxuan Lai, Yuxi Xie, Yansong Feng, Dongyan Zhao
The sketch is a high-level structure of the logical form exclusive of low-level details such as entities and predicates.
1 code implementation • IJCNLP 2019 • Zhichao Yang, Pengshan Cai, Yansong Feng, Fei Li, Weijiang Feng, Elena Suet-Ying Chiu, Hong Yu
According to experiments, our approach significantly improve the perplexity and BLEU compared with typical UMT models.
Cultural Vocal Bursts Intensity Prediction
Reinforcement Learning
+3
1 code implementation • 22 Aug 2019 • Yuting Wu, Xiao Liu, Yansong Feng, Zheng Wang, Rui Yan, Dongyan Zhao
Entity alignment is the task of linking entities with the same real-world identity from different knowledge graphs (KGs), which has been recently dominated by embedding-based methods.
Ranked #20 on
Entity Alignment
on DBP15k zh-en
(using extra training data)
no code implementations • ACL 2019 • Jiazhan Feng, Chongyang Tao, Wei Wu, Yansong Feng, Dongyan Zhao, Rui Yan
Under the framework, we simultaneously learn two matching models with independent training sets.
no code implementations • NAACL 2019 • Kun Xu, Yuxuan Lai, Yansong Feng, Zhiguo Wang
However, extending KV-MemNNs to Knowledge Based Question Answering (KB-QA) is not trivia, which should properly decompose a complex question into a sequence of queries against the memory, and update the query representations to support multi-hop reasoning over the memory.
1 code implementation • ACL 2019 • Kun Xu, Li-Wei Wang, Mo Yu, Yansong Feng, Yan Song, Zhiguo Wang, Dong Yu
Previous cross-lingual knowledge graph (KG) alignment studies rely on entity embeddings derived only from monolingual KG structural information, which may fail at matching entities that have different facts in two KGs.
1 code implementation • 25 Feb 2019 • Yuxuan Lai, Yansong Feng, Xiaohan Yu, Zheng Wang, Kun Xu, Dongyan Zhao
Short text matching often faces the challenges that there are great word mismatch and expression diversity between the two texts, which would be further aggravated in languages like Chinese where there is no natural space to segment words explicitly.
no code implementations • 9 Nov 2018 • Li-Wei Chen, Yansong Feng, Songfang Huang, Bingfeng Luo, Dongyan Zhao
Relation extraction is the task of identifying predefined relationship between entities, and plays an essential role in information extraction, knowledge base construction, question answering and so on.
no code implementations • 21 Oct 2018 • Qing Qin, Jie Ren, Jialong Yu, Ling Gao, Hai Wang, Jie Zheng, Yansong Feng, Jianbin Fang, Zheng Wang
We experimentally show that how two mainstream compression techniques, data quantization and pruning, perform on these network architectures and the implications of compression techniques to the model storage size, inference time, energy consumption and performance metrics.
2 code implementations • 13 Oct 2018 • Haoxi Zhong, Chaojun Xiao, Zhipeng Guo, Cunchao Tu, Zhiyuan Liu, Maosong Sun, Yansong Feng, Xianpei Han, Zhen Hu, Heng Wang, Jianfeng Xu
In this paper, we give an overview of the Legal Judgment Prediction (LJP) competition at Chinese AI and Law challenge (CAIL2018).
1 code implementation • EMNLP 2018 • Feifan Fan, Yansong Feng, Dongyan Zhao
We propose a fine-grained attention mechanism, which can capture the word-level interaction between aspect and context.
1 code implementation • EMNLP 2018 • Kun Xu, Lingfei Wu, Zhiguo Wang, Yansong Feng, Vadim Sheinin
Previous work approaches the SQL-to-text generation task using vanilla Seq2Seq models, which may not fully capture the inherent graph-structured information in SQL query.
no code implementations • 22 Aug 2018 • Chongyang Tao, Wei Wu, Can Xu, Yansong Feng, Dongyan Zhao, Rui Yan
In this paper, we study context-response matching with pre-trained contextualized representations for multi-turn response selection in retrieval-based chatbots.
3 code implementations • 4 Jul 2018 • Chaojun Xiao, Haoxi Zhong, Zhipeng Guo, Cunchao Tu, Zhiyuan Liu, Maosong Sun, Yansong Feng, Xianpei Han, Zhen Hu, Heng Wang, Jianfeng Xu
In this paper, we introduce the \textbf{C}hinese \textbf{AI} and \textbf{L}aw challenge dataset (CAIL2018), the first large-scale Chinese legal dataset for judgment prediction.
no code implementations • ACL 2018 • Yanyan Jia, Yuan Ye, Yansong Feng, Yuxuan Lai, Rui Yan, Dongyan Zhao
Identifying long-span dependencies between discourse units is crucial to improve discourse parsing performance.
no code implementations • NAACL 2018 • Yao Fu, Yansong Feng
Memory augmented encoder-decoder framework has achieved promising progress for natural language generation tasks.
no code implementations • ACL 2018 • Bingfeng Luo, Yansong Feng, Zheng Wang, Songfang Huang, Rui Yan, Dongyan Zhao
The success of many natural language processing (NLP) tasks is bound by the number and quality of annotated data, but there is often a shortage of such training data.
4 code implementations • ICLR 2019 • Kun Xu, Lingfei Wu, Zhiguo Wang, Yansong Feng, Michael Witbrock, Vadim Sheinin
Our method first generates the node and graph embeddings using an improved graph-based neural network with a novel aggregation strategy to incorporate edge direction information in the node embeddings.
Ranked #1 on
SQL-to-Text
on WikiSQL
no code implementations • 11 Dec 2017 • Ying Zeng, Yansong Feng, Rong Ma, Zheng Wang, Rui Yan, Chongde Shi, Dongyan Zhao
We show that this large volume of training data not only leads to a better event extractor, but also allows us to detect multiple typed events.
no code implementations • EMNLP 2017 • Lili Yao, Yaoyuan Zhang, Yansong Feng, Dongyan Zhao, Rui Yan
The study on human-computer conversation systems is a hot research topic nowadays.
no code implementations • EMNLP 2017 • Bingfeng Luo, Yansong Feng, Jianbo Xu, Xiang Zhang, Dongyan Zhao
The charge prediction task is to determine appropriate charges for a given case, which is helpful for legal assistant systems where the user input is fact description.
no code implementations • ACL 2017 • Zhiliang Tian, Rui Yan, Lili Mou, Yiping Song, Yansong Feng, Dongyan Zhao
Generative conversational systems are attracting increasing attention in natural language processing (NLP).
no code implementations • ACL 2017 • Bingfeng Luo, Yansong Feng, Zheng Wang, Zhanxing Zhu, Songfang Huang, Rui Yan, Dongyan Zhao
We show that the dynamic transition matrix can effectively characterize the noise in the training data built by distant supervision.
no code implementations • 7 Apr 2017 • Yaoyuan Zhang, Zhenxu Ye, Yansong Feng, Dongyan Zhao, Rui Yan
For word-level studies, words are simplified but also have potential grammar errors due to different usages of words before and after simplification.
no code implementations • COLING 2016 • Kun Xu, Yansong Feng, Songfang Huang, Dongyan Zhao
While these systems are able to provide more precise answers than information retrieval (IR) based QA systems, the natural incompleteness of KB inevitably limits the question scope that the system can answer.
1 code implementation • ACL 2016 • Kun Xu, Siva Reddy, Yansong Feng, Songfang Huang, Dongyan Zhao
Existing knowledge-based question answering systems often rely on small annotated training data.
no code implementations • EMNLP 2015 • Kun Xu, Yansong Feng, Songfang Huang, Dongyan Zhao
Syntactic features play an essential role in identifying relationship in a sentence.
Ranked #3 on
Relation Classification
on SemEval 2010 Task 8