no code implementations • EMNLP 2021 • Jie Hao, Linfeng Song, LiWei Wang, Kun Xu, Zhaopeng Tu, Dong Yu
The task of dialogue rewriting aims to reconstruct the latest dialogue utterance by copying the missing content from the dialogue context.
no code implementations • INLG (ACL) 2020 • Hongyu Gong, Linfeng Song, Suma Bhat
Text style transfer aims to change an input sentence to an output sentence by changing its text style while preserving the content.
no code implementations • EMNLP 2021 • Lifeng Jin, Linfeng Song, Kun Xu, Dong Yu
In order to alleviate the huge demand for annotated datasets for different tasks, many recent natural language processing datasets have adopted automated pipelines for fast-tracking usable data.
no code implementations • ACL 2022 • Irene Li, Linfeng Song, Kun Xu, Dong Yu
Coreference resolution over semantic graphs like AMRs aims to group the graph nodes that represent the same entity.
no code implementations • 30 Dec 2024 • Xingyu Chen, Jiahao Xu, Tian Liang, Zhiwei He, Jianhui Pang, Dian Yu, Linfeng Song, Qiuzhi Liu, Mengfei Zhou, Zhuosheng Zhang, Rui Wang, Zhaopeng Tu, Haitao Mi, Dong Yu
The remarkable performance of models like the OpenAI o1 can be attributed to their ability to emulate human-like long-time thinking during inference.
no code implementations • 30 Dec 2024 • Yang Li, Dong Du, Linfeng Song, Chen Li, Weikang Wang, Tao Yang, Haitao Mi
We introduce HunyuanProver, a language model finetuned from the Hunyuan 7B for interactive automatic theorem proving with LEAN4.
no code implementations • 22 Dec 2024 • Dian Yu, Yuheng Zhang, Jiahao Xu, Tian Liang, Linfeng Song, Zhaopeng Tu, Haitao Mi, Dong Yu
We propose CaP, a novel approach that uses external tools to refine chain-of-thought (CoT) responses generated by the same or other LLMs.
no code implementations • 9 Oct 2024 • Xiyao Wang, Linfeng Song, Ye Tian, Dian Yu, Baolin Peng, Haitao Mi, Furong Huang, Dong Yu
Monte Carlo Tree Search (MCTS) has recently emerged as a powerful technique for enhancing the reasoning capabilities of LLMs.
1 code implementation • 29 Sep 2024 • Ante Wang, Linfeng Song, Zijun Min, Ge Xu, Xiaoli Wang, Junfeng Yao, Jinsong Su
We carefully analyze the negative effects of this phenomenon on pretrained Seq2seq query producers and then propose effective instance-level weighting strategies for training to mitigate these issues from multiple perspectives.
no code implementations • 28 Aug 2024 • Dian Yu, Baolin Peng, Ye Tian, Linfeng Song, Haitao Mi, Dong Yu
There is a growing trend of teaching large language models (LLMs) to solve mathematical problems through coding.
no code implementations • 30 Jun 2024 • Yuheng Zhang, Dian Yu, Baolin Peng, Linfeng Song, Ye Tian, Mingyue Huo, Nan Jiang, Haitao Mi, Dong Yu
Specifically, we formulate the problem as a two-player game and propose a novel online algorithm, iterative Nash policy optimization (INPO).
no code implementations • 29 Jun 2024 • Ante Wang, Linfeng Song, Ye Tian, Baolin Peng, Dian Yu, Haitao Mi, Jinsong Su, Dong Yu
Recent research suggests that tree search algorithms (e.g., Monte Carlo Tree Search) can dramatically boost LLM performance on complex mathematical reasoning tasks.
1 code implementation • 18 Apr 2024 • Ye Tian, Baolin Peng, Linfeng Song, Lifeng Jin, Dian Yu, Haitao Mi, Dong Yu
Despite the impressive capabilities of Large Language Models (LLMs) on various tasks, they still struggle with scenarios that involve complex reasoning and planning.
Ranked #1 on GSM8K on GSM8K
no code implementations • 14 Apr 2024 • Souvik Das, Lifeng Jin, Linfeng Song, Haitao Mi, Baolin Peng, Dong Yu
Current state-of-the-art approaches refine decoding by contrasting early-exit distributions from a lower layer with the final layer to exploit information related to factuality within the model forward procedure.
no code implementations • 14 Mar 2024 • Ante Wang, Linfeng Song, Ye Tian, Baolin Peng, Lifeng Jin, Haitao Mi, Jinsong Su, Dong Yu
Calibration, which establishes the correlation between accuracy and model confidence, is important for LLM development.
1 code implementation • 6 Mar 2024 • Xiangci Li, Linfeng Song, Lifeng Jin, Haitao Mi, Jessica Ouyang, Dong Yu
In this paper, we present a high-quality benchmark named multi-source Wizard of Wikipedia (Ms.WoW) for evaluating multi-source dialogue knowledge selection and response generation.
1 code implementation • 2 Mar 2024 • Jianheng Huang, Leyang Cui, Ante Wang, Chengyi Yang, Xinting Liao, Linfeng Song, Junfeng Yao, Jinsong Su
When conducting continual learning based on a publicly-released LLM checkpoint, the original training data may not be available.
no code implementations • 28 Feb 2024 • Lifeng Jin, Baolin Peng, Linfeng Song, Haitao Mi, Ye Tian, Dong Yu
The most common training pipeline for large language models includes pretraining, finetuning and aligning phases, with their respective resulting models, such as the pretrained model and the finetuned model.
no code implementations • 23 Feb 2024 • Ante Wang, Linfeng Song, Baolin Peng, Ye Tian, Lifeng Jin, Haitao Mi, Jinsong Su, Dong Yu
Experiments on Biographies show that our method can effectively improve the factuality of generations with simple and intuitive prompts across different scales of LLMs.
no code implementations • 14 Feb 2024 • Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Lifeng Jin, Linfeng Song, Haitao Mi, Helen Meng
Despite showing increasingly human-like abilities, large language models (LLMs) often struggle with factual inaccuracies, i.e., "hallucinations", even when they hold relevant knowledge.
1 code implementation • 18 Jan 2024 • Mian Zhang, Lifeng Jin, Linfeng Song, Haitao Mi, Dong Yu
One critical issue for chat systems is staying consistent about their own preferences, opinions, beliefs and facts, which has been shown to be a difficult problem.
1 code implementation • 20 Dec 2023 • Jianheng Huang, Ante Wang, Linfeng Gao, Linfeng Song, Jinsong Su
Based on the observation that the search query is typically related to the topic of dialogue response, we train a response-augmented query producer (RA) to provide rich and effective training signals for QP.
1 code implementation • 28 Sep 2023 • Lingfeng Shen, Sihao Chen, Linfeng Song, Lifeng Jin, Baolin Peng, Haitao Mi, Daniel Khashabi, Dong Yu
We propose Contrast Instructions -- a benchmarking strategy for the consistency of RM.
no code implementations • 18 Sep 2023 • Baolin Peng, Linfeng Song, Ye Tian, Lifeng Jin, Haitao Mi, Dong Yu
Large Language Models (LLMs) have revolutionized natural language processing, yet aligning these models with human values and preferences using RLHF remains a significant challenge.
no code implementations • 14 Aug 2023 • Xiao Lin, Xiaokai Chen, Chenyang Wang, Hantao Shu, Linfeng Song, Biao Li, Peng Jiang
To overcome these challenges, we propose a novel Discrete Conditional Diffusion Reranking (DCDR) framework for recommendation.
no code implementations • 6 Jun 2023 • Xiao Lin, Xiaokai Chen, Linfeng Song, Jingwei Liu, Biao Li, Peng Jiang
An accurate prediction of watch time has been of vital importance to enhance user engagement in video recommender systems.
no code implementations • 17 May 2023 • Longyue Wang, Siyou Liu, Mingzhou Xu, Linfeng Song, Shuming Shi, Zhaopeng Tu
Zero pronouns (ZPs) are frequently omitted in pro-drop languages (e.g., Chinese, Hungarian, and Hindi), but should be recalled in non-pro-drop languages (e.g., English).
1 code implementation • 16 Feb 2023 • Ante Wang, Linfeng Song, Qi Liu, Haitao Mi, Longyue Wang, Zhaopeng Tu, Jinsong Su, Dong Yu
We propose a dialogue model that can access the vast and dynamic information from any search engine for response generation.
no code implementations • 31 Jan 2023 • Mian Zhang, Lifeng Jin, Linfeng Song, Haitao Mi, Xiabing Zhou, Dong Yu
Current self-training methods such as standard self-training, co-training, tri-training, and others often focus on improving model performance on a single task, utilizing differences in input features, model architectures, and training processes.
no code implementations • 11 Nov 2022 • Xiaoyue Wang, Linfeng Song, Xin Liu, Chulun Zhou, Jinsong Su
Simile recognition involves two subtasks: simile sentence classification, which discriminates whether a sentence contains a simile, and simile component extraction, which locates the corresponding objects (i.e., tenors and vehicles).
no code implementations • 8 Nov 2022 • Wenyue Hua, Lifeng Jin, Linfeng Song, Haitao Mi, Yongfeng Zhang, Dong Yu
Pretrained natural language processing (NLP) models have achieved high overall performance, but they still make systematic errors.
1 code implementation • 22 Oct 2022 • Xuefeng Bai, Seng Yang, Leyang Cui, Linfeng Song, Yue Zhang
Based on our observation, we investigate two approaches to reduce the domain distribution divergence of text and AMR features, respectively.
1 code implementation • 22 Oct 2022 • Songyang Zhang, Linfeng Song, Lifeng Jin, Haitao Mi, Kun Xu, Dong Yu, Jiebo Luo
While previous work focuses on building systems for inducing grammars on text that is well-aligned with video content, we investigate the scenario in which text and video are only in loose correspondence.
1 code implementation • 21 Oct 2022 • Qi Liu, Zihuiwen Ye, Tao Yu, Phil Blunsom, Linfeng Song
We first design a SQL-to-text model conditioned on a sampled goal query, which represents a user's intent, that then converses with a text-to-SQL semantic parser to generate new interactions.
1 code implementation • COLING 2022 • Xuefeng Bai, Linfeng Song, Yue Zhang
However, these models are typically trained on surface dialogue text, thus are proven to be weak in understanding the main semantic meaning of a dialogue context.
1 code implementation • 22 Jun 2022 • Lisa Jin, Linfeng Song, Lifeng Jin, Dong Yu, Daniel Gildea
HCT (i) tags the source string with token-level edit actions and slotted rules and (ii) fills in the resulting rule slots with spans from the dialogue context.
1 code implementation • 1 Jun 2022 • Dingmin Wang, Shengchao Liu, Hanchen Wang, Bernardo Cuenca Grau, Linfeng Song, Jian Tang, Song Le, Qi Liu
Graph Neural Networks (GNNs) are effective tools for graph representation learning.
no code implementations • 27 Apr 2022 • Lifeng Jin, Kun Xu, Linfeng Song, Dong Yu
Approaches for the stance classification task, an important task for understanding argumentation in debates and detecting fake news, have been relying on models which deal with individual debate topics.
1 code implementation • ACL 2021 • Qiankun Fu, Linfeng Song, Wenyu Du, Yue Zhang
Although parsing to Abstract Meaning Representation (AMR) has become very popular and AMR has been shown effective on many sentence-level downstream tasks, little work has studied how to generate AMRs that can represent multi-sentence information.
no code implementations • ACL 2021 • Lemao Liu, Haisong Zhang, Haiyun Jiang, Yangming Li, Enbo Zhao, Kun Xu, Linfeng Song, Suncong Zheng, Botong Zhou, Dick Zhu, Xiao Feng, Tao Chen, Tao Yang, Dong Yu, Feng Zhang, Zhanhui Kang, Shuming Shi
This paper introduces TexSmart, a text understanding system that supports fine-grained named entity recognition (NER) and enhanced semantic analysis functionalities.
1 code implementation • Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence 2021 • An-Hui Wang, Linfeng Song, Hui Jiang, Shaopeng Lai, Junfeng Yao, Min Zhang, Jinsong Su
Conversational discourse structures aim to describe how a dialogue is organised, thus they are helpful for dialogue understanding and response generation.
Ranked #3 on Discourse Parsing on STAC
1 code implementation • Findings (ACL) 2021 • Pei Ke, Haozhe Ji, Yu Ran, Xin Cui, LiWei Wang, Linfeng Song, Xiaoyan Zhu, Minlie Huang
Existing pre-trained models for knowledge-graph-to-text (KG-to-text) generation simply fine-tune text-to-text pre-trained models such as BART or T5 on KG-to-text datasets, which largely ignore the graph structure during encoding and lack elaborate pre-training tasks to explicitly model graph-text alignments.
Ranked #1 on KG-to-Text Generation on WebQuestions
no code implementations • ACL 2021 • Han Wu, Kun Xu, Linfeng Song, Lifeng Jin, Haisong Zhang, Linqi Song
Language models like BERT and SpanBERT pretrained on open-domain data have obtained impressive gains on various NLP tasks.
1 code implementation • ACL 2021 • Xuefeng Bai, Yulong Chen, Linfeng Song, Yue Zhang
Although neural models have achieved competitive results in dialogue systems, they have shown limited ability in representing core semantics, such as ignoring important entities.
Ranked #8 on Dialog Relation Extraction on DialogRE
no code implementations • 11 Apr 2021 • Kun Xu, Han Wu, Linfeng Song, Haisong Zhang, Linqi Song, Dong Yu
Semantic role labeling (SRL) aims to extract the arguments for each predicate in an input sentence.
1 code implementation • NAACL 2021 • Songyang Zhang, Linfeng Song, Lifeng Jin, Kun Xu, Dong Yu, Jiebo Luo
We investigate video-aided grammar induction, which learns a constituency parser from both unlabeled text and its corresponding video.
1 code implementation • 5 Mar 2021 • Jinsong Su, Jialong Tang, Hui Jiang, Ziyao Lu, Yubin Ge, Linfeng Song, Deyi Xiong, Le Sun, Jiebo Luo
In aspect-based sentiment analysis (ABSA), many neural models are equipped with an attention mechanism to quantify the contribution of each context word to sentiment prediction.
1 code implementation • ACL 2020 • Linfeng Song, Ante Wang, Jinsong Su, Yue Zhang, Kun Xu, Yubin Ge, Dong Yu
The task of graph-to-text generation aims at producing sentences that preserve the meaning of input graphs.
Ranked #10 on Data-to-Text Generation on WebNLG
no code implementations • 31 Dec 2020 • Haisong Zhang, Lemao Liu, Haiyun Jiang, Yangming Li, Enbo Zhao, Kun Xu, Linfeng Song, Suncong Zheng, Botong Zhou, Jianchen Zhu, Xiao Feng, Tao Chen, Tao Yang, Dong Yu, Feng Zhang, Zhanhui Kang, Shuming Shi
This technical report introduces TexSmart, a text understanding system that supports fine-grained named entity recognition (NER) and enhanced semantic analysis functionalities.
1 code implementation • 29 Dec 2020 • Jie Hao, Linfeng Song, LiWei Wang, Kun Xu, Zhaopeng Tu, Dong Yu
The task of dialogue rewriting aims to reconstruct the latest dialogue utterance by copying the missing content from the dialogue context.
1 code implementation • EMNLP 2020 • Xuefeng Bai, Linfeng Song, Yue Zhang
AMR-to-text generation aims to recover a text containing the same meaning as an input AMR graph.
Ranked #9 on AMR-to-Text Generation on LDC2017T10
no code implementations • EMNLP 2020 • Kun Xu, Haochen Tan, Linfeng Song, Han Wu, Haisong Zhang, Linqi Song, Dong Yu
For multi-turn dialogue rewriting, the capacity to effectively model the linguistic knowledge in the dialogue context and filter out noise is essential to improving performance.
no code implementations • ACL 2020 • Linfeng Song, Kun Xu, Yue Zhang, Jianshu Chen, Dong Yu
Zero pronoun recovery and resolution aim at recovering the dropped pronoun and pointing out its anaphoric mentions, respectively.
no code implementations • 23 Jan 2020 • Kun Xu, Linfeng Song, Yansong Feng, Yan Song, Dong Yu
Existing entity alignment methods mainly vary on the choices of encoding the knowledge graph, but they typically use the same decoding method, which independently chooses the local optimal match for each source entity.
1 code implementation • 19 Dec 2019 • Jiali Zeng, Linfeng Song, Jinsong Su, Jun Xie, Wei Song, Jiebo Luo
Simile recognition is to detect simile sentences and to extract simile components, i.e., tenors and vehicles.
1 code implementation • 16 Dec 2019 • Yongjing Yin, Linfeng Song, Jinsong Su, Jiali Zeng, Chulun Zhou, Jiebo Luo
Sentence ordering is to restore the original paragraph from a set of sentences.
1 code implementation • IJCNLP 2019 • Linfeng Song, Yue Zhang, Daniel Gildea, Mo Yu, Zhiguo Wang, Jinsong Su
Medical relation extraction discovers relations between entity mentions in text, such as research articles.
no code implementations • WS 2019 • Zhiguo Wang, Yue Zhang, Mo Yu, Wei Zhang, Lin Pan, Linfeng Song, Kun Xu, Yousef El-Kurdi
Self-explaining text categorization requires a classifier to make a prediction along with supporting evidence.
1 code implementation • 13 Jul 2019 • Linfeng Song
How to properly model graphs is a long-standing and important problem in the NLP area, where several popular types of graphs are knowledge graphs, semantic graphs and dependency graphs.
1 code implementation • 20 Jun 2019 • Mengge Xue, Weiming Cai, Jinsong Su, Linfeng Song, Yubin Ge, Yubao Liu, Bin Wang
However, most neural collective EL methods depend entirely upon neural networks to automatically model the semantic dependencies between different EL decisions, and thus lack guidance from external knowledge.
1 code implementation • ACL 2019 • Jialong Tang, Ziyao Lu, Jinsong Su, Yubin Ge, Linfeng Song, Le Sun, Jiebo Luo
In aspect-level sentiment classification (ASC), it is prevalent to equip dominant neural models with attention mechanisms, for the sake of acquiring the importance of each context word on the given aspect.
2 code implementations • ACL 2019 • Linfeng Song, Daniel Gildea
Evaluating AMR parsing accuracy involves comparing pairs of AMR graphs.
Ranked #3 on Graph Matching on RARE
1 code implementation • TACL 2019 • Linfeng Song, Daniel Gildea, Yue Zhang, Zhiguo Wang, Jinsong Su
It is intuitive that semantic representations can be useful for machine translation, mainly because they can help in enforcing meaning preservation and handling data sparsity (many sentences correspond to one meaning) of machine translation models.
no code implementations • WS 2018 • Linfeng Song, Yue Zhang, Daniel Gildea
The task of linearization is to find a grammatical order given a set of words.
no code implementations • EMNLP 2018 • Linfeng Song, Yue Zhang, Zhiguo Wang, Daniel Gildea
Cross-sentence $n$-ary relation extraction detects relations among $n$ entities across multiple sentences.
no code implementations • 6 Sep 2018 • Linfeng Song, Zhiguo Wang, Mo Yu, Yue Zhang, Radu Florian, Daniel Gildea
Multi-hop reading comprehension focuses on one type of factoid question, where a system needs to properly integrate multiple pieces of evidence to correctly answer a question.
Ranked #2 on Question Answering on COMPLEXQUESTIONS
2 code implementations • 28 Aug 2018 • Linfeng Song, Yue Zhang, Zhiguo Wang, Daniel Gildea
Cross-sentence $n$-ary relation extraction detects relations among $n$ entities across multiple sentences.
1 code implementation • ACL 2018 • Xiaochang Peng, Linfeng Song, Daniel Gildea, Giorgio Satta
In this paper, we present a sequence-to-sequence based approach for mapping natural language sentences to AMR semantic graphs.
1 code implementation • NAACL 2018 • Linfeng Song, Zhiguo Wang, Wael Hamza, Yue Zhang, Daniel Gildea
The task of natural question generation is to generate a corresponding question given the input passage (fact) and answer.
Ranked #11 on Question Generation on SQuAD1.1
1 code implementation • ACL 2018 • Linfeng Song, Yue Zhang, Zhiguo Wang, Daniel Gildea
The problem of AMR-to-text generation is to recover a text representing the same meaning as an input AMR graph.
Ranked #1 on Graph-to-Sequence on LDC2015E86: (using extra training data)
2 code implementations • ACL 2018 • Yue Zhang, Qi Liu, Linfeng Song
Bi-directional LSTMs are a powerful tool for text representation.
Ranked #10 on Part-Of-Speech Tagging on Penn Treebank
no code implementations • 4 Sep 2017 • Linfeng Song, Zhiguo Wang, Wael Hamza
In the QG task, a question is generated by the system given the passage and the target answer, whereas in the QA task, the answer is generated given the question and the passage.
no code implementations • 25 Aug 2017 • Zhiguo Wang, Wael Hamza, Linfeng Song
However, it lacks the capacity of utilizing instance-level information from individual instances in the training set.
no code implementations • ACL 2017 • Linfeng Song, Xiaochang Peng, Yue Zhang, Zhiguo Wang, Daniel Gildea
This paper addresses the task of AMR-to-text generation by leveraging synchronous node replacement grammar.
no code implementations • 12 Oct 2016 • Linfeng Song, Lin Zhao
Question generation from a knowledge base (KB) is the task of generating questions related to the domain of the input KB.
no code implementations • EMNLP 2016 • Linfeng Song, Yue Zhang, Xiaochang Peng, Zhiguo Wang, Daniel Gildea
The task of AMR-to-text generation is to generate grammatical text that preserves the semantic meaning of a given AMR graph.
no code implementations • SEMEVAL 2016 • Linfeng Song, Zhiguo Wang, Haitao Mi, Daniel Gildea
In the training stage, our method induces several sense centroids (embeddings) for each polysemous word.
Ranked #4 on Word Sense Induction on SemEval 2010 WSI