Multi-turn conversational Question Answering (ConvQA) is a practical task that requires understanding the conversation history (e.g., previous QA pairs), the passage context, and the current question.
Transferring knowledge from a large model to a smaller one through distillation has attracted great interest in recent years.
To enable the chatbot to foresee the dialogue future, we design a beam-search-like roll-out strategy that simulates future dialogue turns with a typical dialogue generation model and a dialogue selector.
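The roll-out idea can be sketched as follows. This is a minimal illustration, not the paper's method: `generate_candidates` and `score_dialogue` below are hypothetical toy stand-ins for the dialogue generation model and the dialogue selector.

```python
def rollout(history, generate_candidates, score_dialogue, depth=2, beam_size=2):
    """Beam-search-like roll-out: simulate `depth` future turns, keeping only
    the `beam_size` highest-scoring simulated dialogues after each turn."""
    beams = [history]
    for _ in range(depth):
        expanded = []
        for dialog in beams:
            for utterance in generate_candidates(dialog):
                expanded.append(dialog + [utterance])
        # the selector ranks the simulated futures; keep the best few
        expanded.sort(key=score_dialogue, reverse=True)
        beams = expanded[:beam_size]
    return beams

# Toy stand-ins for the generator and the selector (illustrative only).
gen = lambda d: [f"reply{len(d)}a", f"reply{len(d)}b"]
score = lambda d: -len(d[-1])  # toy heuristic: prefer shorter last utterances
futures = rollout(["hello"], gen, score)
```

Each returned "future" is the original history extended by `depth` simulated turns, so the selector can judge a candidate reply by where the conversation is likely to go.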
Empirically, we show that (a) the dominant winning ticket can achieve performance comparable with that of the full-parameter model, (b) the dominant winning ticket is transferable across different tasks, and (c) the dominant winning ticket has a natural structure within each parameter matrix.
Recent efforts have made great progress in tracking multiple entities in a procedural text, but they usually treat each entity separately, ignoring the fact that multiple entities often interact with one another during a process; some of these interactions are even explicitly mentioned.
(2) generate a post including the selected products via MGenNet (Multi-Generator Network).
Hence, in this paper, we introduce a combination of curriculum learning and knowledge distillation for efficient dialogue generation models, where curriculum learning aids knowledge distillation from both the data and model perspectives.
Story generation is the challenging task of automatically creating natural language to describe a sequence of events, which requires producing text with not only a consistent topic but also novel wording.
Contrastive learning has achieved impressive success in generation tasks by mitigating the "exposure bias" problem and discriminatively exploiting references of different quality.
To enhance the performance of dense retrieval models without loss of efficiency, we propose a GNN-encoder model in which query (passage) information is fused into passage (query) representations via graph neural networks that are constructed by queries and their top retrieved passages.
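The fusion idea can be illustrated with a minimal sketch, assuming a simple mean-aggregation message-passing step over a bipartite query-passage graph built from each query's top retrieved passages; the paper's actual GNN layer, gating, and training objective are not reproduced here.

```python
import numpy as np

def gnn_fuse(Q, P, edges, alpha=0.5):
    """One mean-aggregation message-passing step on a bipartite graph.

    Q: query vectors (n_q x d); P: passage vectors (n_p x d);
    edges: (query_idx, passage_idx) pairs from top-k retrieval.
    Each node mixes its own vector with the mean of its neighbors',
    fusing passage info into queries and query info into passages."""
    Q_new, P_new = Q.copy(), P.copy()
    for i in range(len(Q)):
        nbrs = [P[j] for (q, j) in edges if q == i]
        if nbrs:
            Q_new[i] = alpha * Q[i] + (1 - alpha) * np.mean(nbrs, axis=0)
    for j in range(len(P)):
        nbrs = [Q[i] for (q, j2) in edges if j2 == j for i in [q]]
        if nbrs:
            P_new[j] = alpha * P[j] + (1 - alpha) * np.mean(nbrs, axis=0)
    return Q_new, P_new

# Toy example: one query linked to one retrieved passage.
Q = np.array([[1.0, 0.0]])
P = np.array([[0.0, 1.0]])
Qf, Pf = gnn_fuse(Q, P, [(0, 0)])
```

After one step, the query and passage representations each contain half of the other's signal, which is the intuition behind enriching both sides without extra work at query time.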
Grounding dialogue generation in external knowledge has shown great potential for building systems capable of replying with knowledgeable and engaging responses.
Typed entailment graphs learn entailment relations between predicates from text and model them as edges between predicate nodes.
We probe PLMs and models with visual signals, including vision-language pretrained models and image synthesis models, on this benchmark, and find that image synthesis models are more capable of learning accurate and consistent spatial knowledge than other models.
Most CQA methods incorporate only articles or Wikipedia to extract knowledge and answer users' questions.
Continual learning (CL) of a sequence of tasks is often accompanied by the catastrophic forgetting (CF) problem.
Since there is no parallel data between the contexts and the responses of target style S1, existing works mainly use back-translation to generate stylized synthetic data for training, where data about the context, the target style S1, and an intermediate style S0 is used.
In this paper, we present a new verification-style reading comprehension dataset named VGaokao, built from Chinese language tests of the Gaokao.
To tackle these challenges, we propose a representation-interaction-matching framework that explores multiple types of deep interactive representations to build context-response matching models for response selection.
Hence, in this paper, we propose a Relation-aware Related work Generator (RRG), which generates an abstractive related-work section from multiple given scientific papers in the same research area.
Document-level Relation Extraction (RE) is a more challenging task than sentence RE as it often requires reasoning over multiple sentences.
Ranked #44 on Relation Extraction on DocRED
Recent studies strive to incorporate various human rationales into neural networks to improve model performance, but few pay attention to the quality of the rationales.
A thorough empirical analysis shows that MRC models tend to learn shortcut questions earlier than challenging questions, and the high proportions of shortcut questions in training sets hinder models from exploring the sophisticated reasoning skills in the later stage of training.
Sequential information, a.k.a. word order, is assumed to be essential for processing a sequence with recurrent neural network or convolutional neural network based encoders.
Further analysis shows that Lattice-BERT can harness the lattice structures, and the improvement comes from the exploration of redundant information and multi-granularity representations.
To fill the gap between these up-to-date methods and the real-world applications, we incorporate user-specific dialogue history into the response selection and propose a personalized hybrid matching network (PHMN).
Automatically identifying fake news on the Internet is a challenging deception detection problem.
Hence, in this paper, we propose to improve the response generation performance by examining the model's ability to answer a reading comprehension question, where the question is focused on the omitted information in the dialog.
In this paper, we propose a Disentanglement-based Attractive Headline Generator (DAHG) that generates headlines capturing attractive content in an attractive style.
Understanding neural models is a major topic of interest in the deep learning community.
To generate more meaningful answers, in this paper, we propose a novel generative neural model, called the Meaningful Product Answer Generator (MPAG), which alleviates the safe answer problem by taking product reviews, product attributes, and a prototype answer into consideration.
Hence, in this paper, we propose to recommend an appropriate sticker to the user based on the multi-turn dialog context and the user's sticker usage history.
Code comments are vital for software maintenance and comprehension, but many software projects suffer from the lack of meaningful and up-to-date comments in practice.
We study knowledge-grounded dialogue generation with pre-trained language models.
Hence, in this paper, we propose the task of Video-based Multimodal Summarization with Multimodal Output (VMSMO) to tackle such a problem.
To address these issues, in this paper, we propose learning a context-response matching model with auxiliary self-supervised tasks designed for the dialogue data based on pre-trained language models.
Ranked #2 on Conversational Response Selection on E-commerce
In this paper, we propose a novel semantic parser for domain adaptation, where we have much fewer annotated data in the target domain compared to the source domain.
We manually collect a new and high-quality paired dataset, where each pair contains an unordered product attribute set in the source language and an informative product description in the target language.
This paper presents Neighborhood Matching Network (NMN), a novel entity alignment framework for tackling the structural heterogeneity challenge.
Text summarization is the research area aiming at creating a short and condensed version of the original document, which conveys the main idea of the document in a few words.
Stickers with vivid and engaging expressions are becoming increasingly popular in online messaging apps, and some works automatically select a sticker response by matching the text labels of stickers against previous utterances.
In such a low-resource setting, we devise a disentangled response decoder in order to isolate parameters that depend on knowledge-grounded dialogues from the entire generation model.
However, such models often make predictions for each entity pair individually, and thus often fail to resolve inconsistencies among different predictions, which can be characterized by discrete relation constraints.
The key idea of the proposed approach is to use a Forward Transformation to transform dense representations to sparse representations.
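The Forward Transformation itself is not specified in this excerpt. As a purely illustrative stand-in, one simple way to map a dense vector to a sparse one is top-k magnitude thresholding; the paper's actual transformation is likely learned and may differ substantially.

```python
import numpy as np

def to_sparse(dense, k=2):
    """Illustrative dense-to-sparse transform: keep only the k
    largest-magnitude components and zero out the rest."""
    out = np.zeros_like(dense)
    idx = np.argsort(np.abs(dense))[-k:]  # indices of the k largest magnitudes
    out[idx] = dense[idx]
    return out

v = np.array([0.1, -0.9, 0.05, 0.7])
s = to_sparse(v, k=2)  # only the -0.9 and 0.7 components survive
```

Sparse representations of this form support inverted-index-style lookup, which is the usual efficiency motivation for moving from dense to sparse vectors.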
Information-seeking conversation systems aim to satisfy users' information needs through conversation.
We study how to sample negative examples to automatically construct a training set for effective model learning in retrieval-based dialogue systems.
In this work, we improve the WAE for response generation.
Different from other text generation tasks, in product description generation, it is of vital importance to generate faithful descriptions that stick to the product attribute information.
Previous research on dialogue systems generally focuses on the conversation between two participants, yet multi-party conversations which involve more than two participants within one session bring up a more complicated but realistic scenario.
Sponsored search optimizes revenue and relevance, which is estimated by Revenue Per Mille (RPM).
Existing dialog systems are all monolingual, and features shared among different languages are rarely explored.
Positive-unlabeled (PU) learning learns a binary classifier using only positive and unlabeled examples without labeled negative examples.
The text style transfer task requires a model to transfer a sentence of one style to another while retaining its original content, a challenging problem that has long suffered from a shortage of parallel data.
Entity alignment is a viable means for integrating heterogeneous knowledge among different knowledge graphs (KGs).
Ranked #10 on Entity Alignment on DBP15k zh-en (using extra training data)
There are two main challenges in this task: (1) the model needs to incorporate learned patterns from the prototype, but (2) should avoid copying contents other than the patternized words---such as irrelevant facts---into the generated summaries.
Entity alignment is the task of linking entities with the same real-world identity from different knowledge graphs (KGs), which has been recently dominated by embedding-based methods.
Ranked #12 on Entity Alignment on DBP15k zh-en (using extra training data)
Timeline summarization aims to concisely summarize an evolution trajectory along a timeline, and existing timeline summarization approaches are all extractive. In this paper, we propose the task of abstractive timeline summarization, which concisely paraphrases the information in time-stamped events. Unlike traditional document summarization, timeline summarization needs to model the time-series information of the input events and summarize important events in chronological order. To tackle this challenge, we propose a memory-based timeline summarization model (MTS). Concretely, we propose a time-event memory to establish a timeline and use the time positions of events on this timeline to guide the generation process. Besides, in each decoding step, we incorporate event-level information into word-level attention to avoid confusion between events. Extensive experiments are conducted on a large-scale real-world dataset, and the results show that MTS achieves state-of-the-art performance in terms of both automatic and human evaluations.
Ranked #1 on Timeline Summarization on MTS
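The idea of folding event-level information into word-level attention can be illustrated schematically. The scores below are toy inputs, not the model's learned attention; the sketch only shows the combination step, where each word's weight is scaled by the weight of the event it belongs to.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def hierarchical_attention(word_scores, event_scores, event_ids):
    """Scale each word's attention weight by the weight of its event,
    then renormalize, so words from salient events dominate decoding."""
    w = softmax(np.asarray(word_scores, float))   # word-level attention
    e = softmax(np.asarray(event_scores, float))  # event-level attention
    combined = w * e[np.asarray(event_ids)]
    return combined / combined.sum()

# Two events; words 0-1 belong to event 0, words 2-3 to event 1.
attn = hierarchical_attention([1.0, 1.0, 1.0, 1.0], [2.0, 0.0], [0, 0, 1, 1])
```

With uniform word scores, the combined attention simply follows the event weights, so words from the highly-weighted event receive most of the mass.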
Currently, researchers pay great attention to open-domain retrieval-based dialogue.
Ranked #8 on Conversational Response Selection on Douban
Due to its potential applications, open-domain dialogue generation has become popular and achieved remarkable progress in recent years, but sometimes suffers from generic responses.
Under the framework, we simultaneously learn two matching models with independent training sets.
We present a document-grounded matching network (DGMN) for response selection that can power a knowledge-aware retrieval-based chatbot system.
Existing neural models for dialogue response generation assume that utterances are sequentially organized.
A large amount of parallel data is needed to train a strong neural machine translation (NMT) system.
Short text matching often faces the challenges of great word mismatch and expression diversity between the two texts, which are further aggravated in languages like Chinese, where there is no natural space to segment words explicitly.
In this paper, we propose the task of product-aware answer generation, which aims to generate an accurate and complete answer from large-scale unlabeled e-commerce reviews and product attributes.
Ranked #1 on Question Answering on JD Product Question Answer
To tackle this problem, we propose the task of reader-aware abstractive summary generation, which utilizes reader comments to help the model produce a better summary of the main aspect.
Ranked #1 on Reader-Aware Summarization on RASG
In this paper, we incorporate the logic information with the help of the Natural Language Inference (NLI) task to the Story Cloze Test (SCT).
Building an open-domain multi-turn conversation system is one of the most interesting and challenging tasks in Artificial Intelligence.
Automatic storytelling is challenging since it requires generating long, coherent natural language to describe a sensible sequence of events.
Relation extraction is the task of identifying predefined relationships between entities, and plays an essential role in information extraction, knowledge base construction, question answering, and so on.
We propose a fine-grained attention mechanism, which can capture the word-level interaction between aspect and context.
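A minimal sketch of word-level aspect-context interaction follows; dot-product scoring stands in for whatever learned scoring function the paper actually uses.

```python
import numpy as np

def fine_grained_attention(aspect, context):
    """Score every (aspect word, context word) pair, then let each
    aspect word attend over the context words."""
    # interaction matrix S[i, j] = aspect_i . context_j
    S = aspect @ context.T
    # row-wise softmax: each aspect word's attention over context words
    e = np.exp(S - S.max(axis=1, keepdims=True))
    A = e / e.sum(axis=1, keepdims=True)
    # context summary vector for each aspect word
    return A @ context

asp = np.random.rand(2, 4)   # 2 aspect words, dimension 4
ctx = np.random.rand(5, 4)   # 5 context words, dimension 4
summary = fine_grained_attention(asp, ctx)
```

The key point is that attention operates on individual word pairs rather than on pooled aspect and context vectors, so each aspect word can pick out the context words most relevant to it.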
It is a challenging task to automatically compose poems with not only fluent expressions but also aesthetic wording.
In this paper, we introduce Iterative Text Summarization (ITS), an iteration-based model for supervised extractive text summarization, inspired by the observation that it is often necessary for a human to read an article multiple times in order to fully understand and summarize its contents.
Ranked #13 on Extractive Text Summarization on CNN / Daily Mail
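The multiple-readings intuition behind ITS can be caricatured as an iterative re-scoring loop. This greedy residual-coverage sketch is illustrative only, not the paper's architecture: each pass re-scores sentences against whatever document content the current summary has not yet covered.

```python
import numpy as np

def iterative_extract(sent_vecs, doc_vec, n_iters=3, k=2):
    """Illustrative iterative extraction: on each pass, pick the sentence
    best matching the not-yet-covered document content, then subtract its
    vector from the residual, mimicking repeated readings."""
    residual = doc_vec.copy()
    selected = []
    for _ in range(n_iters):
        scores = sent_vecs @ residual
        scores[selected] = -np.inf            # never re-pick a sentence
        best = int(np.argmax(scores))
        selected.append(best)
        residual = residual - sent_vecs[best]  # downweight covered content
        if len(selected) >= k:
            break
    return sorted(selected)

# Toy: three orthogonal "sentences"; the document emphasizes the first two.
picks = iterative_extract(np.eye(3), np.array([3.0, 2.0, 1.0]))
```

Subtracting the chosen sentence's vector from the residual is one simple way to encode "what a second reading should focus on"; the actual model learns this behavior rather than hard-coding it.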
In this paper, we study context-response matching with pre-trained contextualized representations for multi-turn response selection in retrieval-based chatbots.
Identifying long-span dependencies between discourse units is crucial to improve discourse parsing performance.
The success of many natural language processing (NLP) tasks is bound by the number and quality of annotated data, but there is often a shortage of such training data.
Automatically evaluating the performance of open-domain dialogue systems is a challenging problem.
Human-computer conversation systems have attracted much attention in Natural Language Processing.
Traditional recurrent neural network (RNN) or convolutional neural network (CNN) based sequence-to-sequence models cannot handle tree-structured data well.
In this paper, we propose a new question generation problem, which also requires a target topic as input in addition to a piece of descriptive text.
We show that this large volume of training data not only leads to a better event extractor, but also allows us to detect multiple typed events.
Results show that the proposed content preservation metric correlates highly with human judgments, and the proposed models generate sentences with higher style transfer strength and comparable content preservation scores relative to an auto-encoder.
Ranked #4 on Unsupervised Text Style Transfer on Yelp
However, traditional seq2seq models suffer from a severe weakness: during beam search decoding, they tend to rank universal replies at the top of the candidate list, resulting in a lack of diversity among candidate replies.
The charge prediction task is to determine appropriate charges for a given case, which is helpful for legal assistant systems where the user input is a fact description.
Generative conversational systems are attracting increasing attention in natural language processing (NLP).
We show that the dynamic transition matrix can effectively characterize the noise in the training data built by distant supervision.
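The transition-matrix idea can be sketched as follows. In the paper the matrix is predicted dynamically per instance, whereas this toy uses a single fixed T for illustration.

```python
import numpy as np

def noisy_prediction(clean_probs, T):
    """Model the observed (noisy) label distribution as the clean
    prediction passed through a row-stochastic transition matrix T,
    where T[i, j] = P(noisy label j | true label i)."""
    return clean_probs @ T

# Toy: 2 relation types; distant supervision mislabels
# relation 0 as relation 1 with probability 0.3.
T = np.array([[0.7, 0.3],
              [0.1, 0.9]])
clean = np.array([0.8, 0.2])   # model's belief over the true label
noisy = noisy_prediction(clean, T)
```

Training matches `noisy` against the distantly-supervised labels, so the classifier underneath can learn the clean distribution while T absorbs the systematic labeling noise.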
In word-level studies, words are simplified but may carry grammatical errors due to the different usages of words before and after simplification.
Open-domain human-computer conversation has been attracting increasing attention over the past few years.
While these systems are able to provide more precise answers than information retrieval (IR) based QA systems, the natural incompleteness of KB inevitably limits the question scope that the system can answer.
In this paper, we propose a novel ensemble of retrieval-based and generation-based dialog systems in the open domain.
Existing knowledge-based question answering systems often rely on small annotated training data.
Syntactic features play an essential role in identifying relationships in a sentence.
Ranked #3 on Relation Classification on SemEval 2010 Task 8