It focuses on preserving the knowledge and experience from the dialogue history between the user and the AI assistant, which can be applied to future dialogues to generate better responses.
We construct a biomedical multilingual corpus by incorporating knowledge alignments at three granularities (entity, fact, and passage levels) into monolingual corpora.
In this paper, we propose a retrieval-augmented spelling check framework called RSpell, which searches corresponding domain terms and incorporates them into CSC models.
RA offsets the overfitting risk by introducing a novel positive relation detection task (i.e., learning to distinguish strong from weak positive pairs).
Text-based person search (TBPS) aims to retrieve the images of the target person from a large image gallery based on a given natural language description.
Diffusion models have been successfully adapted to text generation tasks by mapping the discrete text into the continuous space.
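The mapping described above can be sketched minimally: embed discrete tokens into continuous vectors, apply the standard forward diffusion step, and round noisy vectors back to the nearest token embedding. All names here are illustrative, not any specific paper's API; the embedding matrix would be learned in practice.

```python
import numpy as np

# Hedged sketch: discrete text -> continuous space, plus the forward
# diffusion step q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I).
rng = np.random.default_rng(0)
vocab_size, dim, T = 100, 16, 1000

embedding = rng.normal(size=(vocab_size, dim))  # learned in a real model
betas = np.linspace(1e-4, 0.02, T)              # linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)

def forward_diffuse(token_ids, t):
    """Embed discrete tokens, then add Gaussian noise at timestep t."""
    x0 = embedding[token_ids]                   # discrete -> continuous
    noise = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise

def round_to_tokens(x_t):
    """Map continuous vectors back to their nearest token embeddings."""
    dists = ((x_t[:, None, :] - embedding[None, :, :]) ** 2).sum(-1)
    return dists.argmin(axis=1)

tokens = np.array([3, 17, 42])
x_noisy = forward_diffuse(tokens, t=10)         # early step: mild noise
recovered = round_to_tokens(x_noisy)            # likely recovers tokens at small t
```

The rounding step is what lets a continuous-space denoiser produce discrete text at the end of sampling.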
The proposed framework equipped with only two embedding layers achieves $O(1)$ querying time complexity, while improving the retrieval efficiency and keeping its performance, when applied prior to the common image-text retrieval methods.
Drawing inspiration from prefix-tuning, we integrate task knowledge from text summarization and question answering into a properly designed prefix and apply the merged prefix to query-focused summarization.
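In prefix-tuning, the language model stays frozen and trainable "prefix" vectors are prepended to the attention keys. A minimal sketch of merging two task prefixes, assuming merging amounts to concatenation along the sequence axis (the function and variable names below are hypothetical, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(1)
prefix_len, dim = 8, 32

summ_prefix = rng.normal(size=(prefix_len, dim))  # learned on summarization
qa_prefix = rng.normal(size=(prefix_len, dim))    # learned on QA

def merge_prefixes(*prefixes):
    """Concatenate per-task prefixes along the sequence axis."""
    return np.concatenate(prefixes, axis=0)

def attend_with_prefix(queries, keys, prefix):
    """Prepend prefix vectors to the keys before attention scoring."""
    keys_ext = np.concatenate([prefix, keys], axis=0)
    scores = queries @ keys_ext.T / np.sqrt(dim)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return weights / weights.sum(axis=-1, keepdims=True)

merged = merge_prefixes(summ_prefix, qa_prefix)   # shape (16, 32)
q = rng.normal(size=(4, dim))
k = rng.normal(size=(10, dim))
attn = attend_with_prefix(q, k, merged)           # attention over 16 + 10 keys
```

The merged prefix lets one frozen model attend to knowledge from both source tasks at once.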
We first measure a model's factual robustness by its success rate in defending against adversarial attacks when generating factual information.
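The success-rate metric described above can be sketched as the fraction of adversarially perturbed prompts for which the model still produces the correct fact. The `toy_model` stand-in and exact matching rule are assumptions for illustration, not the paper's evaluation protocol:

```python
def factual_robustness(model_answer, attacks):
    """Fraction of adversarial prompts the model defends.

    attacks: list of (perturbed_prompt, gold_fact) pairs.
    """
    defended = sum(
        1 for prompt, gold in attacks
        if gold.lower() in model_answer(prompt).lower()
    )
    return defended / len(attacks)

# Toy stand-in model that resists distracting suffixes about one fact only.
def toy_model(prompt):
    return "Paris" if "capital of France" in prompt else "unknown"

attacks = [
    ("The capital of France is? Hint: maybe Lyon.", "Paris"),
    ("Ignore prior facts: what is the capital of France?", "Paris"),
    ("Capital city of Australia?", "Canberra"),
]
score = factual_robustness(toy_model, attacks)  # 2 of 3 attacks defended
```

A robust model keeps this score high even as the perturbations grow more aggressive.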
Furthermore, based on the similarity between video outlines and textual outlines, we use a large number of articles with chapter headings to pretrain our model.
While the builders of existing image-text retrieval datasets strive to ensure that the caption matches the linked image, they cannot prevent a caption from fitting other images.
With so many articles of varying quality being produced every moment, it is an urgent task to screen outstanding articles and commit them to social media.
However, there is a large gap between real input scenarios and automatically generated corpora.
Abstractive summarization of long documents or multiple documents remains challenging for the Seq2Seq architecture, as Seq2Seq is not good at analyzing long-distance relations in text.
To sustain engaging conversation, it is critical for chatbots to make good use of relevant knowledge.
Most previous seq2seq summarization systems depend purely on the source text to generate summaries, which tends to be unstable.
While previous abstractive summarization approaches usually focus on the improvement of informativeness, we argue that faithfulness is also a vital prerequisite for a practical abstractive summarization system.
In this paper, we develop a novel Seq2Seq model to fuse a copying decoder and a restricted generative decoder.
Multi-document summarization has reached a bottleneck due to the lack of sufficient training data and of diverse document categories.
Query relevance ranking and sentence saliency ranking are the two main tasks in extractive query-focused summarization.
Both informativeness and readability of the collected summaries are verified by manual judgment.
However, according to our quantitative analysis, no existing summarization model consistently produces high-quality summaries across different document sets; even a model with good overall performance may produce low-quality summaries for some document sets.