35 papers with code • 2 benchmarks • 11 datasets
We present iNLTK, an open-source NLP library consisting of pre-trained language models and out-of-the-box support for Data Augmentation, Textual Similarity, Sentence Embeddings, Word Embeddings, Tokenization and Text Generation in 13 Indic Languages.
To the best of our knowledge, this work is the first to explore deep learning models for paraphrase generation.
Modern NLP defines the task of style transfer as modifying the style of a given sentence without appreciably changing its semantics, which implies that the outputs of style transfer systems should be paraphrases of their inputs.
Query and Output: Generating Words by Querying Distributed Word Representations for Paraphrase Generation
The existing sequence-to-sequence model tends to memorize the words and the patterns in the training dataset instead of learning the meaning of the words.
Generating paraphrases from given sentences involves decoding words step by step from a large vocabulary.
Paraphrasing natural language sentences is a multifaceted process: it might involve replacing individual words or short phrases, local rearrangement of content, or high-level restructuring like topicalization or passivization.
Paraphrase Generation as Zero-Shot Multilingual Translation: Disentangling Semantic Similarity from Lexical and Syntactic Diversity
Recent work has shown that a multilingual neural machine translation (NMT) model can be used to judge how well a sentence paraphrases another sentence in the same language (Thompson and Post, 2020); however, attempting to generate paraphrases from such a model using standard beam search produces trivial copies or near copies.
One way to ensure this is by adding constraints for true paraphrase embeddings to be close and unrelated paraphrase candidate sentence embeddings to be far.