Question Generation

120 papers with code • 6 benchmarks • 17 datasets

The goal of Question Generation is to generate a valid and fluent question given a passage and a target answer. Question Generation has many applications, such as automatic tutoring systems, improving the performance of Question Answering models, and enabling chatbots to lead a conversation.

Source: Generating Highly Relevant Questions
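As a concrete illustration of the task, here is a minimal answer-aware question generation sketch using Hugging Face transformers. The checkpoint name and the `<hl>` highlight-token input format are assumptions based on a common community fine-tune, not part of any paper listed below.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumed checkpoint: a T5 model fine-tuned for answer-aware question
# generation; "valhalla/t5-base-qg-hl" is one community model that
# follows this highlight-token convention.
model_name = "valhalla/t5-base-qg-hl"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

passage = "The Amazon rainforest produces about 20 percent of Earth's oxygen."
answer = "20 percent"
# Mark the target answer with <hl> tokens so the model knows what to ask about.
text = passage.replace(answer, f"<hl> {answer} <hl>")
inputs = tokenizer(f"generate question: {text}", return_tensors="pt")

output_ids = model.generate(**inputs, max_new_tokens=32, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
# e.g. "How much of Earth's oxygen does the Amazon rainforest produce?"
```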

Greatest papers with code

ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training

huggingface/transformers 13 Jan 2020

This paper presents a new sequence-to-sequence pre-training model called ProphetNet, which introduces a novel self-supervised objective named future n-gram prediction and the proposed n-stream self-attention mechanism.

Ranked #5 on Text Summarization on GigaWord (using extra training data)

Abstractive Text Summarization • Question Generation
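A hedged usage sketch: ProphetNet ships in huggingface/transformers, and a SQuAD question-generation fine-tune is assumed to be available on the Hub under the name below; the `answer [SEP] passage` input layout is that checkpoint's convention, not a transformers requirement.

```python
from transformers import ProphetNetForConditionalGeneration, ProphetNetTokenizer

# Assumed checkpoint: ProphetNet fine-tuned for SQuAD question generation.
name = "microsoft/prophetnet-large-uncased-squad-qg"
tokenizer = ProphetNetTokenizer.from_pretrained(name)
model = ProphetNetForConditionalGeneration.from_pretrained(name)

answer = "Bill Gates"
passage = "Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975."
inputs = tokenizer(f"{answer} [SEP] {passage}", return_tensors="pt")

# During pre-training ProphetNet predicts the next n tokens at each step
# (future n-gram prediction); at inference it decodes one token at a time.
ids = model.generate(**inputs, num_beams=5, max_length=32, early_stopping=True)
print(tokenizer.decode(ids[0], skip_special_tokens=True))
```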

ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation

PaddlePaddle/ERNIE 26 Jan 2020

Current pre-training works in natural language generation pay little attention to the problem of exposure bias on downstream tasks.

Ranked #1 on Generative Question Answering on CoQA (using extra training data)

Abstractive Text Summarization • Dialogue Generation +3
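Exposure bias arises because training always conditions the decoder on gold prefixes, while inference conditions it on the model's own (possibly erroneous) outputs. ERNIE-GEN addresses this with an infilling generation mechanism and noise-aware training; the sketch below shows only the generic idea of corrupting decoder inputs during training, not ERNIE-GEN's exact procedure.

```python
import torch

def corrupt_decoder_inputs(target_ids: torch.Tensor,
                           vocab_size: int,
                           noise_prob: float = 0.1) -> torch.Tensor:
    """Randomly replace a fraction of gold decoder-input tokens with random
    vocabulary ids, so the model learns to condition on imperfect prefixes
    (one generic remedy for exposure bias)."""
    noise_mask = torch.rand_like(target_ids, dtype=torch.float) < noise_prob
    random_ids = torch.randint_like(target_ids, high=vocab_size)
    return torch.where(noise_mask, random_ids, target_ids)

# Usage inside a training step; the loss is still computed against the
# uncorrupted targets:
#   decoder_input_ids = corrupt_decoder_inputs(shifted_targets, vocab_size)
```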

s2s-ft: Fine-Tuning Pretrained Transformer Encoders for Sequence-to-Sequence Learning

microsoft/unilm 26 Oct 2021

Pretrained bidirectional Transformers such as BERT have achieved significant improvements on a wide variety of language understanding tasks, but it is not straightforward to apply them directly to natural language generation.

Abstractive Text Summarization • Fine-tuning +3
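s2s-ft reuses a pretrained bidirectional encoder as a single-stack sequence-to-sequence model by changing only the self-attention mask: source tokens attend bidirectionally to the source, while target tokens attend to the full source plus their already-generated prefix. A minimal sketch of that mask (illustrative, not code from microsoft/unilm):

```python
import torch

def seq2seq_attention_mask(src_len: int, tgt_len: int) -> torch.Tensor:
    """Return a (src_len+tgt_len) x (src_len+tgt_len) mask where
    entry [i, j] = 1 means position i may attend to position j."""
    total = src_len + tgt_len
    mask = torch.zeros(total, total, dtype=torch.long)
    mask[:, :src_len] = 1                       # everyone sees the full source
    causal = torch.tril(torch.ones(tgt_len, tgt_len, dtype=torch.long))
    mask[src_len:, src_len:] = causal           # target sees its own prefix only
    return mask                                 # source never peeks at the target

print(seq2seq_attention_mask(3, 2))
```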

Synthetic Data Augmentation for Zero-Shot Cross-Lingual Question Answering

microsoft/unilm EMNLP 2021

Coupled with the availability of large-scale datasets, deep learning architectures have enabled rapid progress on the Question Answering task.

Cross-Lingual Question Answering • Data Augmentation +1
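The general recipe, of which this paper is one instance: generate synthetic question-answer pairs and add the reliable ones to the training set. Below is a hedged sketch of round-trip filtering, where a QA model must recover the answer a question was generated for; the `qg_model`/`qa_model` interfaces and the F1 threshold are hypothetical, not this paper's exact pipeline.

```python
def token_f1(pred: str, gold: str) -> float:
    """Token-level F1, as in SQuAD-style evaluation."""
    p, g = pred.lower().split(), gold.lower().split()
    common = sum(min(p.count(t), g.count(t)) for t in set(p) & set(g))
    if not common:
        return 0.0
    precision, recall = common / len(p), common / len(g)
    return 2 * precision * recall / (precision + recall)

def augment(passages, qg_model, qa_model, f1_threshold=0.8):
    """Create synthetic QA pairs and keep only those a QA model can
    answer consistently (round-trip filtering)."""
    synthetic = []
    for passage in passages:
        # Hypothetical interfaces: answer extraction + question generation.
        for answer in qg_model.extract_candidate_answers(passage):
            question = qg_model.generate_question(passage, answer)
            predicted = qa_model.answer(question, passage)
            if token_f1(predicted, answer) >= f1_threshold:
                synthetic.append({"context": passage,
                                  "question": question,
                                  "answer": answer})
    return synthetic
```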

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training

microsoft/unilm 28 Feb 2020

We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks using a novel training procedure, referred to as a pseudo-masked language model (PMLM).

Ranked #3 on Question Generation on SQuAD1.1 (using extra training data)

Abstractive Text Summarization • Language Modelling +3
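A heavily simplified sketch of how a pseudo-masked input might be assembled: masked positions are replaced by [M] tokens for the autoencoding objective, while appended [P] pseudo-tokens (reusing the span's position ids) serve the partially autoregressive objective. The real PMLM also relies on span-specific attention masks, omitted here.

```python
def build_pmlm_input(tokens, masked_spans):
    """Assemble a pseudo-masked LM input (simplified sketch).

    For each masked span we (a) replace the span with [M] tokens in place,
    (b) append [P] pseudo-tokens that reuse the span's position ids, and
    (c) append the gold tokens with the same position ids so that later
    spans can condition on them in the partially autoregressive factorization.
    """
    input_tokens = list(tokens)
    position_ids = list(range(len(tokens)))
    for start, end in masked_spans:                 # spans are [start, end)
        for i in range(start, end):
            input_tokens[i] = "[M]"                 # autoencoding slot
        input_tokens += ["[P]"] * (end - start)     # pseudo-mask slots
        position_ids += list(range(start, end))
        input_tokens += tokens[start:end]           # gold tokens, same positions
        position_ids += list(range(start, end))
    return input_tokens, position_ids

tokens = ["x1", "x2", "x3", "x4", "x5"]
print(build_pmlm_input(tokens, [(1, 2), (3, 5)]))
```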

Unified Language Model Pre-training for Natural Language Understanding and Generation

microsoft/unilm NeurIPS 2019

This paper presents a new Unified pre-trained Language Model (UniLM) that can be fine-tuned for both natural language understanding and generation tasks.

Ranked #2 on Generative Question Answering on CoQA (using extra training data)

Abstractive Text Summarization • Document Summarization +6
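UniLM shares one Transformer across several LM objectives and switches between them purely via the self-attention mask: bidirectional (BERT-style), unidirectional (GPT-style), and sequence-to-sequence (as sketched above for s2s-ft). An illustrative mode dispatcher, not the repo's actual code:

```python
import torch

def unilm_mask(mode: str, src_len: int, tgt_len: int = 0) -> torch.Tensor:
    """Build the self-attention mask for one of UniLM's three LM modes."""
    total = src_len + tgt_len
    if mode == "bidirectional":        # every token sees every token
        return torch.ones(total, total, dtype=torch.long)
    if mode == "unidirectional":       # each token sees only its prefix
        return torch.tril(torch.ones(total, total, dtype=torch.long))
    if mode == "seq2seq":              # source bidirectional, target causal
        mask = torch.zeros(total, total, dtype=torch.long)
        mask[:, :src_len] = 1
        mask[src_len:, src_len:] = torch.tril(
            torch.ones(tgt_len, tgt_len, dtype=torch.long))
        return mask
    raise ValueError(mode)
```

During pre-training the unified model alternates between these masks, so a single set of parameters serves both understanding and generation tasks.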

Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models

facebookresearch/fairseq 7 Oct 2016

We observe that our method consistently outperforms standard beam search (BS) and previously proposed techniques for diverse decoding from neural sequence models.

Image Captioning • Machine Translation +3
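Hugging Face transformers implements this paper's group-wise diverse beam search behind the `num_beam_groups` and `diversity_penalty` arguments of `generate`; the model choice below (a BART summarizer) is just for illustration.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "facebook/bart-large-cnn"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

text = ("The tower is 324 metres tall, about the same height "
        "as an 81-storey building.")
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(
    **inputs,
    num_beams=6,            # total beams...
    num_beam_groups=3,      # ...split into 3 groups of 2
    diversity_penalty=1.0,  # penalise tokens already chosen by other groups
    num_return_sequences=3,
    max_new_tokens=30,
)
for ids in outputs:
    print(tokenizer.decode(ids, skip_special_tokens=True))
```

Each beam group is decoded in turn, with a penalty on tokens other groups have already emitted at that step, which trades a little likelihood for noticeably more varied outputs.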

Learning Dense Representations of Phrases at Scale

princeton-nlp/SimCSE ACL 2021

Open-domain question answering can be reformulated as a phrase retrieval problem, without the need for processing documents on-demand during inference (Seo et al., 2019).

Fine-tuning • Open-Domain Question Answering +4
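Phrase retrieval reduces open-domain QA to nearest-neighbour search: every candidate phrase in the corpus is encoded once offline, and at query time the question embedding is matched against that index. A minimal dense-retrieval sketch with numpy; the real system (DensePhrases) uses trained phrase and question encoders and a compressed large-scale index, whereas the vectors here are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 128

# Offline: encode every candidate phrase once (random stand-in vectors here).
phrases = ["Barack Obama", "Honolulu, Hawaii", "August 4, 1961"]
phrase_index = rng.normal(size=(len(phrases), dim)).astype(np.float32)
phrase_index /= np.linalg.norm(phrase_index, axis=1, keepdims=True)

# Online: encode the question and take the maximum inner product.
question_vec = rng.normal(size=dim).astype(np.float32)
question_vec /= np.linalg.norm(question_vec)

scores = phrase_index @ question_vec
print(phrases[int(np.argmax(scores))])
```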

Data Augmentation with Hierarchical SQL-to-Question Generation for Cross-domain Text-to-SQL Parsing

PaddlePaddle/Research EMNLP 2021

For better distribution matching, we require that at least 80% of SQL patterns in the training data are covered by generated queries.

Data Augmentation • Question Generation +2
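A coverage requirement like this can be checked by abstracting each SQL query into a pattern (replacing identifiers and literals with placeholders) and comparing pattern sets. A simplified, regex-based sketch; the paper's actual pattern definition is richer.

```python
import re

def sql_pattern(sql: str) -> str:
    """Abstract a SQL query into a coarse pattern: quoted strings and
    numbers become VALUE; non-keyword identifiers become COL."""
    keywords = {"select", "from", "where", "group", "by", "order",
                "having", "and", "or", "count", "max", "min", "avg",
                "sum", "limit", "desc", "asc", "distinct", "join", "on"}
    s = re.sub(r"'[^']*'|\d+(\.\d+)?", " VALUE ", sql.lower())
    tokens = []
    for t in s.split():
        if t in keywords or t == "VALUE" or not t.isidentifier():
            tokens.append(t)          # keep keywords, placeholders, operators
        else:
            tokens.append("COL")      # abstract away column/table names
    return " ".join(tokens)

def pattern_coverage(train_sqls, generated_sqls) -> float:
    """Fraction of training-set SQL patterns hit by the generated queries."""
    train = {sql_pattern(q) for q in train_sqls}
    generated = {sql_pattern(q) for q in generated_sqls}
    return len(train & generated) / len(train)

train = ["SELECT name FROM city WHERE pop > 1000000",
         "SELECT count(*) FROM city"]
generated = ["SELECT title FROM film WHERE length > 90"]
print(pattern_coverage(train, generated))  # 0.5: one of two patterns covered
```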

PALM: Pre-training an Autoencoding&Autoregressive Language Model for Context-conditioned Generation

alibaba/AliceMind 14 Apr 2020

An extensive set of experiments shows that PALM achieves new state-of-the-art results on a variety of language generation benchmarks covering generative question answering (Rank 1 on the official MARCO leaderboard), abstractive summarization on CNN/DailyMail as well as Gigaword, question generation on SQuAD, and conversational response generation on Cornell Movie Dialogues.

Abstractive Text Summarization • Conversational Response Generation +8