ACL Title and Abstract Dataset

Introduced by Wang et al. in Paper Abstract Writing through Editing Mechanism

This dataset contains 10,874 title-abstract pairs collected from the ACL Anthology Network, covering papers published up to 2016.

The structure of the data is as follows:

- title
- abstract
- \newline
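The layout above can be read with a few lines of Python. This is a minimal sketch, not the authors' loader: it assumes each record is a title line followed by an abstract line and then a blank separator line, and the function name `iter_pairs` is hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def iter_pairs(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (title, abstract) pairs from blank-line-separated records.

    Assumption: the first non-empty line of a record is the title and the
    remaining non-empty lines (normally one) form the abstract.
    """
    record: List[str] = []
    for raw in lines:
        line = raw.strip()
        if line:
            record.append(line)
            continue
        # A blank line closes the current record.
        if len(record) >= 2:
            yield record[0], " ".join(record[1:])
        record = []
    # Handle a final record that is not followed by a blank line.
    if len(record) >= 2:
        yield record[0], " ".join(record[1:])


# Tiny made-up sample in the assumed layout (not real dataset content).
sample = "Title A\nAbstract A.\n\nTitle B\nAbstract B.\n"
pairs = list(iter_pairs(sample.splitlines()))
```

To read a real file, pass an open file handle instead of `sample.splitlines()`. Incomplete records (a title with no abstract) are skipped rather than raising an error, which is a design choice of this sketch.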

This dataset is used in our paper "Paper Abstract Writing through Editing Mechanism" (Wang et al., 2018).

Citation

@inproceedings{wang-etal-2018-paper,
    title = "Paper Abstract Writing through Editing Mechanism",
    author = "Wang, Qingyun  and
      Zhou, Zhihao  and
      Huang, Lifu  and
      Whitehead, Spencer  and
      Zhang, Boliang  and
      Ji, Heng  and
      Knight, Kevin",
    booktitle = "Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)",
    month = jul,
    year = "2018",
    address = "Melbourne, Australia",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/P18-2042",
    doi = "10.18653/v1/P18-2042",
    pages = "260--265",
    abstract = "We present a paper abstract writing system based on an attentive neural sequence-to-sequence model that can take a title as input and automatically generate an abstract. We design a novel Writing-editing Network that can attend to both the title and the previously generated abstract drafts and then iteratively revise and polish the abstract. With two series of Turing tests, where the human judges are asked to distinguish the system-generated abstracts from human-written ones, our system passes Turing tests by junior domain experts at a rate up to 30{\%} and by non-expert at a rate up to 80{\%}.",
}