PeerSum is a new multi-document summarization (MDS) dataset built from peer reviews of scientific publications. It differs from existing MDS datasets in that its summaries (i.e., the meta-reviews) are highly abstractive and are real summaries of the source documents.
In PeerSum, the source documents are reviews (with scores), comments, and responses, and the ground-truth summary is the meta-review (with an acceptance outcome). Each sample contains a summary, the corresponding source documents, and complementary information (e.g., review scores) for one paper. The second version of PeerSum (peersum_v2) has 16,308 samples, compared with 10,862 samples in the first version.
The dataset is stored in JSON format. The details of each sample are given by the following keys, explained below:
For each review (i.e., official review, public comment, or author/reviewer response):
* review_id: unique id of each review
* writer: official_reviewer, public, author
* content: (rating, confidence, comment)
* replyto: connect to a review (review_id and replyto are for the conversation structure)
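
Since review_id and replyto encode the conversation structure, the threads for one paper can be rebuilt from those two keys alone. The following is a minimal sketch, not the dataset's official loader. It assumes that content is a dictionary holding rating, confidence, and comment, that ids are strings, and that a top-level review has a replyto value that does not match any review_id in the same sample (here None). The records themselves are invented for illustration, and build_threads and flatten are hypothetical helper names.

```python
from collections import defaultdict

# Hypothetical reviews for one paper; all values are invented for illustration.
reviews = [
    {"review_id": "r1", "writer": "official_reviewer",
     "content": {"rating": 6, "confidence": 4, "comment": "Solid work."},
     "replyto": None},
    {"review_id": "r2", "writer": "author",
     "content": {"comment": "Thanks, we clarified Section 3."},
     "replyto": "r1"},
    {"review_id": "r3", "writer": "official_reviewer",
     "content": {"comment": "The clarification helps."},
     "replyto": "r2"},
    {"review_id": "r4", "writer": "public",
     "content": {"comment": "Is the code available?"},
     "replyto": None},
]


def build_threads(reviews):
    """Split reviews into thread roots and a parent-id -> replies mapping."""
    known_ids = {r["review_id"] for r in reviews}
    children = defaultdict(list)
    roots = []
    for r in reviews:
        parent = r.get("replyto")
        if parent in known_ids:
            children[parent].append(r)
        else:
            # Assumption: anything not replying to a known review starts a thread.
            roots.append(r)
    return roots, children


def flatten(nodes, children, depth=0):
    """Depth-first walk returning (depth, review_id, writer) tuples."""
    out = []
    for r in nodes:
        out.append((depth, r["review_id"], r["writer"]))
        out.extend(flatten(children[r["review_id"]], children, depth + 1))
    return out


roots, children = build_threads(reviews)
thread = flatten(roots, children)
for depth, review_id, writer in thread:
    # Indent each reply under the review it responds to.
    print("  " * depth + f"{review_id} ({writer})")
```

With the invented records above, r2 is nested under r1, r3 under r2, and r4 starts its own thread. Treating unmatched replyto values as roots is a design choice made here so that replies pointing at the paper itself, or at a missing review, are not silently dropped.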