CCPM (Chinese Classical Poetry Matching)

Introduced by Li et al. in CCPM: A Chinese Classical Poetry Matching Dataset

Introduction

CCPM is a large Chinese classical poetry matching dataset that can be used for poetry matching, understanding and translation.

The main task is as follows: given a description in modern Chinese, the model must select, from four candidate lines of Chinese classical poetry, the one that best matches the description semantically.

Size

It contains 27,218 instances in total, which are split into training (21,778), validation (2,720) and test (2,720) sets.
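
As a quick sanity check on these figures, the sketch below sums the three split sizes and computes each split's share of the total. The variable names are illustrative only and do not come from the dataset's repository.

```python
# Split sizes as stated in the text above.
SPLIT_SIZES = {"training": 21778, "validation": 2720, "test": 2720}

# The three splits should add up to the stated total of 27,218 instances.
total = sum(SPLIT_SIZES.values())

# Share of each split, as a fraction of the total.
fractions = {name: size / total for name, size in SPLIT_SIZES.items()}

for name, size in SPLIT_SIZES.items():
    print(f"{name}: {size} instances ({fractions[name]:.1%})")
print(f"total: {total}")
```

The split works out to roughly 80% training, 10% validation and 10% test.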

Format

Each instance is composed of translation (the description in modern Chinese, a string), choice (four candidate lines of Chinese classical poetry, a list) and answer (the index of the correct line, an integer between 0 and 3).
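
To make this format concrete, here is a minimal sketch that represents one instance as a Python dict and checks it against the description above. The field names follow the text. The placeholder strings and the helper names `validate_instance` and `correct_line` are hypothetical, and the key names or encoding used in the repository's actual files may differ.

```python
from typing import Any, Dict


def validate_instance(instance: Dict[str, Any]) -> bool:
    """Return True if the instance matches the format described in the text."""
    translation = instance.get("translation")
    choice = instance.get("choice")
    answer = instance.get("answer")

    # translation: the description in modern Chinese, a string.
    if not isinstance(translation, str):
        return False
    # choice: a list of exactly four candidate lines, each a string.
    if not isinstance(choice, list) or len(choice) != 4:
        return False
    if not all(isinstance(line, str) for line in choice):
        return False
    # answer: an integer index between 0 and 3 (bool is rejected explicitly,
    # since bool is a subclass of int in Python).
    if isinstance(answer, bool) or not isinstance(answer, int):
        return False
    return 0 <= answer <= 3


def correct_line(instance: Dict[str, Any]) -> str:
    """Return the candidate line that the answer index points to."""
    if not validate_instance(instance):
        raise ValueError("instance does not match the expected format")
    return instance["choice"][instance["answer"]]


# Placeholder content, not real dataset text.
example = {
    "translation": "<description in modern Chinese>",
    "choice": ["<line 0>", "<line 1>", "<line 2>", "<line 3>"],
    "answer": 2,
}

print(validate_instance(example))
print(correct_line(example))
```

Checking instances this way before training or evaluation catches out-of-range answer indices and malformed candidate lists early.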

Source: https://github.com/THUNLP-AIPoet/CCPM

License


  • Unknown
