LCCC (Large-scale Cleaned Chinese Conversation corpus)

Introduced by Wang et al. in A Large-Scale Chinese Short-Text Conversation Dataset

Contains a base version (6.8 million dialogues) and a large version (12.0 million dialogues).


License

  • Unknown
