ASR-RAMC-BIGCCSC: A CHINESE CONVERSATIONAL SPEECH CORPUS

Introduced by Yang et al. in "Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational (RAMC) Speech Dataset"

ASR-RAMC-BIGCCSC is a Rich Annotated Mandarin Conversational (RAMC) speech dataset. It comprises 180 hours of Mandarin Chinese dialogue, split into 150 hours for the training set, 10 hours for the development set and 20 hours for the test set. It contains 351 multi-turn dialogues, each of which is a coherent and compact conversation centered on a single theme.
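As a quick sanity check on the figures above, the following minimal sketch recomputes the total duration, the share of each split, and the mean dialogue length. The only inputs are the numbers stated in the description; the names `SPLIT_HOURS`, `split_fractions` and `mean_dialogue_minutes` are hypothetical helpers introduced for illustration, and the results are approximate because the published hours are presumably rounded.

```python
# Split sizes in hours and dialogue count, copied from the dataset description.
SPLIT_HOURS = {"train": 150, "dev": 10, "test": 20}
TOTAL_DIALOGUES = 351


def split_fractions(hours):
    """Return each split's share of the total duration."""
    total = sum(hours.values())
    return {name: h / total for name, h in hours.items()}


def mean_dialogue_minutes(hours, n_dialogues):
    """Return the mean dialogue length in minutes across all splits."""
    return sum(hours.values()) * 60 / n_dialogues


total_hours = sum(SPLIT_HOURS.values())
fractions = split_fractions(SPLIT_HOURS)
avg_minutes = mean_dialogue_minutes(SPLIT_HOURS, TOTAL_DIALOGUES)

print(f"total hours: {total_hours}")
for name, frac in fractions.items():
    print(f"{name}: {frac:.1%}")
print(f"mean dialogue length: {avg_minutes:.1f} min")
```

The splits sum to the stated 180 hours, and the mean works out to roughly half an hour per dialogue, which is consistent with each recording being a single extended conversation on one theme.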

It covers 15 topics, including humanities, entertainment, sports, military, finance, religion, family life, politics, education, digital devices, environment, science, professional development, art and ordinary life.

The corpus is suitable for exploring speech processing techniques in dialogue scenarios.
