Description: The dataset contains recordings of 200 native Chinese speakers covering the main dialect zones. The recordings were made in both quiet and noisy environments, which makes the data better suited to real-world speech recognition scenarios. The content consists of commonly used spoken sentences, and the texts were transcribed by professional annotators. The dataset can be used for speech recognition and machine translation.
Format: 16 kHz, 16-bit, uncompressed WAV, mono channel
Recording environments: quiet and noisy
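
The audio format stated above (16 kHz, 16-bit, uncompressed, mono) can be checked before training with a short script. The following is a minimal sketch using Python's standard `wave` module; the function name `matches_expected_format` and the constant names are illustrative assumptions and are not part of the dataset.

```python
import wave

# Expected values taken from the "Format" line of the dataset description.
EXPECTED_RATE_HZ = 16000          # 16 kHz sampling rate
EXPECTED_SAMPLE_WIDTH_BYTES = 2   # 16-bit samples
EXPECTED_CHANNELS = 1             # mono


def matches_expected_format(path):
    """Return True if the WAV file at `path` is 16 kHz, 16-bit, mono, uncompressed PCM."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
            and wav.getcomptype() == "NONE"  # "NONE" means uncompressed PCM
        )
```

For example, `matches_expected_format("sample.wav")` (with a hypothetical file name) returns `False` for a stereo or 8 kHz file, which makes it easy to filter out files that do not match the stated format.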