997 Hours – Wuhan Dialect Speech Data by Mobile Phone

Description: Audio data of the Wuhan dialect captured on mobile phones, 997 hours in total, recorded by more than 2,000 native Wuhan dialect speakers. The recorded text is rich in content and covers generic, interactive, on-board, home, and other categories. Wuhan locals took part in quality checking and proofreading. The sentence accuracy rate reaches 95%. This dataset can be used for automatic speech recognition, machine translation, and voiceprint recognition.

Format: 16 kHz, 16-bit, uncompressed WAV, mono channel
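
The stated format can be checked against a file's header before training or evaluation. The following is a minimal sketch using only Python's standard `wave` module; the function name `check_wav_format` and the constant names are illustrative assumptions, not part of the dataset's tooling, and the dataset's actual directory layout is not described here.

```python
import wave

# Expected values taken from the Format line above.
EXPECTED_RATE_HZ = 16000          # 16 kHz
EXPECTED_SAMPLE_WIDTH_BYTES = 2   # 16-bit samples
EXPECTED_CHANNELS = 1             # mono


def check_wav_format(path):
    """Return a list of mismatches between a WAV header and the stated format.

    An empty list means the header matches. The function name is hypothetical.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
        # The wave module reports "NONE" for uncompressed PCM data.
        if wav.getcomptype() != "NONE":
            problems.append(f"compression type is {wav.getcomptype()}")
    return problems
```

Calling `check_wav_format("sample.wav")` on a conforming file (the path is a placeholder) returns an empty list; any other result names the header fields that differ from the stated format.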

Recording environment: quiet indoor environment without echo


