DIRHA-English is a multi-microphone database composed of real and simulated sequences of 1-minute. The overall corpus is composed of different types of sequences including: 1) Phonetically-rich sentences; 2) WSJ 5-k utterances; 3) WSJ 20-k utterances; 4) Conversational speech (also including keywords and commands). The sequences are available for both UK and US English at 48 kHz. The DIRHA-English dataset offers the possibility to work with a very large number of microphone channels, to use of microphone arrays having different characteristics and to work considering different speech recognition tasks (e.g., phone-loop, keyword spotting, ASR with small and very large language models).
19 PAPERS • 1 BENCHMARK