The CANDOR corpus is a large, novel, multimodal corpus of 1,656 recorded conversations in spoken English. This 7+ million word, 850 hour corpus totals over 1TB of audio, video, and transcripts, with moment-to-moment measures of vocal, facial, and semantic expression, along with an extensive survey of speaker post conversation reflections.
1 PAPER • NO BENCHMARKS YET
The Helsinki Prosody Corpus is a dataset for predicting prosodic prominence from written text. The prosodic annotations are automatically generated, high quality prosodic for the 'clean' subsets of LibriTTS corpus (Zen et al., 2019), comprising of 262.5 hours of read speech from 1230 speakers. The transcribed sentences were aligned and then prosodically annotated with word-level acoustic prominence labels.
1 PAPER • 1 BENCHMARK