The NISQA Corpus includes more than 14,000 speech samples with simulated (e.g. codecs, packet-loss, background noise) and live (e.g. mobile phone, Zoom, Skype, WhatsApp) conditions. Each file is labelled with subjective ratings of the overall quality and the quality dimensions Noisiness, Coloration, Discontinuity, and Loudness. In total, it contains more than 97,000 human ratings for each of the dimensions and the overall MOS.

The NISQA Speech Quality Corpus contains two training, two validation and four test datasets:

  • NISQA_TRAIN_SIM and NISQA_VAL_SIM: contain simulated distortions with speech samples from four different datasets, divided into a training and a validation set.
  • NISQA_TRAIN_LIVE and NISQA_VAL_LIVE: contain live phone and Skype recordings with Librivox audiobook samples, divided into a training and a validation set.
  • NISQA_TEST_LIVETALK: contains recordings of real phone and VoIP calls.
  • NISQA_TEST_FOR: contains live and simulated conditions with speech samples from the forensic speech dataset.
  • NISQA_TEST_NSC: contains live and simulated conditions with speech samples from the NSC dataset.
  • NISQA_TEST_P501: contains live and simulated conditions with speech samples from ITU-T Rec. P.501.
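
For scripting over the corpus, the eight datasets listed above can be captured in a small lookup table. The following is only a minimal sketch: the `NisqaSubset` class, the `subsets_for` helper, and the `split` and `conditions` labels are illustrative assumptions derived from the descriptions above, not part of any official NISQA tooling, and the sketch assumes nothing about file or folder layout inside the archive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NisqaSubset:
    """Hypothetical record describing one dataset of the corpus."""
    name: str        # dataset name as given in the list above
    split: str       # "train", "val", or "test"
    conditions: str  # "simulated", "live", or "mixed" (label is an assumption)


# Labels are inferred from the dataset descriptions, not from official metadata.
NISQA_SUBSETS = [
    NisqaSubset("NISQA_TRAIN_SIM", "train", "simulated"),
    NisqaSubset("NISQA_VAL_SIM", "val", "simulated"),
    NisqaSubset("NISQA_TRAIN_LIVE", "train", "live"),
    NisqaSubset("NISQA_VAL_LIVE", "val", "live"),
    NisqaSubset("NISQA_TEST_LIVETALK", "test", "live"),
    NisqaSubset("NISQA_TEST_FOR", "test", "mixed"),
    NisqaSubset("NISQA_TEST_NSC", "test", "mixed"),
    NisqaSubset("NISQA_TEST_P501", "test", "mixed"),
]


def subsets_for(split):
    """Return the names of all datasets belonging to the given split."""
    return [s.name for s in NISQA_SUBSETS if s.split == split]


if __name__ == "__main__":
    for split in ("train", "val", "test"):
        print(split, subsets_for(split))
```

Keeping the split as an explicit field, rather than parsing it out of the dataset name, makes it easy to check that the table matches the stated two training, two validation, and four test datasets.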

The datasets are provided under the original license terms of the source speech and noise samples they use. Please see the individual readme and license files in each of the dataset folders within the NISQA_Corpus.zip for more details about the datasets and the licenses. Generally, all of the files in this corpus can be used for non-commercial research purposes, and some of the datasets can also be used for commercial purposes.

License


  • Various (see readme files)
