gtzan_music_speech

gtzan_music_speech is a dataset for music/speech discrimination. It consists of 120 tracks, each 30 seconds long, split evenly between the two classes: 60 music samples and 60 speech samples. All tracks are 22,050 Hz, mono, 16-bit audio files in .wav format.
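The stated format implies a fixed frame count and payload size per track, which makes downloaded files easy to sanity-check. The sketch below uses only Python's standard `wave` module. The function names and the `tolerance_s` parameter are illustrative assumptions, not part of the dataset or any official loader, and real files may deviate slightly from exactly 30 seconds.

```python
import wave

# Format as stated in the dataset description.
SAMPLE_RATE_HZ = 22050
CHANNELS = 1            # mono
SAMPLE_WIDTH_BYTES = 2  # 16-bit
DURATION_S = 30
NUM_TRACKS = 120


def expected_frames():
    """Frames in one track of exactly DURATION_S seconds."""
    return SAMPLE_RATE_HZ * DURATION_S


def expected_pcm_bytes_per_track():
    """Raw PCM payload size of one track, excluding the WAV header."""
    return expected_frames() * CHANNELS * SAMPLE_WIDTH_BYTES


def matches_stated_format(path, tolerance_s=0.5):
    """Return True if the WAV file at `path` matches the stated format.

    `tolerance_s` is an assumed allowance for tracks that are not
    exactly 30 seconds long.
    """
    with wave.open(path, "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()
        return (
            wav.getframerate() == SAMPLE_RATE_HZ
            and wav.getnchannels() == CHANNELS
            and wav.getsampwidth() == SAMPLE_WIDTH_BYTES
            and abs(duration - DURATION_S) <= tolerance_s
        )
```

Under these figures, each track holds 661,500 frames and 1,323,000 bytes of PCM data, so the full set of 120 tracks comes to roughly 158.8 MB before WAV headers.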

License


  • Unknown
