The Groove MIDI Dataset (GMD) is composed of 13.6 hours of aligned MIDI and (synthesized) audio of human-performed, tempo-aligned expressive drumming. The dataset contains 1,150 MIDI files and over 22,000 measures of drumming.
11 PAPERS • NO BENCHMARKS YET
The Bach Doodle Dataset is composed of 21.6 million harmonizations submitted from the Bach Doodle. The dataset contains both metadata about the composition (such as the country of origin and feedback), as well as a MIDI of the user-entered melody and a MIDI of the generated harmonization. The dataset contains about 6 years of user entered music.
4 PAPERS • NO BENCHMARKS YET
The CAL10K dataset (introduced as Swat10k) contains 10,870 songs that are weakly-labelled using a tag vocabulary of 475 acoustic tags and 153 genre tags. The tags have all been harvested from Pandora’s website and result from song annotations performed by expert musicologists involved with the Music Genome Project.
1 PAPER • NO BENCHMARKS YET