The MAESTRO dataset contains over 200 hours of paired audio and MIDI recordings from ten years of International Piano-e-Competition. The MIDI data includes key strike velocities and sustain/sostenuto/una corda pedal positions. Audio and MIDI files are aligned with ∼3 ms accuracy and sliced to individual musical pieces, which are annotated with composer, title, and year of performance. Uncompressed audio is of CD quality or higher (44.1–48 kHz 16-bit PCM stereo).
89 PAPERS • NO BENCHMARKS YET
MusicCaps is a dataset composed of 5.5k music-text pairs, with rich text descriptions provided by human experts. For each 10-second music clip, MusicCaps provides:
36 PAPERS • 1 BENCHMARK
The JSB chorales are a set of short, four-voice pieces of music well-noted for their stylistic homogeneity. The chorales were originally composed by Johann Sebastian Bach in the 18th century. He wrote them by first taking pre-existing melodies from contemporary Lutheran hymns and then harmonising them to create the parts for the remaining three voices. The version of the dataset used canonically in representation learning contexts consists of 382 such chorales, with a train/validation/test split of 229, 76 and 77 samples respectively.
32 PAPERS • 1 BENCHMARK
The Lakh Pianoroll Dataset (LPD) is a collection of 174,154 multitrack pianorolls derived from the Lakh MIDI Dataset (LMD).
9 PAPERS • NO BENCHMARKS YET
ASAP is a dataset of 222 digital musical scores aligned with 1068 performances (more than 92 hours) of Western classical piano music.
4 PAPERS • NO BENCHMARKS YET
ComMU has 11,144 MIDI samples that consist of short note sequences created by professional composers with their corresponding 12 metadata. This dataset is designed for a new task, combinatorial music generation which generate diverse and high-quality music only with metadata through auto-regressive language model.
A MIDI dataset of 500 4-part chorales generated by the KS_Chorus algorithm, annotated with results from hundreds of listening test participants, with 500 further unannotated chorales.
3 PAPERS • NO BENCHMARKS YET
The MusicBench dataset is a music audio-text pair dataset that was designed for text-to-music generation purpose and released along with Mustango text-to-music model. MusicBench is based on the MusicCaps dataset, which it expands from 5,521 samples to 52,768 training and 400 test samples!
1 PAPER • 1 BENCHMARK
YM2413-MDB is an 80s FM video game music dataset with multi-label emotion annotations. It includes 669 audio and MIDI files of music from Sega and MSX PC games in the 80s using YM2413, a programmable sound generator based on FM. The collected game music is arranged with a subset of 15 monophonic instruments and one drum instrument.
1 PAPER • NO BENCHMARKS YET