The MAESTRO dataset contains over 200 hours of paired audio and MIDI recordings from ten years of the International Piano-e-Competition. The MIDI data includes key strike velocities and sustain/sostenuto/una corda pedal positions. Audio and MIDI files are aligned with ∼3 ms accuracy and sliced to individual musical pieces, which are annotated with composer, title, and year of performance. Uncompressed audio is of CD quality or higher (44.1–48 kHz 16-bit PCM stereo).
106 PAPERS • 1 BENCHMARK
MusicNet is a collection of 330 freely-licensed classical music recordings, together with over 1 million annotated labels indicating the precise time of each note in every recording, the instrument that plays each note, and the note's position in the metrical structure of the composition. The labels are acquired from musical scores aligned to recordings by dynamic time warping. The labels are verified by trained musicians; we estimate a labeling error rate of 4%. We offer the MusicNet labels to the machine learning and music communities as a resource for training models and a common benchmark for comparing results.
41 PAPERS • 1 BENCHMARK
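The MusicNet entry above mentions that labels come from scores aligned to recordings by dynamic time warping. As a rough illustration of that technique only (not MusicNet's actual pipeline, which works on audio features rather than the toy pitch values assumed here), a minimal DTW sketch might look like this; the function name `dtw_path` and the absolute-difference distance are assumptions made for the example.

```python
from math import inf


def dtw_path(score, audio, dist=lambda a, b: abs(a - b)):
    """Align two feature sequences with classic dynamic time warping.

    Returns (total_cost, path), where path is a list of (score_idx, audio_idx).
    """
    n, m = len(score), len(audio)
    # cost[i][j] = minimal cumulative cost of aligning score[:i] with audio[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(score[i - 1], audio[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],  # advance both
                                 cost[i - 1][j],      # advance score only
                                 cost[i][j - 1])      # advance audio only

    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


# Toy example: three score notes, five audio frames (hypothetical pitch values).
total, path = dtw_path([60, 62, 64], [60, 60, 62, 64, 64])
print(total, path)
```

Each pair in the returned path maps a score event to an audio frame, which is how note onsets from a score can be transferred onto the timeline of a recording.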
The Lakh MIDI dataset is a collection of 176,581 unique MIDI files, 45,129 of which have been matched and aligned to entries in the Million Song Dataset. Its goal is to facilitate large-scale music information retrieval, both symbolic (using the MIDI files alone) and audio content-based (using information extracted from the MIDI files as annotations for the matched audio files). Around 10% of all MIDI files include timestamped lyrics events, with lyrics often transcribed at the word, syllable or character level.
35 PAPERS • NO BENCHMARKS YET
The Synthesized Lakh (Slakh) Dataset is a dataset for audio source separation that is synthesized from the Lakh MIDI Dataset v0.1 using professional-grade sample-based virtual instruments. This first release of Slakh, called Slakh2100, contains 2100 automatically mixed tracks and accompanying MIDI files synthesized using a professional-grade sampling engine. The tracks in Slakh2100 are split into training (1500 tracks), validation (375 tracks), and test (225 tracks) subsets, totaling 145 hours of mixtures.
34 PAPERS • 3 BENCHMARKS
The JSB chorales are a set of short, four-voice pieces of music well-noted for their stylistic homogeneity. The chorales were originally composed by Johann Sebastian Bach in the 18th century. He wrote them by first taking pre-existing melodies from contemporary Lutheran hymns and then harmonising them to create the parts for the remaining three voices. The version of the dataset used canonically in representation learning contexts consists of 382 such chorales, with a train/validation/test split of 229, 76 and 77 samples respectively.
32 PAPERS • 1 BENCHMARK
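The split sizes quoted for Slakh2100 and the JSB chorales above can be sanity-checked with a few lines of arithmetic. This is only a convenience sketch; the dictionary layout and the helper name `split_fractions` are assumptions for illustration, and the numbers are copied from the two descriptions.

```python
# Split sizes as quoted in the dataset descriptions above.
SPLITS = {
    "Slakh2100": {"total": 2100, "train": 1500, "validation": 375, "test": 225},
    "JSB chorales": {"total": 382, "train": 229, "validation": 76, "test": 77},
}


def split_fractions(entry):
    """Return each subset's share of the total, rounded to three decimals."""
    total = entry["total"]
    return {name: round(count / total, 3)
            for name, count in entry.items() if name != "total"}


for name, entry in SPLITS.items():
    parts = entry["train"] + entry["validation"] + entry["test"]
    # The three subsets should add up to the stated total.
    assert parts == entry["total"], name
    print(name, split_fractions(entry))
```

Both sets of figures are internally consistent: the subsets sum to 2100 tracks and 382 chorales respectively.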
The EMOPIA (pronounced ‘yee-mò-pi-uh’) dataset is a shared multi-modal (audio and MIDI) database focusing on perceived emotion in pop piano music, intended to facilitate research on various tasks related to music emotion. The dataset contains 1,087 music clips from 387 songs, with clip-level emotion labels annotated by four dedicated annotators.
22 PAPERS • NO BENCHMARKS YET
VGMIDI is a dataset of piano arrangements of video game soundtracks. It contains 200 MIDI pieces labeled according to emotion and 3,850 unlabeled pieces. Each labeled piece was annotated by 30 human subjects according to the Circumplex (valence-arousal) model of emotion using a custom web tool.
14 PAPERS • NO BENCHMARKS YET
DadaGP is a new symbolic music dataset comprising 26,181 song scores in the GuitarPro format covering 739 musical genres, along with an accompanying tokenized format well-suited for generative sequence models such as the Transformer. The tokenized format is inspired by event-based MIDI encodings, often used in symbolic music generation models. The dataset is released with an encoder/decoder which converts GuitarPro files to tokens and back.
13 PAPERS • NO BENCHMARKS YET
ASAP is a dataset of 222 digital musical scores aligned with 1068 performances (more than 92 hours) of Western classical piano music.
11 PAPERS • 2 BENCHMARKS
GiantMIDI-Piano contains 10,854 unique piano solo pieces composed by 2,786 composers. It contains 34,504,873 transcribed notes and includes metadata for each piece.
9 PAPERS • NO BENCHMARKS YET
First large-scale symphony generation dataset.
9 PAPERS • 1 BENCHMARK
The ADL Piano MIDI is a dataset of 11,086 piano pieces from different genres. This dataset is based on the Lakh MIDI dataset, which is a collection of 45,129 unique MIDI files that have been matched to entries in the Million Song Dataset. Most pieces in the Lakh MIDI dataset have multiple instruments, so for each file the authors of the ADL Piano MIDI dataset extracted only the tracks with instruments from the "Piano Family" (MIDI program numbers 1-8). This process generated a total of 9,021 unique piano MIDI files. These 9,021 files were then combined with approximately 2,065 other files scraped from publicly available sources on the internet. All the files in the final collection were de-duplicated according to their MD5 checksum.
5 PAPERS • NO BENCHMARKS YET
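The two processing steps described for ADL Piano MIDI above (keeping only "Piano Family" tracks, then de-duplicating files by MD5 checksum) can be sketched as follows. This is a minimal illustration, not the authors' code: tracks are represented as plain dictionaries with a hypothetical `program` field, and files as an in-memory mapping from name to bytes, so that no MIDI parsing library is assumed.

```python
import hashlib

# "Piano Family" as quoted above: MIDI program numbers 1-8 (1-based).
# Note: many MIDI libraries expose programs 0-based (0-7); adjust if needed.
PIANO_PROGRAMS = range(1, 9)


def piano_tracks(tracks):
    """Keep only tracks whose 1-based MIDI program is in the Piano Family."""
    return [t for t in tracks if t["program"] in PIANO_PROGRAMS]


def dedupe_by_md5(files):
    """Keep the first file seen for each distinct MD5 digest.

    `files` maps a file name to its raw bytes (hypothetical in-memory layout).
    """
    seen = set()
    unique = {}
    for name, data in files.items():
        digest = hashlib.md5(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique[name] = data
    return unique


# Toy data, invented for the example.
tracks = [
    {"name": "grand piano", "program": 1},
    {"name": "fingered bass", "program": 34},
    {"name": "electric piano", "program": 5},
]
files = {"a.mid": b"MThd-one", "b.mid": b"MThd-two", "copy_of_a.mid": b"MThd-one"}

print([t["name"] for t in piano_tracks(tracks)])
print(sorted(dedupe_by_md5(files)))
```

MD5 de-duplication only removes byte-identical files; two different encodings of the same performance would both survive this step.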
ATEPP is a dataset of expressive piano performances by virtuoso pianists. The dataset contains 11677 performances (~1000 hours) by 49 pianists and covers 1580 movements by 25 composers. All of the MIDI files in the dataset come from the piano transcription of existing audio recordings of piano performances. Scores in MusicXML format are also available for around half of the tracks. The dataset is organized and aligned by compositions and movements for comparative studies.
The MidiCaps dataset [1] is a large-scale dataset of 168,385 MIDI music files with descriptive text captions and a set of extracted musical features.
ComMU has 11,144 MIDI samples consisting of short note sequences created by professional composers, each with 12 corresponding metadata fields. This dataset is designed for a new task, combinatorial music generation, which generates diverse and high-quality music from metadata alone using an auto-regressive language model.
4 PAPERS • NO BENCHMARKS YET
Expanded Groove MIDI dataset (E-GMD) is an automatic drum transcription (ADT) dataset that contains 444 hours of audio from 43 drum kits, making it an order of magnitude larger than similar datasets, and the first with human-performed velocity annotations.
3 PAPERS • NO BENCHMARKS YET
A MIDI dataset of 500 4-part chorales generated by the KS_Chorus algorithm, annotated with results from hundreds of listening test participants, with 500 further unannotated chorales.
NES-VMDB is a dataset containing 98,940 gameplay videos from 389 NES games, each paired with its original soundtrack in symbolic format (MIDI). NES-VMDB is built upon the Nintendo Entertainment System Music Database (NES-MDB), encompassing 5,278 music pieces from 397 NES games.
1 PAPER • NO BENCHMARKS YET
The Niko Chord Progression Dataset is used in AccoMontage2. It contains 5k+ chord progression pieces, labeled with styles. There are four styles in total: Pop Standard, Pop Complex, Dark and R&B. Some progressions have an 'Unknown' style.
We redistribute a suite of datasets as part of the YourMT3 project. The license for redistribution is attached.
Guitar-TECHS is a comprehensive dataset featuring a variety of guitar techniques, musical excerpts, chords, and scales. These elements are performed by diverse musicians across various recording settings. Guitar-TECHS incorporates recordings from two stereo microphones: an egocentric microphone positioned on the performer’s head and an exocentric microphone placed in front of the performer. It also includes direct input recordings and microphoned amplifier outputs, offering a wide spectrum of audio inputs and recording qualities. All signals and MIDI labels are synchronized. Its multi-perspective and multi-modal content makes Guitar-TECHS a valuable resource for advancing data-driven guitar research and for developing robust guitar listening algorithms.
0 PAPER • NO BENCHMARKS YET