Symbolic Music Data Version 1.0

8 Jun 2016  ·  Christian Walder ·

In this document, we introduce a new dataset designed for training machine learning models of symbolic music data. Five datasets are provided, one of which is from a newly collected corpus of 20K midi files. We describe our preprocessing and cleaning pipeline, which includes the exclusion of a number of files based on scores from a previously developed probabilistic machine learning model. We also define training, testing and validation splits for the new dataset, based on a clustering scheme which we also describe. Some simple histograms are included.

PDF Abstract

Datasets


Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here