NSynth is a dataset of one-shot instrumental notes, containing 305,979 musical notes, each with a unique pitch, timbre and envelope. The sounds were collected from 1,006 instruments in commercial sample libraries and are annotated by source (acoustic, electronic or synthetic), instrument family and sonic qualities. The instrument families used in the annotation are bass, brass, flute, guitar, keyboard, mallet, organ, reed, string, synth lead and vocal. For each instrument, four-second monophonic 16 kHz audio snippets (notes) were generated.
121 PAPERS • 3 BENCHMARKS
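As a rough illustration of what the NSynth figures quoted above imply, the following sketch works out the number of samples in one note and the approximate total audio duration. It uses only the numbers stated in the description; the variable names are illustrative and not part of any NSynth tooling.

```python
# Figures taken from the NSynth description above.
SAMPLE_RATE_HZ = 16_000   # 16 kHz monophonic audio
NOTE_DURATION_S = 4       # each note is a four-second snippet
NUM_NOTES = 305_979       # total notes in the dataset

# Samples in a single mono note: sample rate times duration.
samples_per_note = SAMPLE_RATE_HZ * NOTE_DURATION_S

# Total audio duration in hours, assuming every note is exactly four seconds.
total_hours = NUM_NOTES * NOTE_DURATION_S / 3600

print(samples_per_note)        # 64000
print(round(total_hours, 1))   # 340.0
```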
CREMA-D is an emotional multimodal actor dataset of 7,442 original clips from 91 actors: 48 male and 43 female, between the ages of 20 and 74, from a variety of races and ethnicities (African American, Asian, Caucasian, Hispanic, and Unspecified).
22 PAPERS • 7 BENCHMARKS
DCASE2014 is an audio classification benchmark.
3 PAPERS • NO BENCHMARKS YET