Manually labelled dataset of bird recordings from the species of interest inhabiting the wetlands of the "Aiguamolls de l'Empordà" natural park in Girona, Spain. The dataset includes 5,795 annotated audio clips generated from 1,098 source recordings retrieved from the Xeno-Canto portal, adding up to a total of 201.6 minutes (12,096 seconds) of vocalizations of different lengths, along with their corresponding annotations.
We also share the Mel spectrogram version of the dataset, where each image corresponds to a 1-second window of the original audio, resulting in a total of 17,536 spectrogram images stored in matrix form in .npy files.
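As a rough illustration of the 1-second windowing and the .npy storage format mentioned above, the following minimal sketch splits a synthetic signal into fixed-length windows and round-trips one window through NumPy's .npy format. The sample rate, the non-overlapping windowing, the zero-padding of the last window, and the helper name `split_into_windows` are all assumptions made for illustration; the dataset's actual preprocessing (including the Mel transform itself, which is omitted here) may differ.

```python
import io

import numpy as np


def split_into_windows(signal, sample_rate, window_seconds=1.0):
    """Split a 1-D signal into consecutive fixed-length windows.

    Hypothetical helper: the last window is zero-padded to full length.
    """
    window_len = int(round(sample_rate * window_seconds))
    n_windows = max(1, int(np.ceil(len(signal) / window_len)))
    padded = np.zeros(n_windows * window_len, dtype=signal.dtype)
    padded[: len(signal)] = signal
    return padded.reshape(n_windows, window_len)


# Synthetic 2.5-second tone standing in for a bird vocalization clip.
sample_rate = 16_000  # assumed value, not taken from the dataset
n_samples = int(2.5 * sample_rate)
clip = np.sin(2 * np.pi * 440 * np.arange(n_samples) / sample_rate).astype(np.float32)

windows = split_into_windows(clip, sample_rate)

# Round-trip one window through the .npy format using an in-memory buffer;
# with real files, np.load would be given a path to a .npy file instead.
buffer = io.BytesIO()
np.save(buffer, windows[0])
buffer.seek(0)
restored = np.load(buffer)
```

With real data, each matrix stored in a .npy file can be read the same way by passing the file path to `np.load`.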