A large-scale reference dataset for bioacoustics. MeerKAT is a 1068h large-scale dataset containing data from audio-recording collars worn by free-ranging meerkats (Suricata suricatta) at the Kalahari Research Centre, South Africa, of which 184h are labeled with twelve time-resolved vocalization-type ground truth target classes, each with millisecond resolution. The labeled 184h MeerKAT subset exhibits realistic sparsity conditions for a bioacoustic dataset (96% background-noise or other signals and 4% vocalizations), dispersed across 66398 10-second samples, spanning 251562 labeled events and showcasing significant spectral and temporal variability, making it the first large-scale reference point with real-world conditions for benchmarking pretraining and finetune approaches in bioacoustics deep learning.

Papers


Paper Code Results Date Stars

Dataset Loaders


No data loaders found. You can submit your data loader here.

Tasks


Similar Datasets


License


Modalities


Languages