VGG-SS (VGG-Sound Source)

Introduced by Chen et al. in Localizing Visual Sounds the Hard Way

VGG-SS (VGG-Sound Source) is a benchmark for evaluating sound source localisation in videos. The dataset consists of a new set of annotations for the recently introduced VGG-Sound dataset, in which the sound sources visible in each video clip are explicitly marked with bounding boxes. It is 20 times larger than analogous existing benchmarks, contains 5K videos spanning over 200 categories, and, unlike Flickr SoundNet, is video-based.
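Because the annotations are bounding boxes, a predicted sound source location can be compared against an annotated one by measuring how much the two boxes overlap. The sketch below computes intersection over union (IoU) for two axis-aligned boxes. The function name `box_iou` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for illustration; they are not the dataset's documented annotation format, and this is not necessarily the exact metric used by the benchmark.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this layout is an
    assumption for illustration, not the VGG-SS annotation format.
    """
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b

    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    intersection = inter_w * inter_h

    area_a = (ax1 - ax0) * (ay1 - ay0)
    area_b = (bx1 - bx0) * (by1 - by0)
    union = area_a + area_b - intersection

    # Guard against degenerate (zero-area) boxes.
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    annotated = (0.0, 0.0, 2.0, 2.0)   # hypothetical annotated box
    predicted = (1.0, 1.0, 3.0, 3.0)   # hypothetical predicted box
    print(box_iou(annotated, predicted))
```

A prediction would typically be counted as correct when its overlap with the annotation exceeds a chosen threshold; the threshold value is a design choice and is not specified here.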

License


  • Unknown
