YouTube-8M


Introduced by Abu-El-Haija et al. in YouTube-8M: A Large-Scale Video Classification Benchmark

The YouTube-8M dataset is a large-scale video dataset of more than 7 million videos labeled with 4716 classes by an automatic annotation system. It is split into three parts: a training set, a validation set, and a test set. Each class has at least 100 videos in the training set. Features for the videos are extracted with state-of-the-art pre-trained models and released for public use. Each video has both an audio and a visual modality. Based on their visual content, the videos are grouped into 24 topics, such as sports, games, and arts & entertainment.

Source: Audio-Visual Embedding for Cross-Modal Music Video Retrieval through Supervised Deep CCA