The complementarity of a diverse range of deep learning features extracted from video content for video recommendation

21 Nov 2020  ·  Adolfo Almeida, Johan Pieter de Villiers, Allan De Freitas, Mergandran Velayudan ·

Following the popularisation of media streaming, video streaming services continuously purchase new video content to mine its potential profit. As such, newly added content must be handled well so that it can be recommended to suitable users. In this paper, we address the new-item cold-start problem by exploring the potential of various deep learning features to provide video recommendations. The deep learning features investigated include features that capture visual-appearance, audio, and motion information from video content. We also explore different fusion methods to evaluate how well these feature modalities can be combined to fully exploit their complementary information. Experiments on a real-world video dataset for movie recommendations show that deep learning features outperform hand-crafted features. In particular, recommendations generated with deep learning audio features and action-centric deep learning features are superior to those based on MFCC and state-of-the-art iDT features. In addition, combining the various deep learning features with hand-crafted features and textual metadata yields significantly better recommendations than combining the deep learning features alone.
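The fusion idea described above can be sketched in a minimal form. The snippet below illustrates early fusion (concatenating L2-normalised per-modality feature vectors) followed by item-item cosine similarity, which is one common way to recommend cold-start items from content features alone. All names, dimensions, and the random feature matrices are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

# Hypothetical per-item feature matrices (illustrative only):
# rows = items, columns = feature dimensions of each modality.
rng = np.random.default_rng(0)
visual = rng.random((5, 8))   # visual-appearance features (e.g. CNN activations)
audio = rng.random((5, 4))    # audio features
motion = rng.random((5, 6))   # motion/action features

def early_fusion(*modalities):
    """Concatenate L2-normalised modality features into one vector per item."""
    normed = [m / np.linalg.norm(m, axis=1, keepdims=True) for m in modalities]
    return np.concatenate(normed, axis=1)

fused = early_fusion(visual, audio, motion)

def recommend_similar(fused, item_idx, k=2):
    """Rank the other items by cosine similarity to a cold-start item."""
    v = fused[item_idx]
    sims = fused @ v / (np.linalg.norm(fused, axis=1) * np.linalg.norm(v))
    ranking = np.argsort(-sims)
    return [int(i) for i in ranking if i != item_idx][:k]

print(recommend_similar(fused, 0))
```

Normalising each modality before concatenation keeps one high-dimensional modality from dominating the similarity; late fusion (combining per-modality similarity scores instead of features) is the usual alternative.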



Results from the Paper

Task: Recommendation Systems  |  Dataset: MovieLens 10M  |  Model: scaled-CER

  Metric    Value    Global Rank
  MAP@5     0.1536   # 1
  NDCG@5    0.1846   # 1
  MAP@15    0.1568   # 1
  NDCG@15   0.2546   # 1
  MAP@30    0.1671   # 1
  NDCG@30   0.2971   # 1

Task: Recommendation Systems (Item cold-start)  |  Dataset: MovieLens 10M  |  Model: scaled-CER

  Metric    Value    Global Rank
  MAP@5     0.0347   # 1
  MAP@15    0.0350   # 1
  MAP@30    0.0379   # 1
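The MAP@k and NDCG@k figures reported above can be computed as follows. This is a minimal sketch with binary relevance and a toy recommendation list; the item names and relevance set are made up for illustration and are not from the paper's evaluation.

```python
import numpy as np

def average_precision_at_k(recommended, relevant, k):
    """AP@k: average of precision@i over the ranks i where a relevant item appears."""
    hits, score = 0, 0.0
    for i, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / i
    return score / min(len(relevant), k) if relevant else 0.0

def ndcg_at_k(recommended, relevant, k):
    """NDCG@k with binary relevance: DCG divided by the ideal DCG."""
    dcg = sum(1.0 / np.log2(i + 1)
              for i, item in enumerate(recommended[:k], start=1)
              if item in relevant)
    ideal = sum(1.0 / np.log2(i + 1)
                for i in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal > 0 else 0.0

# Toy example: 5 recommended movies, 2 of which are relevant to the user.
recs = ["m1", "m2", "m3", "m4", "m5"]
rel = {"m2", "m5"}
print(average_precision_at_k(recs, rel, 5))  # (1/2 + 2/5) / 2 = 0.45
print(ndcg_at_k(recs, rel, 5))
```

MAP@k is then the mean of AP@k over all users; the small cold-start values in the table reflect that new items have no interaction history to rank from.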

