Learning a Representation for Cover Song Identification Using Convolutional Neural Network

arXiv 2019  ·  Zhesong Yu, Xiaoshuo Xu, Xiaoou Chen, Deshun Yang

Cover song identification is a challenging task in the field of Music Information Retrieval (MIR) because of the complex musical variations between query tracks and cover versions. Previous works typically rely on hand-crafted features and alignment algorithms. More recently, further breakthroughs have been achieved with neural network approaches. In this paper, we propose a novel Convolutional Neural Network (CNN) architecture based on the characteristics of the cover song task. We first train the network as a classifier; the trained network is then used to extract music representations for cover song identification. A training scheme is designed to make the models robust to tempo changes. Experimental results show that our approach outperforms state-of-the-art methods on all public datasets, with the largest improvement on the large dataset.
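Once the network is used as a feature extractor, identification amounts to ranking reference tracks by how close their representations are to the query's. The following is a minimal sketch of that ranking step only, assuming the representations are already available as fixed-length vectors. The cosine similarity measure, the function names, the dictionary layout, and the toy vectors are all illustrative assumptions and are not taken from the abstract.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_candidates(query, references):
    """Return reference ids sorted from most to least similar to the query.

    `references` maps a track id to its embedding (hypothetical layout).
    """
    scored = [
        (cosine_similarity(query, embedding), track_id)
        for track_id, embedding in references.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [track_id for _, track_id in scored]


# Toy embeddings standing in for network outputs (made-up values).
references = {
    "cover_a": [0.9, 0.1, 0.0],
    "other_b": [0.0, 1.0, 0.0],
    "other_c": [0.0, 0.0, 1.0],
}
ranking = rank_candidates([1.0, 0.2, 0.0], references)
```

In this toy setup the query points in nearly the same direction as `cover_a`, so that track is ranked first.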

Task                       Dataset       Model    Metric Name  Metric Value  Global Rank
Cover song identification  Covers80      CQT-Net  MAP          0.840         #5
Cover song identification  SHS100K-TEST  CQT-Net  MAP          0.655         #4
Cover song identification  YouTube350    CQT-Net  MAP          0.917         #3
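The results above are reported as MAP (mean average precision). For readers unfamiliar with the metric, here is a minimal sketch of the standard definition: average precision is computed per query over its ranked result list, then averaged across queries. The function names and the toy rankings are hypothetical, and this is not the evaluation code behind the reported numbers.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked list against a set of relevant ids."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this hit's rank
    return precision_sum / len(relevant)


def mean_average_precision(rankings, ground_truth):
    """Mean of the per-query average precision over all queries."""
    scores = [
        average_precision(rankings[query], ground_truth[query])
        for query in rankings
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Toy example: two queries with made-up rankings and cover sets.
rankings = {"q1": ["a", "x", "b"], "q2": ["y", "c"]}
ground_truth = {"q1": {"a", "b"}, "q2": {"c"}}
map_score = mean_average_precision(rankings, ground_truth)
```

A score of 1.0 would mean every cover version is ranked ahead of every non-cover for every query.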


