Visualizing and Understanding Self-attention based Music Tagging

11 Nov 2019 · Minz Won, Sanghyuk Chun, Xavier Serra

Recently, we proposed a self-attention based music tagging model. Unlike most conventional deep architectures in music information retrieval, which treat music spectrograms as images and process them with stacks of 3x3 convolutional filters, the proposed model regards music as a temporal sequence of individual audio events. Beyond competitive tagging performance, this design also facilitates better interpretability. In this paper, we focus on visualizing and understanding the proposed self-attention based music tagging model.
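To make the idea concrete, below is a minimal sketch of this kind of model, not the authors' exact architecture: a small CNN front-end collapses the frequency axis of a mel-spectrogram into a temporal sequence of embeddings, and a Transformer encoder applies self-attention over that sequence before a multi-label tagging head. All names and hyperparameters here (SelfAttentionTagger, d_model, the 50-tag output, etc.) are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of a self-attention music tagger, assuming a PyTorch setup.
# A CNN front-end turns a mel-spectrogram into a sequence of frame embeddings;
# a Transformer encoder self-attends over that sequence of "audio events".
import torch
import torch.nn as nn

class SelfAttentionTagger(nn.Module):
    def __init__(self, n_mels=96, d_model=128, n_heads=4, n_layers=2, n_tags=50):
        super().__init__()
        # Front-end: collapse the frequency axis, keep time as the sequence axis.
        self.frontend = nn.Sequential(
            nn.Conv2d(1, d_model, kernel_size=(n_mels, 3), padding=(0, 1)),
            nn.BatchNorm2d(d_model),
            nn.ReLU(),
        )
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_tags)

    def forward(self, spec):                # spec: (batch, 1, n_mels, time)
        x = self.frontend(spec)             # (batch, d_model, 1, time)
        x = x.squeeze(2).transpose(1, 2)    # (batch, time, d_model) sequence
        x = self.encoder(x)                 # self-attention over time steps
        x = x.mean(dim=1)                   # temporal pooling
        return torch.sigmoid(self.head(x))  # independent per-tag probabilities

model = SelfAttentionTagger()
probs = model(torch.randn(2, 1, 96, 256))  # two 256-frame mel-spectrograms
print(probs.shape)                         # torch.Size([2, 50])
```

Because each Transformer layer exposes its multi-head attention module (here, model.encoder.layers[0].self_attn), the per-head attention weights over time steps can be extracted and plotted as heatmaps over the spectrogram, which is the kind of visualization the paper explores.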

Categories

Sound · Audio and Speech Processing
