1 code implementation • 18 Sep 2021 • Pei-Chun Chang, Yong-Sheng Chen, Chang-Hsing Lee
First, the input music signal is divided into fixed-duration music clips (3 seconds each in this study), and the raw waveform of each clip is fed into a 1D MS-SincNet filter learning module to obtain three-channel 2D representations.
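The clip-splitting step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `split_into_clips`, the 16 kHz sample rate in the usage lines, and the choice to drop a trailing partial clip are all assumptions, since the text only states the 3-second clip duration.

```python
import numpy as np


def split_into_clips(signal, sample_rate, clip_seconds=3.0):
    """Split a 1D waveform into non-overlapping fixed-duration clips.

    Hypothetical helper: the source does not say how a trailing partial
    clip is handled, so this sketch simply drops it (an assumption).
    """
    signal = np.asarray(signal)
    if signal.ndim != 1:
        raise ValueError("expected a 1D mono waveform")

    # Number of samples in one clip, e.g. 3 s * 16000 Hz = 48000 samples.
    clip_len = int(round(sample_rate * clip_seconds))
    if clip_len <= 0:
        raise ValueError("clip length must be at least one sample")

    # Keep only whole clips, then reshape to (n_clips, clip_len).
    n_clips = len(signal) // clip_len
    return signal[: n_clips * clip_len].reshape(n_clips, clip_len)


# Usage: a 10-second signal at an assumed 16 kHz sample rate.
waveform = np.zeros(160000, dtype=np.float32)
clips = split_into_clips(waveform, sample_rate=16000)
print(clips.shape)  # (3, 48000): three whole clips, the last second is dropped
```

Each row of the returned array is one raw-waveform clip that could then be passed to a filter learning front end. Padding the last clip or using overlapping windows would be equally plausible choices that the text neither confirms nor rules out.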