ARSC-Net: Adventitious Respiratory Sound Classification Network Using Parallel Paths with Channel-Spatial Attention

Automatic identification of adventitious respiratory sounds remains a challenging problem. To address it, we propose an adventitious respiratory sound classification network (ARSC-Net), which combines residual blocks with channel-spatial attention for accurate classification. Specifically, we extract two types of features from adventitious respiratory sounds: Mel-Frequency Cepstral Coefficients (MFCCs) and the Mel-spectrogram. The two feature types are fed into parallel encoder paths with residual attention to extract feature representations, which are then fused in a channel-spatial attention module that adaptively focuses on the channel and spatial features most important for the classification task. The channel-spatial attention also enhances the feature representation: the channel attention explores the inter-channel relationships of the spectra, and the spatial attention, applied serially afterwards, generates the inter-spatial correlation map. We evaluate the proposed method on the ICBHI 2017 database. Experimental results show that it achieves an accuracy of 80.0% for distinguishing abnormal sounds from normal sounds and an accuracy of 92.4% for distinguishing crackles from wheezes. In addition, our method achieves a score of 56.76% for the four-class classification of adventitious sounds, outperforming several state-of-the-art methods.
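
The serial channel-then-spatial attention described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not give layer details, so the pooling choices, the shared two-layer MLP, the reduction ratio `r`, and the two-weight mix `k` (a stand-in for a learned convolution over the pooled maps) are all assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """x: (C, H, W). Returns per-channel weights in (0, 1), shape (C, 1, 1)."""
    avg = x.mean(axis=(1, 2))  # global average pooling over space -> (C,)
    mx = x.max(axis=(1, 2))    # global max pooling over space -> (C,)

    def mlp(v):
        # Shared two-layer MLP with ReLU (assumed structure).
        return w2 @ np.maximum(w1 @ v, 0.0)

    return sigmoid(mlp(avg) + mlp(mx))[:, None, None]

def spatial_attention(x, k):
    """x: (C, H, W). Returns per-location weights in (0, 1), shape (1, H, W)."""
    avg = x.mean(axis=0)  # channel-wise mean map -> (H, W)
    mx = x.max(axis=0)    # channel-wise max map -> (H, W)
    # k mixes the two maps; a hypothetical stand-in for a learned convolution.
    return sigmoid(k[0] * avg + k[1] * mx)[None, :, :]

def channel_spatial_attention(x, w1, w2, k):
    """Apply channel attention first, then spatial attention (serial order)."""
    x = x * channel_attention(x, w1, w2)
    return x * spatial_attention(x, k)

# Toy fused feature map: 8 channels over a 16 x 20 time-frequency grid.
rng = np.random.default_rng(0)
C, H, W, r = 8, 16, 20, 2
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
k = np.array([0.5, 0.5])

out = channel_spatial_attention(x, w1, w2, k)
print(out.shape)
```

Because both attention maps pass through a sigmoid, every weight lies strictly between 0 and 1, so the module rescales the fused features without changing their shape.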

Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Audio Classification | ICBHI Respiratory Sound Database | bi-ResNet-Att | ICBHI Score | 56.76 | #9 |
| | | | Sensitivity | 46.38 | #1 |
| | | | Specificity | 67.13 | #14 |
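
The three metrics reported here are related. Assuming the usual ICBHI challenge definition, in which the ICBHI score is the arithmetic mean of sensitivity and specificity, the reported score can be checked from the other two values. The function name `icbhi_score` is chosen for this illustration only.

```python
def icbhi_score(sensitivity, specificity):
    """Mean of sensitivity and specificity (assumed ICBHI score definition)."""
    return (sensitivity + specificity) / 2.0

# Values taken from the results reported above (in percent).
score = icbhi_score(46.38, 67.13)
print(score)
```

The mean of 46.38 and 67.13 is about 56.755, which agrees with the reported score of 56.76 after rounding.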

Methods