12 Aug 2022 • Peiran Yan, Shengchen Li
This paper investigates a series of pre-trained models to examine how the audio features they extract correlate with audio captioning performance.
Audio captioning