1 code implementation • 10 Sep 2024 • Yizhou Tan, Yanru Wu, Yuanbo Hou, Xin Xu, Hui Bu, Shengchen Li, Dick Botteldooren, Mark D. Plumbley
By comparing human annotations with the predictions of ensemble pre-trained models, this paper uncovers a significant gap between human perception and model inference in both semantic identification and existence detection of audio events.
1 code implementation • 9 Jun 2024 • Yuanbo Hou, Qiaoqiao Ren, Andrew Mitchell, Wenwu Wang, Jian Kang, Tony Belpaeme, Dick Botteldooren
To fill this gap, we propose the soundscape captioning task, which enables automated soundscape analysis, thus avoiding labour-intensive subjective ratings and surveys in conventional methods.
1 code implementation • 15 May 2024 • Qiaoqiao Ren, Yuanbo Hou, Dick Botteldooren, Tony Belpaeme
For this, the robot needs to know how difficult it is for a user to understand spoken language in a particular setting.
2 code implementations • 6 Feb 2024 • Qinliang Lin, Cheng Luo, Zenghao Niu, Xilin He, Weicheng Xie, Yuanbo Hou, Linlin Shen, Siyang Song
Adversarial examples generated by a surrogate model typically exhibit limited transferability to unknown target systems.
1 code implementation • 15 Dec 2023 • Yuanbo Hou, Qiaoqiao Ren, Siyang Song, Yuxin Song, Wenwu Wang, Dick Botteldooren
Specifically, this paper proposes a lightweight multi-level graph learning (MLGL) method based on local and global semantic graphs to simultaneously perform audio event classification (AEC) and human annoyance rating prediction (ARP).
1 code implementation • 15 Nov 2023 • Yuanbo Hou, Qiaoqiao Ren, Huizhong Zhang, Andrew Mitchell, Francesco Aletta, Jian Kang, Dick Botteldooren
(4) Generalization tests show that the proposed model's ARP in the presence of model-unknown sound sources is consistent with expert expectations and can explain previous findings from the literature on soundscape augmentation.
1 code implementation • 5 Oct 2023 • Yuanbo Hou, Siyang Song, Chuang Yu, Wenwu Wang, Dick Botteldooren
The results show the feasibility of recognizing diverse acoustic scenes based on the audio event-relational graph.
Acoustic Scene Classification • Graph Representation Learning
1 code implementation • 23 Aug 2023 • Yuanbo Hou, Siyang Song, Cheng Luo, Andrew Mitchell, Qiaoqiao Ren, Weicheng Xie, Jian Kang, Wenwu Wang, Dick Botteldooren
Sound events in daily life carry rich information about the objective world.
1 code implementation • 27 Oct 2022 • Yuanbo Hou, Siyang Song, Chuang Yu, Yuxin Song, Wenwu Wang, Dick Botteldooren
Experiments on a polyphonic acoustic scene dataset show that the proposed ERGL achieves competitive performance on ASC by using only a limited number of embeddings of audio events without any data augmentation.
Acoustic Scene Classification • Graph Representation Learning
1 code implementation • 27 Oct 2020 • Yuanbo Hou, Yi Deng, Bilei Zhu, Zejun Ma, Dick Botteldooren
Detecting an anchor's voice in live musical streams is an important preprocessing step for music and speech signal processing.
Sound • Multimedia • Audio and Speech Processing
1 code implementation • 6 Aug 2018 • Yuanbo Hou, Qiuqiang Kong, Shengchen Li
To use the order information of sound events, we propose sequential labelled data (SLD), where both the presence or absence and the order information of sound events are known.