Search Results for author: Yuanbo Hou

Found 11 papers, 11 papers with code

Exploring Differences between Human Perception and Model Inference in Audio Event Recognition

1 code implementation • 10 Sep 2024 • Yizhou Tan, Yanru Wu, Yuanbo Hou, Xin Xu, Hui Bu, Shengchen Li, Dick Botteldooren, Mark D. Plumbley

By comparing human annotations with the predictions of ensemble pre-trained models, this paper uncovers a significant gap between human perception and model inference in both semantic identification and existence detection of audio events.

Soundscape Captioning using Sound Affective Quality Network and Large Language Model

1 code implementation • 9 Jun 2024 • Yuanbo Hou, Qiaoqiao Ren, Andrew Mitchell, Wenwu Wang, Jian Kang, Tony Belpaeme, Dick Botteldooren

To fill this gap, we propose the soundscape captioning task, which enables automated soundscape analysis, thus avoiding labour-intensive subjective ratings and surveys in conventional methods.

Language Modelling, Large Language Model

No More Mumbles: Enhancing Robot Intelligibility through Speech Adaptation

1 code implementation • 15 May 2024 • Qiaoqiao Ren, Yuanbo Hou, Dick Botteldooren, Tony Belpaeme

For this, the robot needs to know how difficult it is for a user to understand spoken language in a particular setting.

Speech Recognition

Multi-level graph learning for audio event classification and human-perceived annoyance rating prediction

1 code implementation • 15 Dec 2023 • Yuanbo Hou, Qiaoqiao Ren, Siyang Song, Yuxin Song, Wenwu Wang, Dick Botteldooren

Specifically, this paper proposes a lightweight multi-level graph learning (MLGL) method based on local and global semantic graphs to simultaneously perform audio event classification (AEC) and human annoyance rating prediction (ARP).

Graph Learning

AI-based soundscape analysis: Jointly identifying sound sources and predicting annoyance

1 code implementation • 15 Nov 2023 • Yuanbo Hou, Qiaoqiao Ren, Huizhong Zhang, Andrew Mitchell, Francesco Aletta, Jian Kang, Dick Botteldooren

(4) Generalization tests show that the proposed model's ARP in the presence of model-unknown sound sources is consistent with expert expectations and can explain previous findings from the literature on soundscape augmentation.

Multi-dimensional Edge-based Audio Event Relational Graph Representation Learning for Acoustic Scene Classification

1 code implementation • 27 Oct 2022 • Yuanbo Hou, Siyang Song, Chuang Yu, Yuxin Song, Wenwu Wang, Dick Botteldooren

Experiments on a polyphonic acoustic scene dataset show that the proposed ERGL achieves competitive performance on ASC by using only a limited number of embeddings of audio events without any data augmentations.

Acoustic Scene Classification, Graph Representation Learning

Rule-embedded network for audio-visual voice activity detection in live musical video streams

1 code implementation • 27 Oct 2020 • Yuanbo Hou, Yi Deng, Bilei Zhu, Zejun Ma, Dick Botteldooren

Detecting the anchor's voice in live musical streams is an important preprocessing step for music and speech signal processing.

Sound, Multimedia, Audio and Speech Processing

Audio Tagging With Connectionist Temporal Classification Model Using Sequential Labelled Data

1 code implementation • 6 Aug 2018 • Yuanbo Hou, Qiuqiang Kong, Shengchen Li

To use the order information of sound events, we propose sequential labelled data (SLD), where both the presence or absence and the order information of sound events are known.

Audio Tagging, General Classification
