Search Results for author: Dick Botteldooren

Found 15 papers, 9 papers with code

Exploring Differences between Human Perception and Model Inference in Audio Event Recognition

1 code implementation • 10 Sep 2024 • Yizhou Tan, Yanru Wu, Yuanbo Hou, Xin Xu, Hui Bu, Shengchen Li, Dick Botteldooren, Mark D. Plumbley

By comparing human annotations with the predictions of ensemble pre-trained models, this paper uncovers a significant gap between human perception and model inference in both semantic identification and existence detection of audio events.

A Dynamic Systems Approach to Modelling Human-Machine Rhythm Interaction

no code implementations • 26 Jun 2024 • Zhongju Yuan, Wannes Van Ransbeeck, Geraint Wiggins, Dick Botteldooren

In exploring the simulation of human rhythmic perception and synchronization capabilities, this study introduces a computational model inspired by the physical and biological processes underlying rhythm processing.

Soundscape Captioning using Sound Affective Quality Network and Large Language Model

1 code implementation • 9 Jun 2024 • Yuanbo Hou, Qiaoqiao Ren, Andrew Mitchell, Wenwu Wang, Jian Kang, Tony Belpaeme, Dick Botteldooren

The average score (out of 5) of SoundSCaper-generated captions is lower than the score of captions generated by two soundscape experts by 0.21 and 0.25, respectively, on the evaluation set and the model-unknown mixed external dataset with varying lengths and acoustic properties, but the differences are not statistically significant.

Language Modelling, Large Language Model

A novel Reservoir Architecture for Periodic Time Series Prediction

no code implementations • 16 May 2024 • Zhongju Yuan, Geraint Wiggins, Dick Botteldooren

Leveraging reservoir computing, our proposed method is ultimately oriented towards predicting human perception of rhythm.

Time Series, Time Series Prediction

No More Mumbles: Enhancing Robot Intelligibility through Speech Adaptation

1 code implementation • 15 May 2024 • Qiaoqiao Ren, Yuanbo Hou, Dick Botteldooren, Tony Belpaeme

For this, the robot needs to know how difficult it is for a user to understand spoken language in a particular setting.

Speech Recognition

EEG decoding with conditional identification information

no code implementations • 21 Mar 2024 • Pengfei Sun, Jorg De Winne, Paul Devos, Dick Botteldooren

Decoding EEG signals is crucial for unraveling the human brain and advancing brain-computer interfaces.

EEG, EEG Decoding

Multi-level graph learning for audio event classification and human-perceived annoyance rating prediction

1 code implementation • 15 Dec 2023 • Yuanbo Hou, Qiaoqiao Ren, Siyang Song, Yuxin Song, Wenwu Wang, Dick Botteldooren

Specifically, this paper proposes a lightweight multi-level graph learning (MLGL) method based on local and global semantic graphs to simultaneously perform audio event classification (AEC) and human annoyance rating prediction (ARP).

Graph Learning

AI-based soundscape analysis: Jointly identifying sound sources and predicting annoyance

1 code implementation • 15 Nov 2023 • Yuanbo Hou, Qiaoqiao Ren, Huizhong Zhang, Andrew Mitchell, Francesco Aletta, Jian Kang, Dick Botteldooren

(4) Generalization tests show that the proposed model's ARP in the presence of model-unknown sound sources is consistent with expert expectations and can explain previous findings from the literature on soundscape augmentation.

Delayed Memory Unit: Modelling Temporal Dependency Through Delay Gate

no code implementations • 23 Oct 2023 • Pengfei Sun, Jibin Wu, Malu Zhang, Paul Devos, Dick Botteldooren

Recurrent Neural Networks (RNNs) are renowned for their adeptness in modeling temporal dependencies, a trait that has driven their widespread adoption for sequential data processing.

Gesture Recognition, Sequential Image Classification, +2

Multi-dimensional Edge-based Audio Event Relational Graph Representation Learning for Acoustic Scene Classification

1 code implementation • 27 Oct 2022 • Yuanbo Hou, Siyang Song, Chuang Yu, Yuxin Song, Wenwu Wang, Dick Botteldooren

Experiments on a polyphonic acoustic scene dataset show that the proposed ERGL achieves competitive performance on ASC by using only a limited number of embeddings of audio events without any data augmentations.

Acoustic Scene Classification, Graph Representation Learning, +1

Axonal Delay As a Short-Term Memory for Feed Forward Deep Spiking Neural Networks

no code implementations • 20 Apr 2022 • Pengfei Sun, Longwei Zhu, Dick Botteldooren

The information in spiking neural networks (SNNs) is propagated between adjacent biological neurons by spikes, which provides a computing paradigm with the promise of simulating the human brain.

Rule-embedded network for audio-visual voice activity detection in live musical video streams

1 code implementation • 27 Oct 2020 • Yuanbo Hou, Yi Deng, Bilei Zhu, Zejun Ma, Dick Botteldooren

Detecting the anchor's voice in live musical streams is an important preprocessing step for music and speech signal processing.

Sound, Multimedia, Audio and Speech Processing
