Search Results for author: Wim Boes

Found 8 papers, 0 papers with code

Impact of visual assistance for automated audio captioning

no code implementations • 18 Nov 2022 • Wim Boes, Hugo Van hamme

More specifically, visual features focusing on semantics appear appropriate in the context of automated audio captioning, while for sound event detection, time information seems to be more important.

Audio captioning Event Detection +2

Optimizing Temporal Resolution Of Convolutional Recurrent Neural Networks For Sound Event Detection

no code implementations • 18 Oct 2022 • Wim Boes, Hugo Van hamme

This is significantly better than the performance obtained by the baseline model (0.527), which can effectively be attributed to the changes that were applied to the pooling operations of the network.

Event Detection Sound Event Detection +1

Multi-Source Transformer Architectures for Audiovisual Scene Classification

no code implementations • 18 Oct 2022 • Wim Boes, Hugo Van hamme

With regard to the accuracy measure, our best model achieved a score of 77.1% on the validation data, which is about the same as the performance obtained by the baseline system (77.0%).

Classification Scene Classification

Multi-encoder attention-based architectures for sound recognition with partial visual assistance

no code implementations • 26 Sep 2022 • Wim Boes, Hugo Van hamme

Large-scale sound recognition data sets typically consist of acoustic recordings obtained from multimedia libraries.

Audio Tagging Event Detection +1

Impact of temporal resolution on convolutional recurrent networks for audio tagging and sound event detection

no code implementations • 26 Sep 2022 • Wim Boes, Hugo Van hamme

Many state-of-the-art systems for audio tagging and sound event detection employ convolutional recurrent neural architectures.

Audio Tagging Event Detection +2

On the long-term learning ability of LSTM LMs

no code implementations • 16 Jun 2021 • Wim Boes, Robbe Van Rompaey, Lyan Verwimp, Joris Pelemans, Hugo Van hamme, Patrick Wambacq

We inspect the long-term learning ability of Long Short-Term Memory language models (LSTM LMs) by evaluating a contextual extension based on the Continuous Bag-of-Words (CBOW) model for both sentence- and discourse-level LSTM LMs and by analyzing its performance.

Sentence

Audiovisual transfer learning for audio tagging and sound event detection

no code implementations • 9 Jun 2021 • Wim Boes, Hugo Van hamme

We study the merit of transfer learning for two sound recognition problems, i.e., audio tagging and sound event detection.

Audio Tagging Event Detection +2

Audiovisual Transformer Architectures for Large-Scale Classification and Synchronization of Weakly Labeled Audio Events

no code implementations • 2 Dec 2019 • Wim Boes, Hugo Van hamme

We tackle the task of environmental event classification by drawing inspiration from the transformer neural network architecture used in machine translation.

General Classification Machine Translation +1
