Search Results for author: Wim Boes

Found 8 papers, 0 papers with code

Impact of visual assistance for automated audio captioning

no code implementations • 18 Nov 2022 • Wim Boes, Hugo Van hamme

More specifically, visual features focusing on semantics appear appropriate in the context of automated audio captioning, while for sound event detection, time information seems to be more important.

Audio captioning Event Detection +2


Optimizing Temporal Resolution Of Convolutional Recurrent Neural Networks For Sound Event Detection

no code implementations • 18 Oct 2022 • Wim Boes, Hugo Van hamme

This is significantly better than the performance obtained by the baseline model (0.527), which can effectively be attributed to the changes that were applied to the pooling operations of the network.

Event Detection Sound Event Detection +1


Multi-Source Transformer Architectures for Audiovisual Scene Classification

no code implementations • 18 Oct 2022 • Wim Boes, Hugo Van hamme

With regard to the accuracy measure, our best model achieved a score of 77.1% on the validation data, which is about the same as the performance obtained by the baseline system (77.0%).

Classification Scene Classification


Multi-encoder attention-based architectures for sound recognition with partial visual assistance

no code implementations • 26 Sep 2022 • Wim Boes, Hugo Van hamme

Large-scale sound recognition data sets typically consist of acoustic recordings obtained from multimedia libraries.

Audio Tagging Event Detection +1


Impact of temporal resolution on convolutional recurrent networks for audio tagging and sound event detection

no code implementations • 26 Sep 2022 • Wim Boes, Hugo Van hamme

Many state-of-the-art systems for audio tagging and sound event detection employ convolutional recurrent neural architectures.

Audio Tagging Event Detection +2


On the long-term learning ability of LSTM LMs

no code implementations • 16 Jun 2021 • Wim Boes, Robbe Van Rompaey, Lyan Verwimp, Joris Pelemans, Hugo Van hamme, Patrick Wambacq

We inspect the long-term learning ability of Long Short-Term Memory language models (LSTM LMs) by evaluating a contextual extension based on the Continuous Bag-of-Words (CBOW) model for both sentence- and discourse-level LSTM LMs and by analyzing its performance.

Sentence


Audiovisual transfer learning for audio tagging and sound event detection

no code implementations • 9 Jun 2021 • Wim Boes, Hugo Van hamme

We study the merit of transfer learning for two sound recognition problems, i.e., audio tagging and sound event detection.

Audio Tagging Event Detection +2


Audiovisual Transformer Architectures for Large-Scale Classification and Synchronization of Weakly Labeled Audio Events

no code implementations • 2 Dec 2019 • Wim Boes, Hugo Van hamme

We tackle the task of environmental event classification by drawing inspiration from the transformer neural network architecture used in machine translation.

General Classification Machine Translation +1

