Search Results for author: Archontis Politis

Found 23 papers, 10 papers with code

Sound Event Localization and Detection of Overlapping Sources Using Convolutional Recurrent Neural Networks

8 code implementations • 30 Jun 2018 • Sharath Adavanne, Archontis Politis, Joonas Nikunen, Tuomas Virtanen

In this paper, we propose a convolutional recurrent neural network for joint sound event localization and detection (SELD) of multiple overlapping sound events in three-dimensional (3D) space.

Sound Audio and Speech Processing

Localization, Detection and Tracking of Multiple Moving Sound Sources with a Convolutional Recurrent Neural Network

1 code implementation • 29 Apr 2019 • Sharath Adavanne, Archontis Politis, Tuomas Virtanen

This paper investigates the joint localization, detection, and tracking of sound events using a convolutional recurrent neural network (CRNN).

A multi-room reverberant dataset for sound event localization and detection

3 code implementations • 21 May 2019 • Sharath Adavanne, Archontis Politis, Tuomas Virtanen

This paper presents the sound event localization and detection (SELD) task setup for the DCASE 2019 challenge.

Sound Audio and Speech Processing

A Dataset of Reverberant Spatial Sound Scenes with Moving Sources for Sound Event Localization and Detection

2 code implementations • 2 Jun 2020 • Archontis Politis, Sharath Adavanne, Tuomas Virtanen

This report presents the dataset and the evaluation setup of the Sound Event Localization & Detection (SELD) task for the DCASE 2020 Challenge.

Sound Event Localization and Detection

Overview and Evaluation of Sound Event Localization and Detection in DCASE 2019

4 code implementations • 6 Sep 2020 • Archontis Politis, Annamaria Mesaros, Sharath Adavanne, Toni Heittola, Tuomas Virtanen

A large-scale realistic dataset of spatialized sound events was generated for the challenge, to be used for training of learning-based approaches, and for evaluation of the submissions in an unlabeled subset.

Data Augmentation Sound Event Localization and Detection

STARSS22: A dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events

2 code implementations • 4 Jun 2022 • Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Sharath Adavanne, Daniel Krause, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji, Tuomas Virtanen

Additionally, the report presents the baseline system that accompanies the dataset in the challenge, with emphasis on the differences from the baseline of the previous iterations; namely, the introduction of the multi-ACCDOA representation to handle multiple simultaneous occurrences of events of the same class, and support for additional improved input features for the microphone array format.

Ranked #1 on Sound Event Localization and Detection on STARSS22

Sound Event Localization and Detection

A Dataset of Dynamic Reverberant Sound Scenes with Directional Interferers for Sound Event Localization and Detection

1 code implementation • 13 Jun 2021 • Archontis Politis, Sharath Adavanne, Daniel Krause, Antoine Deleforge, Prerak Srivastava, Tuomas Virtanen

This report presents the dataset and baseline of Task 3 of the DCASE2021 Challenge on Sound Event Localization and Detection (SELD).

Sound Event Localization and Detection

Differentiable Tracking-Based Training of Deep Learning Sound Source Localizers

2 code implementations • 29 Oct 2021 • Sharath Adavanne, Archontis Politis, Tuomas Virtanen

Data-based and learning-based sound source localization (SSL) has shown promising results in challenging conditions, and is commonly set as a classification or a regression problem.

Classification Direction of Arrival Estimation +2

STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events

1 code implementation • NeurIPS 2023 • Kazuki Shimada, Archontis Politis, Parthasaarathy Sudarsanam, Daniel Krause, Kengo Uchida, Sharath Adavanne, Aapo Hakala, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen, Yuki Mitsufuji

While direction of arrival (DOA) of sound events is generally estimated from multichannel audio data recorded in a microphone array, sound events usually derive from visually perceptible source objects, e.g., sounds of footsteps come from the feet of a walker.

Sound Event Localization and Detection

Speaker Distance Estimation in Enclosures from Single-Channel Audio

1 code implementation • 26 Mar 2024 • Michael Neri, Archontis Politis, Daniel Krause, Marco Carli, Tuomas Virtanen

Distance estimation from audio plays a crucial role in various applications, such as acoustic scene analysis, sound source localization, and room modeling.

Multichannel Sound Event Detection Using 3D Convolutional Neural Networks for Learning Inter-channel Features

no code implementations • 29 Jan 2018 • Sharath Adavanne, Archontis Politis, Tuomas Virtanen

Each of these datasets has four-channel first-order Ambisonic, binaural, and single-channel versions, on which the performance of SED using the proposed method is compared to study the potential of SED using multichannel audio.

Event Detection Sound Event Detection

Direction of arrival estimation for multiple sound sources using convolutional recurrent neural network

no code implementations • 27 Oct 2017 • Sharath Adavanne, Archontis Politis, Tuomas Virtanen

This paper proposes a deep neural network for estimating the directions of arrival (DOA) of multiple sound sources.

Direction of Arrival Estimation

Deep neural network Based Low-latency Speech Separation with Asymmetric analysis-Synthesis Window Pair

no code implementations • 22 Jun 2021 • Shanshan Wang, Gaurav Naithani, Archontis Politis, Tuomas Virtanen

Time-frequency masking or spectrum prediction computed via short symmetric windows are commonly used in low-latency deep neural network (DNN) based source separation.

Clustering Deep Clustering +2

Mobile Microphone Array Speech Detection and Localization in Diverse Everyday Environments

no code implementations • 28 Jun 2021 • Pasi Pertilä, Emre Cakir, Aapo Hakala, Eemi Fagerlund, Tuomas Virtanen, Archontis Politis, Antti Eronen

Joint sound event localization and detection (SELD) is an integral part of developing context awareness into communication interfaces of mobile robots, smartphones, and home assistants.

Sound Event Localization and Detection

Joint Direction and Proximity Classification of Overlapping Sound Events from Binaural Audio

no code implementations • 26 Jul 2021 • Daniel Aleksander Krause, Archontis Politis, Annamaria Mesaros

Finally, we propose various ways of combining the proximity and direction estimation problems into a joint task providing temporal information about the onsets and offsets of the appearing sources.

Self-supervised Learning of Audio Representations from Audio-Visual Data using Spatial Alignment

no code implementations • 2 Jun 2022 • Shanshan Wang, Archontis Politis, Annamaria Mesaros, Tuomas Virtanen

In addition to the correspondence, AVSA also learns from the spatial location of acoustic and visual content.

Acoustic Scene Classification Action Recognition +6

Position tracking of a varying number of sound sources with sliding permutation invariant training

no code implementations • 26 Oct 2022 • David Diaz-Guerra, Archontis Politis, Tuomas Virtanen

Recent data- and learning-based sound source localization (SSL) methods have shown strong performance in challenging acoustic scenarios.

Position

Multi-Channel Masking with Learnable Filterbank for Sound Source Separation

no code implementations • 14 Mar 2023 • Wang Dai, Archontis Politis, Tuomas Virtanen

Specifically, each mask is used to multiply the corresponding channel's 2D representation, and the masked outputs of all channels are then summed.
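
The mask-and-sum step described in this excerpt can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `mask_and_sum`, the nested-list layout, and the toy values are hypothetical and not taken from the paper, and the learnable filterbank that would produce the 2D representations is not modelled.

```python
# Hypothetical illustration of per-channel masking followed by summation.
# Names, shapes, and values are assumptions, not the authors' implementation.

def mask_and_sum(channels, masks):
    """Multiply each channel's 2D representation by its mask, then sum over channels."""
    if len(channels) != len(masks):
        raise ValueError("need exactly one mask per channel")
    rows, cols = len(channels[0]), len(channels[0][0])
    out = [[0.0] * cols for _ in range(rows)]
    for chan, mask in zip(channels, masks):
        for r in range(rows):
            for c in range(cols):
                # Element-wise masking, accumulated across channels.
                out[r][c] += chan[r][c] * mask[r][c]
    return out


# Two channels, each a 2x2 toy representation, with one mask per channel.
channels = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]
masks = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 2.0], [2.0, 0.0]],
]
separated = mask_and_sum(channels, masks)
```

In a real system the masks would presumably be predicted by a network and applied to array-valued representations; plain lists are used here only to keep the sketch dependency-free.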

Permutation Invariant Recurrent Neural Networks for Sound Source Tracking Applications

no code implementations • 14 Jun 2023 • David Diaz-Guerra, Archontis Politis, Antonio Miguel, Jose R. Beltran, Tuomas Virtanen

Conventional recurrent neural networks (RNNs), such as the long short-term memories (LSTMs) or the gated recurrent units (GRUs), take a vector as their input and use another vector to store their state.

Attention-Driven Multichannel Speech Enhancement in Moving Sound Source Scenarios

no code implementations • 17 Dec 2023 • Yuzhu Wang, Archontis Politis, Tuomas Virtanen

The clean speech clips from WSJ0 are employed for simulating speech signals of moving speakers in a reverberant environment.

Speech Enhancement

Neural Ambisonics encoding for compact irregular microphone arrays

no code implementations • 11 Jan 2024 • Mikko Heikkinen, Archontis Politis, Tuomas Virtanen

Ambisonics encoding of microphone array signals can enable various spatial audio applications, such as virtual reality or telepresence, but it is typically designed for uniformly-spaced spherical microphone arrays.

Perceptually-motivated Spatial Audio Codec for Higher-Order Ambisonics Compression

no code implementations • 24 Jan 2024 • Christoph Hold, Leo McCormack, Archontis Politis, Ville Pulkki

Scene-based spatial audio formats, such as Ambisonics, are playback system agnostic and may therefore be favoured for delivering immersive audio experiences to a wide range of (potentially unknown) devices.

Sound Event Detection and Localization with Distance Estimation

no code implementations • 18 Mar 2024 • Daniel Aleksander Krause, Archontis Politis, Annamaria Mesaros

Sound Event Detection and Localization (SELD) is a combined task of identifying sound events and their corresponding direction-of-arrival (DOA).

Event Detection Sound Event Detection
