Search Results for author: Parthasaarathy Sudarsanam

Found 3 papers, 2 papers with code

STARSS22: A dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events

2 code implementations4 Jun 2022 Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Sharath Adavanne, Daniel Krause, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji, Tuomas Virtanen

Additionally, the report presents the baseline system that accompanies the dataset in the challenge with emphasis on the differences with the baseline of the previous iterations; namely, introduction of the multi-ACCDOA representation to handle multiple simultaneous occurences of events of the same class, and support for additional improved input features for the microphone array format.

Sound Event Localization and Detection

Clotho-AQA: A Crowdsourced Dataset for Audio Question Answering

no code implementations20 Apr 2022 Samuel Lipping, Parthasaarathy Sudarsanam, Konstantinos Drossos, Tuomas Virtanen

Audio question answering (AQA) is a multimodal translation task where a system analyzes an audio signal and a natural language question, to generate a desirable natural language answer.

Audio Question Answering Question Answering

Improving Voice Separation by Incorporating End-to-end Speech Recognition

1 code implementation29 Nov 2019 Naoya Takahashi, Mayank Kumar Singh, Sakya Basak, Parthasaarathy Sudarsanam, Sriram Ganapathy, Yuki Mitsufuji

Despite recent advances in voice separation methods, many challenges remain in realistic scenarios such as noisy recording and the limits of available data.

Automatic Speech Recognition speech-recognition +2

Cannot find the paper you are looking for? You can Submit a new open access paper.