1 code implementation • NeurIPS 2023 • Kazuki Shimada, Archontis Politis, Parthasaarathy Sudarsanam, Daniel Krause, Kengo Uchida, Sharath Adavanne, Aapo Hakala, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen, Yuki Mitsufuji
While direction of arrival (DOA) of sound events is generally estimated from multichannel audio data recorded in a microphone array, sound events usually derive from visually perceptible source objects, e. g., sounds of footsteps come from the feet of a walker.
no code implementations • 18 May 2023 • Hao Shi, Kazuki Shimada, Masato Hirano, Takashi Shibuya, Yuichiro Koyama, Zhi Zhong, Shusuke Takahashi, Tatsuya Kawahara, Yuki Mitsufuji
At the decoded feature level, we fuse the two decoded features by generative and predictive decoders.
no code implementations • 10 May 2023 • Masato Hirano, Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Yuki Mitsufuji
We experimentally show that our refiner can provide a clearer harmonic structure of speech and improves the reference-free metric of perceptual quality for arbitrary preceding model architectures.
2 code implementations • 4 Jun 2022 • Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Sharath Adavanne, Daniel Krause, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji, Tuomas Virtanen
Additionally, the report presents the baseline system that accompanies the dataset in the challenge with emphasis on the differences with the baseline of the previous iterations; namely, introduction of the multi-ACCDOA representation to handle multiple simultaneous occurences of events of the same class, and support for additional improved input features for the microphone array format.
Ranked #1 on Sound Event Localization and Detection on STARSS22
no code implementations • 3 Feb 2022 • Johannes Imort, Giorgio Fabbro, Marco A. Martínez Ramírez, Stefan Uhlich, Yuichiro Koyama, Yuki Mitsufuji
Given the recent advances in music source separation and automatic mixing, removing audio effects in music tracks is a meaningful step toward developing an automated remixing system.
2 code implementations • 14 Oct 2021 • Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Naoya Takahashi, Emiru Tsunoo, Yuki Mitsufuji
The multi- ACCDOA format (a class- and track-wise output format) enables the model to solve the cases with overlaps from the same class.
1 code implementation • 12 Oct 2021 • Ricardo Falcon-Perez, Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Yuki Mitsufuji
Data augmentation methods have shown great importance in diverse supervised learning problems where labeled data is scarce or costly to obtain.
no code implementations • 21 Jun 2021 • Kazuki Shimada, Naoya Takahashi, Yuichiro Koyama, Shusuke Takahashi, Emiru Tsunoo, Masafumi Takahashi, Yuki Mitsufuji
This report describes our systems submitted to the DCASE2021 challenge task 3: sound event localization and detection (SELD) with directional interference.
2 code implementations • 29 Oct 2020 • Kazuki Shimada, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji
Conventional NN-based methods use two branches for a sound event detection (SED) target and a direction-of-arrival (DOA) target.
no code implementations • 30 Oct 2019 • Kazuki Shimada, Yuichiro Koyama, Akira Inoue
Few-shot learning systems for sound event recognition have gained interests since they require only a few examples to adapt to new target classes without fine-tuning.