Search Results for author: Shoko Araki

Found 17 papers, 4 papers with code

Mask-based Neural Beamforming for Moving Speakers with Self-Attention-based Tracking

no code implementations • 7 May 2022 • Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Shoko Araki

We thus introduce a learning-based framework that computes optimal attention weights for beamforming.

Blind and neural network-guided convolutional beamformer for joint denoising, dereverberation, and source separation

no code implementations • 4 Aug 2021 • Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Hiroshi Sawada, Shoko Araki

This paper proposes an approach for optimizing a Convolutional BeamFormer (CBF) that can jointly perform denoising (DN), dereverberation (DR), and source separation (SS).

Automatic Speech Recognition, Denoising

Few-shot learning of new sound classes for target sound extraction

no code implementations • 14 Jun 2021 • Marc Delcroix, Jorge Bennasar Vázquez, Tsubasa Ochiai, Keisuke Kinoshita, Shoko Araki

Target sound extraction consists of extracting the sound of a target acoustic event (AE) class from a mixture of AE sounds.

Few-Shot Learning

PILOT: Introducing Transformers for Probabilistic Sound Event Localization

1 code implementation • 7 Jun 2021 • Christopher Schymura, Benedikt Bönninghoff, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Dorothea Kolossa

Sound event localization aims at estimating the positions of sound sources in the environment with respect to an acoustic receiver (e.g., a microphone array).

Event Detection

Comparison of remote experiments using crowdsourcing and laboratory experiments on speech intelligibility

no code implementations • 17 Apr 2021 • Ayako Yamamoto, Toshio Irino, Kenichi Arai, Shoko Araki, Atsunori Ogawa, Keisuke Kinoshita, Tomohiro Nakatani

Many subjective experiments have been performed to develop objective speech intelligibility measures, but the novel coronavirus outbreak has made it very difficult to conduct experiments in a laboratory.

Speech Enhancement

Exploiting Attention-based Sequence-to-Sequence Architectures for Sound Event Localization

1 code implementation • 28 Feb 2021 • Christopher Schymura, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Dorothea Kolossa

Herein, attentions allow for capturing temporal dependencies in the audio signal by focusing on specific frames that are relevant for estimating the activity and direction-of-arrival of sound events at the current time-step.

Automatic Speech Recognition

Multimodal Attention Fusion for Target Speaker Extraction

no code implementations • 2 Feb 2021 • Hiroshi Sato, Tsubasa Ochiai, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Shoko Araki

Recently, audio-visual target speaker extraction has been proposed, which extracts target speech by using complementary audio and visual clues.

Neural Network-based Virtual Microphone Estimator

no code implementations • 12 Jan 2021 • Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Shoko Araki

Developing microphone array technologies for a small number of microphones is important due to the constraints of many devices.

Speech Enhancement

Block Coordinate Descent Algorithms for Auxiliary-Function-Based Independent Vector Extraction

no code implementations • 18 Oct 2020 • Rintaro Ikeshita, Tomohiro Nakatani, Shoko Araki

We also newly develop a BCD for a semiblind IVE in which the transfer functions for several super-Gaussian sources are given a priori.

Listen to What You Want: Neural Network-based Universal Sound Selector

no code implementations • 10 Jun 2020 • Tsubasa Ochiai, Marc Delcroix, Yuma Koizumi, Hiroaki Ito, Keisuke Kinoshita, Shoko Araki

In this paper, we propose instead a universal sound selection neural network that can directly select AE sounds from a mixture given user-specified target AE classes.

Tackling real noisy reverberant meetings with all-neural source separation, counting, and diarization system

no code implementations • 9 Mar 2020 • Keisuke Kinoshita, Marc Delcroix, Shoko Araki, Tomohiro Nakatani

Automatic meeting analysis is an essential fundamental technology required to let, e.g., smart devices follow and respond to our conversations.

Speaker Diarization, Speech Enhancement

Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam

1 code implementation • 23 Jan 2020 • Marc Delcroix, Tsubasa Ochiai, Katerina Zmolikova, Keisuke Kinoshita, Naohiro Tawara, Tomohiro Nakatani, Shoko Araki

First, we propose a time-domain implementation of SpeakerBeam similar to that proposed for a time-domain audio separation network (TasNet), which has achieved state-of-the-art performance for speech separation.

Speaker Identification, Speech Extraction

All-neural online source separation, counting, and diarization for meeting analysis

no code implementations • 21 Feb 2019 • Thilo von Neumann, Keisuke Kinoshita, Marc Delcroix, Shoko Araki, Tomohiro Nakatani, Reinhold Haeb-Umbach

While significant progress has been made on individual tasks, this paper presents for the first time an all-neural approach to simultaneous speaker counting, diarization and source separation.

Automatic Speech Recognition, Speaker Diarization
