no code implementations • 8 Feb 2024 • Karim Helwani, Masahito Togami, Paris Smaragdis, Michael M. Goodwin
In this paper, we present a hybrid classical digital signal processing/deep neural network (DSP/DNN) approach to source separation (SS), highlighting the theoretical link between variational autoencoders and classical approaches to SS.
no code implementations • 1 Feb 2024 • Masahito Togami, Jean-Marc Valin, Karim Helwani, Ritwik Giri, Umut Isik, Michael M. Goodwin
The algorithm runs in real time on 10-ms frames with 40 ms of look-ahead.
1 code implementation • 2 Jun 2021 • Robin Scheibler, Masahito Togami
We propose a generalized formulation of direction of arrival estimation that includes many existing methods such as steered response power, subspace, coherent and incoherent, as well as speech sparsity-based methods.
no code implementations • 21 Apr 2021 • Yusuke Kida, Tatsuya Komatsu, Masahito Togami
Speech-to-text alignment is the problem of splitting long audio recordings with unaligned transcripts into utterance-wise pairs of speech and text.
no code implementations • 12 Feb 2021 • Taishi Nakashima, Robin Scheibler, Masahito Togami, Nobutaka Ono
In this case, we manage to reduce the number of matrix inversions to only one per iteration and source.
no code implementations • 11 Nov 2020 • Robin Scheibler, Masahito Togami
We find that the learnt approximate surrogate generalizes well on mixtures of three and four speakers without any modification.
no code implementations • WS 2018 • Takeshi Homma, Adriano S. Arantes, Maria Teresa Gonzalez Diaz, Masahito Togami
Therefore, the purpose of this study is to maximize SLU performance, especially for small training data sets.