Search Results for author: Takuya Higuchi

Found 7 papers, 2 papers with code

Rethinking Non-Negative Matrix Factorization with Implicit Neural Representations

1 code implementation • 5 Apr 2024 • Krishna Subramani, Paris Smaragdis, Takuya Higuchi, Mehrez Souden

Non-negative Matrix Factorization (NMF) is a powerful technique for analyzing regularly-sampled data, i.e., data that can be stored in a matrix.

ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models

2 code implementations • 30 Jan 2024 • Jee-weon Jung, Wangyou Zhang, Jiatong Shi, Zakaria Aldeneh, Takuya Higuchi, Barry-John Theobald, Ahmed Hussen Abdelaziz, Shinji Watanabe

First, we provide an open-source platform for researchers in the speaker recognition community to effortlessly build models.

Ranked #1 on Speaker Verification on VoxCeleb (using extra training data)

Self-Supervised Learning, Speaker Recognition, +1

Does Single-channel Speech Enhancement Improve Keyword Spotting Accuracy? A Case Study

no code implementations • 27 Sep 2023 • Avamarie Brueggeman, Takuya Higuchi, Masood Delfarah, Stephen Shum, Vineet Garg

Our investigation reveals that speech enhancement (SE) can improve keyword spotting (KWS) accuracy on noisy speech when the backend model is trained on clean speech; however, despite our extensive exploration, it is difficult to improve the KWS accuracy with SE when the backend is trained on noisy speech.

Automatic Speech Recognition, Keyword Spotting, +3

Multichannel Voice Trigger Detection Based on Transform-average-concatenate

no code implementations • 27 Sep 2023 • Takuya Higuchi, Avamarie Brueggeman, Masood Delfarah, Stephen Shum

Voice triggering (VT) enables users to activate their devices by just speaking a trigger phrase.

Speech Enhancement

Improving Voice Trigger Detection with Metric Learning

no code implementations • 5 Apr 2022 • Prateeth Nayak, Takuya Higuchi, Anmol Gupta, Shivesh Ranjan, Stephen Shum, Siddharth Sigtia, Erik Marchi, Varun Lakshminarasimhan, Minsik Cho, Saurabh Adya, Chandra Dhir, Ahmed Tewfik

A detector is typically trained on speech data independent of speaker information and used for the voice trigger detection task.

Metric Learning

Multi-task Learning with Cross Attention for Keyword Spotting

no code implementations • 15 Jul 2021 • Takuya Higuchi, Anmol Gupta, Chandra Dhir

In this approach, an output of an acoustic model is split into two branches for the two tasks, one for phoneme transcription trained with the ASR data and one for keyword classification trained with the KWS data.

Automatic Speech Recognition (ASR), +3

Dynamic curriculum learning via data parameters for noise robust keyword spotting

no code implementations • 18 Feb 2021 • Takuya Higuchi, Shreyas Saxena, Mehrez Souden, Tien Dung Tran, Masood Delfarah, Chandra Dhir

We propose dynamic curriculum learning via data parameters for noise robust keyword spotting.

Keyword Spotting
