Search Results for author: Motoi Omachi

Found 5 papers, 0 papers with code

Speaker Selective Beamformer with Keyword Mask Estimation

no code implementations • 25 Oct 2018 • Yusuke Kida, Dung Tran, Motoi Omachi, Toru Taniguchi, Yuya Fujita

The proposed method first uses a DNN-based mask estimator to separate the mixture signal into the keyword signal uttered by the target speaker and the remaining background speech.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Insertion-Based Modeling for End-to-End Automatic Speech Recognition

no code implementations • 27 May 2020 • Yuya Fujita, Shinji Watanabe, Motoi Omachi, Xuankai Chang

One NAT model, mask-predict, has been applied to ASR, but it needs heuristics or an additional component to estimate the length of the output token sequence.

Audio and Speech Processing Sound

Toward Streaming ASR with Non-Autoregressive Insertion-based Model

no code implementations • 18 Dec 2020 • Yuya Fujita, Tianzi Wang, Shinji Watanabe, Motoi Omachi

We propose a system that concatenates audio segmentation and non-autoregressive ASR to achieve high-accuracy ASR with a low real-time factor (RTF).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

End-to-end ASR to jointly predict transcriptions and linguistic annotations

no code implementations • NAACL 2021 • Motoi Omachi, Yuya Fujita, Shinji Watanabe, Matthew Wiesner

We propose a Transformer-based sequence-to-sequence model for automatic speech recognition (ASR) capable of simultaneously transcribing and annotating audio with linguistic information such as phonemic transcripts or part-of-speech (POS) tags.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

Align, Write, Re-order: Explainable End-to-End Speech Translation via Operation Sequence Generation

no code implementations • 11 Nov 2022 • Motoi Omachi, Brian Yan, Siddharth Dalmia, Yuya Fujita, Shinji Watanabe

To solve this problem, we would like to simultaneously generate automatic speech recognition (ASR) and ST predictions such that each source language word is explicitly mapped to a target language word.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
