no code implementations • 27 Aug 2021 • Qiongqiong Wang, Kong Aik Lee, Takafumi Koshinaka, Koji Okabe, Hitoshi Yamamoto
It is easy to apply the proposed dual-warping factors approach to any mask-based SE method, and it allows a single SE system to handle multiple tasks without task-dependent training.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
+3
no code implementations • 12 Aug 2021 • Kong Aik Lee, Qiongqiong Wang, Takafumi Koshinaka
We present a Bayesian formulation for deep speaker embedding, wherein the xi-vector is the Bayesian counterpart of the x-vector, taking into account the uncertainty estimate.
no code implementations • 20 Jun 2019 • Ville Vestman, Kong Aik Lee, Tomi H. Kinnunen, Takafumi Koshinaka
In particular, we achieve an acceleration of 3000 times in frame posterior computation compared to real time and 25 times in training the i-vector extractor compared to the CPU baseline from Kaldi toolkit.
no code implementations • 16 Apr 2019 • Kong Aik Lee, Ville Hautamaki, Tomi Kinnunen, Hitoshi Yamamoto, Koji Okabe, Ville Vestman, Jing Huang, Guohong Ding, Hanwu Sun, Anthony Larcher, Rohan Kumar Das, Haizhou Li, Mickael Rouvier, Pierre-Michel Bousquet, Wei Rao, Qing Wang, Chunlei Zhang, Fahimeh Bahmaninezhad, Hector Delgado, Jose Patino, Qiongqiong Wang, Ling Guo, Takafumi Koshinaka, Jiacen Zhang, Koichi Shinoda, Trung Ngo Trong, Md Sahidullah, Fan Lu, Yun Tang, Ming Tu, Kah Kuan Teh, Huy Dat Tran, Kuruvachan K. George, Ivan Kukanov, Florent Desnous, Jichen Yang, Emre Yilmaz, Longting Xu, Jean-Francois Bonastre, Cheng-Lin Xu, Zhi Hao Lim, Eng Siong Chng, Shivesh Ranjan, John H. L. Hansen, Massimiliano Todisco, Nicholas Evans
The I4U consortium was established to facilitate a joint entry to NIST speaker recognition evaluations (SRE).
no code implementations • 26 Dec 2018 • Kong Aik Lee, Qiongqiong Wang, Takafumi Koshinaka
We refer to the model-based adaptation technique proposed in this paper as CORAL+.