no code implementations • 27 May 2022 • Soheil Khorram, Jaeyoung Kim, Anshuman Tripathi, Han Lu, Qian Zhang, Hasim Sak
This paper introduces the contrastive siamese (c-siam) network, an architecture for leveraging unlabeled acoustic data in speech recognition.
no code implementations • 15 Oct 2019 • Salar Jafarlou, Soheil Khorram, Vinay Kothapally, John H. L. Hansen
In the present study, we address this issue by investigating variants of large receptive field CNNs (LRF-CNNs) which include deeply recursive networks, dilated convolutional neural networks, and stacked hourglass networks.
no code implementations • 1 Oct 2019 • Shahram Ghorbani, Soheil Khorram, John H. L. Hansen
An obvious approach to leveraging data from a new domain (e.g., new accented speech) is to first generate a comprehensive dataset of all domains by combining all available data, and then use this dataset to retrain the acoustic models.
no code implementations • 4 Aug 2019 • Midia Yousefi, Soheil Khorram, John H. L. Hansen
The recently proposed Permutation Invariant Training (PIT) addresses this problem by determining the output-label assignment which minimizes the separation error.
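As a rough illustration of the assignment step described above, the sketch below tries every output-to-label permutation and keeps the one with the lowest error. It is not the paper's implementation: the function names `mse` and `pit_assignment`, the use of mean squared error as the separation error, and the toy signals are all assumptions made for this example.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pit_assignment(estimates, references):
    """Return the permutation and loss with the lowest average error.

    perm[i] is the index of the reference assigned to estimates[i].
    Exhaustive search, so the cost grows factorially with the number of sources.
    """
    n = len(references)
    best_perm, best_loss = None, float("inf")
    for perm in permutations(range(n)):
        # Average the per-source error under this candidate assignment.
        loss = sum(mse(estimates[i], references[perm[i]]) for i in range(n)) / n
        if loss < best_loss:
            best_perm, best_loss = perm, loss
    return best_perm, best_loss


# Hypothetical toy data: the two outputs are in the opposite order to the labels.
estimates = [[0.9, 0.1, 0.0], [0.0, 1.1, 2.0]]
references = [[0.0, 1.0, 2.0], [1.0, 0.0, 0.0]]
perm, loss = pit_assignment(estimates, references)
print(perm, loss)
```

In this toy case the search assigns the first output to the second label and vice versa, since that pairing gives the smaller error.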
no code implementations • 5 Jul 2019 • Soheil Khorram, Melvin G McInnis, Emily Mower Provost
To deal with this challenge, we introduce a new convolutional neural network (multi-delay sinc network) that is able to simultaneously align and predict labels in an end-to-end manner.
no code implementations • 3 Jul 2019 • Nursadul Mamun, Soheil Khorram, John H. L. Hansen
To improve speech enhancement methods for CI users, we propose to perform speech enhancement in a cochlear filter-bank feature space, a feature-set specifically designed for CI users based on CI auditory stimuli.
1 code implementation • 21 Mar 2019 • Soheil Khorram, Melvin G McInnis, Emily Mower Provost
We introduce trainable time warping (TTW), whose complexity is linear in both the number and the length of time-series.
no code implementations • 23 Aug 2017 • Soheil Khorram, Zakaria Aldeneh, Dimitrios Dimitriadis, Melvin McInnis, Emily Mower Provost
The goal of continuous emotion recognition is to assign an emotion value to every frame in a sequence of acoustic features.
1 code implementation • 10 Jun 2017 • John Gideon, Soheil Khorram, Zakaria Aldeneh, Dimitrios Dimitriadis, Emily Mower Provost
Many paralinguistic tasks are closely related and thus representations learned in one domain can be leveraged for another.