Search Results for author: John H. L. Hansen

Found 22 papers, 1 paper with code

Single-channel speech separation using Soft-minimum Permutation Invariant Training

no code implementations16 Nov 2021 Midia Yousefi, John H. L. Hansen

A long-lasting problem in supervised speech separation is finding the correct label for each separated speech signal, referred to as label permutation ambiguity.

Speech Separation

Real-time Speaker counting in a cocktail party scenario using Attention-guided Convolutional Neural Network

no code implementations30 Oct 2021 Midia Yousefi, John H. L. Hansen

Most current speech technology systems are designed to operate well even in the presence of multiple active speakers.

Scenario Aware Speech Recognition: Advancements for Apollo Fearless Steps & CHiME-4 Corpora

no code implementations23 Sep 2021 Szu-Jui Chen, Wei Xia, John H. L. Hansen

With additional techniques such as pronunciation and silence probability modeling, plus multi-style training, we achieve a +5.42% and +3.18% relative WER improvement for the development and evaluation sets of the Fearless Steps Corpus.

Speech Recognition
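
For reference, "relative WER improvement" as quoted above is the WER reduction expressed as a fraction of the baseline WER. The sketch below uses hypothetical baseline numbers chosen only to reproduce the +5.42% figure; they are not from the paper.

```python
def relative_wer_improvement(baseline_wer, new_wer):
    """Relative WER improvement: the reduction in word error rate
    expressed as a percentage of the baseline WER."""
    return (baseline_wer - new_wer) / baseline_wer * 100.0

# Hypothetical example: a baseline WER of 20.0% reduced to 18.916%
# is a +5.42% relative improvement (a 1.084-point absolute drop).
print(round(relative_wer_improvement(20.0, 18.916), 2))
```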

DEAAN: Disentangled Embedding and Adversarial Adaptation Network for Robust Speaker Representation Learning

no code implementations12 Dec 2020 Mufan Sang, Wei Xia, John H. L. Hansen

Although speaker verification has achieved significant performance improvements with the development of deep neural networks, domain mismatch remains a challenging problem in this field.

Domain Adaptation, Representation Learning +1

Respiratory Distress Detection from Telephone Speech using Acoustic and Prosodic Features

no code implementations15 Nov 2020 Meemnur Rashid, Kaisar Ahmed Alman, Khaled Hasan, John H. L. Hansen, Taufiq Hasan

To capture these variations, we utilize a set of well-known acoustic and prosodic features with a Support Vector Machine (SVM) classifier for detecting the presence of respiratory distress.
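
As a toy illustration of the kind of frame-level measurements such a pipeline starts from (the paper's actual feature set and SVM configuration are not reproduced here), short-time energy and zero-crossing rate can be computed like this:

```python
def acoustic_features(frame):
    """Toy frame-level features: short-time energy and zero-crossing
    rate. Illustrative only; the paper uses a richer set of acoustic
    and prosodic descriptors fed to an SVM classifier."""
    n = len(frame)
    energy = sum(x * x for x in frame) / n
    zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / (n - 1)
    return [energy, zcr]
```

Feature vectors like these, aggregated per utterance, would then be fed to an off-the-shelf SVM (e.g. scikit-learn's `svm.SVC`).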

Open-set Short Utterance Forensic Speaker Verification using Teacher-Student Network with Explicit Inductive Bias

no code implementations21 Sep 2020 Mufan Sang, Wei Xia, John H. L. Hansen

In forensic applications, it is very common that only small naturalistic datasets consisting of short utterances in complex or unknown acoustic environments are available.

Fine-tuning, Knowledge Distillation +1
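
A teacher-student objective of the general kind named in the tags can be sketched as a cross-entropy between temperature-softened teacher and student distributions. This is a generic knowledge-distillation recipe, not necessarily the exact loss used in the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    exps = [math.exp(l / temperature) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Knowledge-distillation sketch: cross-entropy between the
    softened teacher distribution and the softened student
    distribution. Minimized when the student matches the teacher."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return -sum(ti * math.log(si) for ti, si in zip(t, s))
```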

Cross-domain Adaptation with Discrepancy Minimization for Text-independent Forensic Speaker Verification

no code implementations5 Sep 2020 Zhenyu Wang, Wei Xia, John H. L. Hansen

Forensic audio analysis for speaker verification offers unique challenges due to location/scenario uncertainty and diversity mismatch between reference and naturalistic field recordings.

Domain Adaptation, Speaker Verification

Speaker Representation Learning using Global Context Guided Channel and Time-Frequency Transformations

no code implementations2 Sep 2020 Wei Xia, John H. L. Hansen

In this study, we propose the global context guided channel and time-frequency transformations to model the long-range, non-local time-frequency dependencies and channel variances in speaker representations.

Representation Learning, Speaker Verification
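
One common way to realize a global-context-guided channel transformation is squeeze-and-excitation-style gating: pool each channel over the whole time-frequency plane, derive a gate from the pooled statistic, and rescale the channel. The sketch below is an assumption about the general mechanism, not the paper's exact formulation.

```python
import math

def global_context_channel_gate(feature_map):
    """Squeeze-and-excitation-style channel gating (sketch).
    feature_map: channels x time x freq, as nested lists.
    Each channel is pooled over all time-frequency bins, passed
    through a sigmoid to form a gate, and rescaled by that gate."""
    gates = []
    for ch in feature_map:
        pooled = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        gates.append(1.0 / (1.0 + math.exp(-pooled)))  # sigmoid gate
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(feature_map, gates)]
```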

Sensor Fusion of Camera and Cloud Digital Twin Information for Intelligent Vehicles

no code implementations8 Jul 2020 Yongkang Liu, Ziran Wang, Kyungtae Han, Zhenyu Shou, Prashant Tiwari, John H. L. Hansen

With the rapid development of intelligent vehicles and Advanced Driving Assistance Systems (ADAS), mixed levels of human driver engagement are involved in the transportation system.

Sensor Fusion

A Unified Framework for Speech Separation

no code implementations17 Dec 2019 Fahimeh Bahmaninezhad, Shi-Xiong Zhang, Yong Xu, Meng Yu, John H. L. Hansen, Dong Yu

The initial solutions introduced for deep-learning-based speech separation analyzed speech signals in the time-frequency domain with the STFT; the encoded mixed signals were then fed into a deep neural network based separator.

Speech Separation

Analyzing Large Receptive Field Convolutional Networks for Distant Speech Recognition

no code implementations15 Oct 2019 Salar Jafarlou, Soheil Khorram, Vinay Kothapally, John H. L. Hansen

In the present study, we address this issue by investigating variants of large receptive field CNNs (LRF-CNNs) which include deeply recursive networks, dilated convolutional neural networks, and stacked hourglass networks.

automatic-speech-recognition, Distant Speech Recognition
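
For intuition on why dilated stacks enlarge the receptive field cheaply: with stride 1, each layer adds (kernel − 1) × dilation samples to the receptive field. The layer sizes below are illustrative, not taken from the paper.

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field (in samples) of a stack of stride-1 dilated
    1-D convolutions: each layer contributes (kernel - 1) * dilation."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf

# e.g. three 3-tap layers with dilations 1, 2, 4 cover 15 samples,
# versus 7 samples for the same stack without dilation.
print(receptive_field([3, 3, 3], [1, 2, 4]))
```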

Domain Expansion in DNN-based Acoustic Models for Robust Speech Recognition

no code implementations1 Oct 2019 Shahram Ghorbani, Soheil Khorram, John H. L. Hansen

An obvious approach to leverage data from a new domain (e.g., new accented speech) is to first generate a comprehensive dataset of all domains, by combining all available data, and then use this dataset to retrain the acoustic models.

Robust Speech Recognition

Probabilistic Permutation Invariant Training for Speech Separation

no code implementations4 Aug 2019 Midia Yousefi, Soheil Khorram, John H. L. Hansen

Recently proposed Permutation Invariant Training (PIT) addresses this problem by determining the output-label assignment which minimizes the separation error.

Speech Separation
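
The min-over-permutations idea behind PIT can be sketched directly: score every output-to-reference assignment and keep the cheapest. This is a simplified utterance-level sketch with a plain MSE separation error, not the paper's probabilistic formulation.

```python
from itertools import permutations

def mse(a, b):
    """Mean squared error between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def pit_loss(estimates, references):
    """Permutation Invariant Training loss (sketch): evaluate the
    separation error under every output-to-reference assignment and
    return the minimum, along with the chosen permutation."""
    best_loss, best_perm = None, None
    for perm in permutations(range(len(references))):
        loss = sum(mse(estimates[i], references[p])
                   for i, p in enumerate(perm)) / len(perm)
        if best_loss is None or loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm

# Two toy "sources" whose estimates come out in swapped order:
refs = [[1.0, 1.0], [0.0, 0.0]]
ests = [[0.1, 0.1], [0.9, 1.1]]
loss, perm = pit_loss(ests, refs)  # PIT picks the swapped assignment
```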

Convolutional Neural Network-based Speech Enhancement for Cochlear Implant Recipients

no code implementations3 Jul 2019 Nursadul Mamun, Soheil Khorram, John H. L. Hansen

To improve speech enhancement methods for CI users, we propose to perform speech enhancement in a cochlear filter-bank feature space, a feature-set specifically designed for CI users based on CI auditory stimuli.

Speech Enhancement

Exploring OpenStreetMap Availability for Driving Environment Understanding

no code implementations11 Mar 2019 Yang Zheng, Izzat H. Izzat, John H. L. Hansen

An intelligent vehicle should be able to understand the driver's perception of the environment as well as the driver's controlling behavior of the vehicle.

Autonomous Driving, Semantic Segmentation

UTD-CRSS Systems for 2016 NIST Speaker Recognition Evaluation

no code implementations24 Oct 2016 Chunlei Zhang, Fahimeh Bahmaninezhad, Shivesh Ranjan, Chengzhu Yu, Navid Shokouhi, John H. L. Hansen

This document briefly describes the systems submitted by the Center for Robust Speech Systems (CRSS) from The University of Texas at Dallas (UTD) to the 2016 National Institute of Standards and Technology (NIST) Speaker Recognition Evaluation (SRE).

Dimensionality Reduction, Speaker Recognition

KU-ISPL Language Recognition System for NIST 2015 i-Vector Machine Learning Challenge

no code implementations21 Sep 2016 Suwon Shon, Seongkyu Mun, John H. L. Hansen, Hanseok Ko

The experimental results show that the use of duration and score fusion improves language recognition performance by 5% relative in LRiMLC15 cost.
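
Score fusion of the kind mentioned can be as simple as a convex combination of two systems' scores. The sketch below is a generic recipe with a hypothetical interpolation weight, not the calibrated fusion used in the KU-ISPL submission.

```python
def fuse_scores(scores_a, scores_b, weight=0.5):
    """Linear score fusion: combine two systems' per-class scores
    with a fixed interpolation weight in [0, 1]."""
    return [weight * a + (1.0 - weight) * b
            for a, b in zip(scores_a, scores_b)]

# Hypothetical per-language scores from two systems:
fused = fuse_scores([1.0, 0.0], [0.0, 1.0], weight=0.75)
```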
