The main objective of a spoofing countermeasure system is to detect artifacts in the input speech that are introduced by speech synthesis or voice conversion.
In recent years, various deep learning-based methods have been proposed for extracting a fixed-dimensional embedding vector from speech signals.
Photoplethysmogram (PPG) signal-based blood pressure (BP) estimation is a promising candidate for modern BP measurements, as PPG signals can be easily obtained from wearable devices in a non-invasive manner, allowing quick BP measurement.
This paper describes our submission to Task 1 of the Short-duration Speaker Verification (SdSV) challenge 2020.
In this paper, we propose a simple but powerful unsupervised learning method for speaker recognition, namely Contrastive Equilibrium Learning (CEL), which increases the uncertainty on nuisance factors latent in the embeddings by employing the uniformity loss.
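As a rough illustration of what a uniformity loss measures, the sketch below uses a widely cited formulation: the logarithm of the mean Gaussian potential over all pairs of L2-normalized embeddings. It is an assumption that CEL uses this exact form; the function name `uniformity_loss` and the temperature `t=2.0` are illustrative choices, not taken from the paper.

```python
import numpy as np

def uniformity_loss(emb, t=2.0):
    """Log of the mean pairwise Gaussian potential on the unit hypersphere.

    emb: array of shape (n, d), one embedding per row (n >= 2).
    t:   temperature (hypothetical default, not from the source).
    Lower values mean the embeddings are spread more uniformly.
    """
    # Project every embedding onto the unit hypersphere.
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    # Squared Euclidean distance between every pair of embeddings.
    sq_dist = np.sum((emb[:, None, :] - emb[None, :, :]) ** 2, axis=-1)
    # Keep each unordered pair once (strict upper triangle).
    rows, cols = np.triu_indices(emb.shape[0], k=1)
    return np.log(np.mean(np.exp(-t * sq_dist[rows, cols])))
```

With this definition, embeddings that all collapse to a single point give a loss of 0, and the loss decreases as the points spread apart on the sphere.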
Flow-based generative models are composed of invertible transformations between two random variables of the same dimension.
Ranked #1 on Point Cloud Generation on ShapeNet Airplane
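To make the idea of an invertible transformation between two random variables of the same dimension concrete, here is a minimal sketch of a single elementwise affine flow with a standard normal base distribution. It is not the model from any of the listed papers; the names `forward`, `inverse`, `log_prob`, `log_scale`, and `shift` are hypothetical and chosen only for illustration.

```python
import numpy as np

def forward(x, log_scale, shift):
    """Map data x of shape (n, d) to latent z of the same shape.

    Also returns log|det dz/dx| for each row, which for an elementwise
    affine map is simply the sum of the log scales.
    """
    z = x * np.exp(log_scale) + shift
    log_det = np.full(x.shape[0], np.sum(log_scale))
    return z, log_det

def inverse(z, log_scale, shift):
    """Exact inverse of forward: recover x from z."""
    return (z - shift) * np.exp(-log_scale)

def log_prob(x, log_scale, shift):
    """Change of variables: log p_X(x) = log p_Z(f(x)) + log|det df/dx|."""
    z, log_det = forward(x, log_scale, shift)
    d = x.shape[1]
    # Log density of a d-dimensional standard normal evaluated at z.
    log_pz = -0.5 * np.sum(z ** 2, axis=1) - 0.5 * d * np.log(2.0 * np.pi)
    return log_pz + log_det
```

Because the map is invertible, sampling amounts to drawing z from the base distribution and calling `inverse`, while `log_prob` gives the exact likelihood used for training.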
In recent years, various flow-based generative models have been proposed to generate high-fidelity waveforms in real time.
Analyzing how human beings resolve syntactic ambiguity has long been an issue of interest in the field of linguistics.
For readability and disambiguation of written text, appropriate word segmentation is recommended in documentation, and this also holds for digitized text.
Intention identification is a core issue in dialog management.
This paper proposes a novel feature extraction process for SemEval task 3: Irony detection in English tweets.