1 code implementation • 1 Jul 2022 • Yeonghyeon Lee, Kangwook Jang, Jahyun Goo, Youngmoon Jung, Hoirin Kim
Our method reduces the model to 23.8% of HuBERT's size and 35.9% of its inference time.
no code implementations • 2 Nov 2020 • Yeunju Choi, Youngmoon Jung, Youngjoo Suh, Hoirin Kim
Although recent neural text-to-speech (TTS) systems have achieved high-quality speech synthesis, there are cases where a TTS system generates low-quality speech, mainly caused by limited training data or information loss during knowledge distillation.
no code implementations • 6 Oct 2020 • Youngmoon Jung, Yeunju Choi, Hyungjun Lim, Hoirin Kim
At the same time, there is an increasing demand for SV systems to be robust to short speech segments, especially in noisy and reverberant environments.
no code implementations • 9 Aug 2020 • Yeunju Choi, Youngmoon Jung, Hoirin Kim
While deep learning has made impressive progress in speech synthesis and voice conversion, the assessment of the synthesized speech is still carried out by human participants.
no code implementations • 16 Jul 2020 • Yeunju Choi, Youngmoon Jung, Hoirin Kim
In this paper, we propose a multi-task learning (MTL) method to improve the performance of a MOS prediction model using the following two auxiliary tasks: spoofing detection (SD) and spoofing type classification (STC).
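The combined objective described above (a main MOS regression loss plus two weighted auxiliary classification losses) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the weights `alpha` and `beta` are illustrative, not values from the paper.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mtl_loss(mos_pred, mos_true, sd_logits, sd_labels, stc_logits, stc_labels,
             alpha=0.5, beta=0.5):
    """MTL objective sketch: MOS regression (MSE) plus two weighted auxiliary
    cross-entropy terms for spoofing detection (SD) and spoofing type
    classification (STC). alpha/beta are illustrative weights."""
    mse = np.mean((mos_pred - mos_true) ** 2)
    n = len(sd_labels)
    sd_ce = -np.mean(np.log(softmax(sd_logits)[np.arange(n), sd_labels]))
    stc_ce = -np.mean(np.log(softmax(stc_logits)[np.arange(n), stc_labels]))
    return mse + alpha * sd_ce + beta * stc_ce

# Example: two utterances, binary SD labels, five hypothetical spoof types.
loss = mtl_loss(np.array([3.0, 4.0]), np.array([3.5, 4.0]),
                np.array([[2.0, 0.0], [0.0, 2.0]]), np.array([0, 1]),
                np.zeros((2, 5)), np.array([0, 1]))
```

The auxiliary heads share the predictor's representation, so gradients from SD and STC regularize the features used for MOS regression.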
no code implementations • 8 May 2020 • Myunghun Jung, Youngmoon Jung, Jahyun Goo, Hoirin Kim
Keyword spotting (KWS) and speaker verification (SV) have been studied independently although it is known that acoustic and speaker domains are complementary.
no code implementations • 7 Apr 2020 • Youngmoon Jung, Seong Min Kye, Yeunju Choi, Myunghun Jung, Hoirin Kim
In this approach, we obtain a speaker embedding vector by pooling single-scale features that are extracted from the last layer of a speaker feature extractor.
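The baseline pooling described here (collapsing frame-level features from the last layer into one utterance-level embedding) is commonly realized as statistics pooling. A minimal sketch, assuming mean-and-standard-deviation pooling rather than any specific extractor from the paper:

```python
import numpy as np

def statistics_pooling(frame_feats):
    """Pool variable-length frame-level features of shape (T, D) into a
    fixed utterance-level vector of shape (2*D,) by concatenating the
    per-dimension mean and standard deviation over time."""
    mean = frame_feats.mean(axis=0)
    std = frame_feats.std(axis=0)
    return np.concatenate([mean, std])

feats = np.random.randn(200, 64)   # 200 frames, 64-dim last-layer features
embedding = statistics_pooling(feats)  # shape (128,)
```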
1 code implementation • 6 Apr 2020 • Seong Min Kye, Youngmoon Jung, Hae Beom Lee, Sung Ju Hwang, Hoirin Kim
By combining these two learning schemes, our model outperforms existing state-of-the-art speaker verification models trained with a standard supervised learning framework on short utterances (1-2 seconds) from the VoxCeleb datasets.
1 code implementation • 27 Mar 2020 • Joohyung Lee, Youngmoon Jung, Hoirin Kim
The results show that the focal loss can improve the performance in various imbalance situations compared to the cross entropy loss, a commonly used loss function in VAD.
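The focal loss referred to here down-weights easy examples so training concentrates on hard (often minority-class) frames. A minimal NumPy sketch of the standard binary focal loss; `gamma` and `alpha` are common defaults, not values from the paper:

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss: scales cross entropy by (1 - p_t)^gamma, where
    p_t is the probability assigned to the true class, so confident
    (easy) frames contribute little to the gradient."""
    p = np.clip(p, eps, 1 - eps)
    p_t = np.where(y == 1, p, 1 - p)          # prob. of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)  # class-balance weight
    return np.mean(-alpha_t * (1 - p_t) ** gamma * np.log(p_t))
```

With `gamma=0` and a symmetric `alpha`, this reduces to (scaled) cross entropy, which is why focal loss is a drop-in replacement in imbalanced VAD training.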
no code implementations • 1 Oct 2019 • Myunghun Jung, Hyungjun Lim, Jahyun Goo, Youngmoon Jung, Hoirin Kim
Acoustic word embeddings (fixed-dimensional vector representations of arbitrary-length words) have attracted increasing interest in query-by-example spoken term detection.
no code implementations • 26 Sep 2019 • Youngmoon Jung, Yeunju Choi, Hoirin Kim
The first approach is soft VAD, which performs a soft selection of frame-level features extracted from a speaker feature extractor.
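The soft selection described here can be sketched as posterior-weighted pooling: rather than hard-dropping non-speech frames, each frame-level feature is weighted by its speech posterior before aggregation. A minimal NumPy illustration, not the authors' implementation:

```python
import numpy as np

def soft_vad_pooling(frame_feats, speech_posteriors):
    """Soft VAD pooling: weight each frame-level feature (T, D) by its
    speech posterior (T,), normalize the weights, and average-pool.
    Non-speech frames are attenuated instead of discarded."""
    w = speech_posteriors / (speech_posteriors.sum() + 1e-8)
    return (w[:, None] * frame_feats).sum(axis=0)
```

Because the weighting is differentiable, the VAD and speaker feature extractor can be trained jointly, unlike a hard frame-dropping front end.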
no code implementations • 19 Jun 2019 • Youngmoon Jung, Younggwan Kim, Hyungjun Lim, Yeunju Choi, Hoirin Kim
Furthermore, we apply deep length normalization by augmenting the loss function with ring loss.
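The ring-loss term mentioned here penalizes the deviation of each embedding's L2 norm from a target radius, giving a soft, learned length normalization. A minimal sketch of the standard formulation; the weight value is illustrative, and in practice `R` is a trainable scalar:

```python
import numpy as np

def ring_loss(embeddings, R, weight=0.01):
    """Ring loss: mean squared deviation of embedding L2 norms (rows of
    `embeddings`, shape (N, D)) from target radius R, scaled by `weight`.
    Added to the main loss, it pulls all embeddings toward norm R."""
    norms = np.linalg.norm(embeddings, axis=1)
    return weight * np.mean((norms - R) ** 2)
```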
no code implementations • 7 Nov 2018 • Hyungjun Lim, Younggwan Kim, Youngmoon Jung, Myunghun Jung, Hoirin Kim
Previous research on acoustic word embeddings for query-by-example spoken term detection has shown remarkable performance improvements from using a triplet network.
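A triplet network trains on (anchor, positive, negative) triples so that same-word embeddings end up closer than different-word embeddings by a margin. A minimal NumPy sketch of the standard triplet loss; the margin value is illustrative:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Triplet loss on embedding batches of shape (N, D): pull same-word
    (anchor, positive) pairs together and push different-word
    (anchor, negative) pairs at least `margin` farther apart."""
    d_ap = np.linalg.norm(anchor - positive, axis=1)
    d_an = np.linalg.norm(anchor - negative, axis=1)
    return np.mean(np.maximum(0.0, d_ap - d_an + margin))
```

The loss is zero once every negative is at least `margin` farther from the anchor than its positive, so only violating triples drive learning.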