Search Results for author: Nam Soo Kim

Found 41 papers, 15 papers with code

Giving Space to Your Message: Assistive Word Segmentation for the Electronic Typing of Digital Minorities

1 code implementation31 Oct 2018 Won Ik Cho, Sung Jun Cheon, Woo Hyun Kang, Ji Won Kim, Nam Soo Kim

Appropriate word segmentation is recommended for the readability and disambiguation of written text, and the same holds for digitized texts.

Segmentation
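The paper's assistive spacing model is neural, but the task itself can be illustrated with a classical baseline: greedy longest-match segmentation of an unspaced string against a vocabulary. The vocabulary and input below are toy stand-ins, not from the paper.

```python
def max_match(text, vocab):
    # Greedy longest-match segmentation: at each position, take the
    # longest vocabulary word; fall back to a single character so the
    # scan always advances.
    words, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in vocab or j == i + 1:
                words.append(text[i:j])
                i = j
                break
    return words

vocab = {"giving", "space", "to", "your", "message"}
words = max_match("givingspacetoyourmessage", vocab)
```

This baseline fails on out-of-vocabulary words and ambiguous boundaries, which is precisely what motivates learned segmentation models.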

Speech Intention Understanding in a Head-final Language: A Disambiguation Utilizing Intonation-dependency

2 code implementations10 Nov 2018 Won Ik Cho, Hyeon Seung Lee, Ji Won Yoon, Seok Min Kim, Nam Soo Kim

This paper proposes a system that identifies the inherent intention of a spoken utterance given its transcript, in some cases using auxiliary acoustic features.

Sentence

On Measuring Gender Bias in Translation of Gender-neutral Pronouns

1 code implementation WS 2019 Won Ik Cho, Ji Won Kim, Seok Min Kim, Nam Soo Kim

However, the detection and evaluation of gender bias in machine translation systems have not yet been thoroughly investigated, as the task is cross-lingual and challenging to define.

Ethics Image Captioning +3

Investigating an Effective Character-level Embedding in Korean Sentence Classification

2 code implementations31 May 2019 Won Ik Cho, Seok Min Kim, Nam Soo Kim

Unlike the writing systems of many Romance and Germanic languages, some languages or language families show complex conjunct forms in character composition.

Classification General Classification +3

Text Matters but Speech Influences: A Computational Analysis of Syntactic Ambiguity Resolution

1 code implementation21 Oct 2019 Won Ik Cho, Jeonghwa Cho, Woo Hyun Kang, Nam Soo Kim

Analyzing how human beings resolve syntactic ambiguity has long been an issue of interest in the field of linguistics.

Sentence Spoken Language Understanding

Machines Getting with the Program: Understanding Intent Arguments of Non-Canonical Directives

1 code implementation Findings of the Association for Computational Linguistics 2020 Won Ik Cho, Young Ki Moon, Sangwhan Moon, Seok Min Kim, Nam Soo Kim

Modern dialog managers face the challenge of having to fulfill human-level conversational skills as part of common user expectations, including but not limited to discourse with no clear objective.

Discourse Component to Sentence (DC2S): An Efficient Human-Aided Construction of Paraphrase and Sentence Similarity Dataset

no code implementations LREC 2020 Won Ik Cho, Jong In Kim, Young Ki Moon, Nam Soo Kim

Assessing the similarity of sentences and detecting paraphrases is an essential task both in theory and practice, but building a reliable dataset requires substantial resources.

Natural Language Inference Paraphrase Generation +2

Towards an Efficient Code-Mixed Grapheme-to-Phoneme Conversion in an Agglutinative Language: A Case Study on To-Korean Transliteration

no code implementations LREC 2020 Won Ik Cho, Seok Min Kim, Nam Soo Kim

Code-mixed grapheme-to-phoneme (G2P) conversion is a crucial issue for modern speech recognition and synthesis tasks, but has seldom been investigated at the sentence level in the literature.

Philosophy Sentence +3

Speech to Text Adaptation: Towards an Efficient Cross-Modal Distillation

no code implementations17 May 2020 Won Ik Cho, Dong-Hyun Kwak, Ji Won Yoon, Nam Soo Kim

We transfer the knowledge from a concrete Transformer-based text LM to an SLU module which can face a data shortage, based on recent cross-modal distillation methodologies.

Computational Efficiency speech-recognition +2

WaveNODE: A Continuous Normalizing Flow for Speech Synthesis

1 code implementation8 Jun 2020 Hyeongju Kim, Hyeonseung Lee, Woo Hyun Kang, Sung Jun Cheon, Byoung Jin Choi, Nam Soo Kim

In recent years, various flow-based generative models have been proposed to generate high-fidelity waveforms in real-time.

Speech Synthesis

TutorNet: Towards Flexible Knowledge Distillation for End-to-End Speech Recognition

no code implementations3 Aug 2020 Ji Won Yoon, Hyeonseung Lee, Hyung Yong Kim, Won Ik Cho, Nam Soo Kim

To reduce this computational burden, knowledge distillation (KD), which is a popular model compression method, has been used to transfer knowledge from a deep and complex model (teacher) to a shallower and simpler model (student).

Knowledge Distillation Model Compression +3
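TutorNet builds on the standard knowledge distillation setup the excerpt describes: a student is trained to match a teacher's temperature-softened output distribution. A minimal sketch of that core loss (Hinton-style KL on softened logits, not the paper's specific TutorNet objective; the logits below are toy values):

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T yields softer distributions.
    m = max(logits)
    exps = [math.exp((z - m) / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 as in the classic distillation formulation.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return T * T * kl

loss = distillation_loss([1.0, 0.5, -0.2], [2.0, 0.1, -1.0])
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge.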

Robust Text-Dependent Speaker Verification via Character-Level Information Preservation for the SdSV Challenge 2020

no code implementations22 Oct 2020 Sung Hwan Mun, Woo Hyun Kang, Min Hyun Han, Nam Soo Kim

This paper describes our submission to Task 1 of the Short-duration Speaker Verification (SdSV) challenge 2020.

Audio and Speech Processing Sound

Unsupervised Representation Learning for Speaker Recognition via Contrastive Equilibrium Learning

1 code implementation22 Oct 2020 Sung Hwan Mun, Woo Hyun Kang, Min Hyun Han, Nam Soo Kim

In this paper, we propose a simple but powerful unsupervised learning method for speaker recognition, namely Contrastive Equilibrium Learning (CEL), which increases the uncertainty on nuisance factors latent in the embeddings by employing the uniformity loss.

Representation Learning Speaker Recognition +1
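The uniformity loss CEL employs can be sketched with the standard Gaussian-potential formulation (Wang & Isola, 2020): the log of the mean pairwise potential over unit-normalized embeddings, which is lower when embeddings spread evenly over the hypersphere. The 2-D embeddings below are toy examples, not speaker embeddings.

```python
import math

def l2_normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def uniformity_loss(embeddings, t=2.0):
    # log of the mean pairwise Gaussian potential; lower values mean
    # the embeddings are more evenly spread on the unit hypersphere.
    pots = []
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            d2 = sum((a - b) ** 2
                     for a, b in zip(embeddings[i], embeddings[j]))
            pots.append(math.exp(-t * d2))
    return math.log(sum(pots) / len(pots))

spread = [l2_normalize(v) for v in ([1, 0], [-1, 0], [0, 1], [0, -1])]
bunched = [l2_normalize(v) for v in ([1.0, 0.01], [1.0, 0.02],
                                     [1.0, 0.03], [1.0, 0.04])]
```

Minimizing this term pushes embeddings apart, which is how it increases uncertainty on nuisance factors latent in the embeddings.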

Continuous Monitoring of Blood Pressure with Evidential Regression

no code implementations6 Feb 2021 Hyeongju Kim, Woo Hyun Kang, Hyeonseung Lee, Nam Soo Kim

Photoplethysmogram (PPG) signal-based blood pressure (BP) estimation is a promising candidate for modern BP measurements, as PPG signals can be easily obtained from wearable devices in a non-invasive manner, allowing quick BP measurement.

regression

Expressive Text-to-Speech using Style Tag

no code implementations1 Apr 2021 Minchan Kim, Sung Jun Cheon, Byoung Jin Choi, Jong Jin Kim, Nam Soo Kim

In this work, we propose StyleTagging-TTS (ST-TTS), a novel expressive TTS model that utilizes a style tag written in natural language.

Language Modelling TAG

Diff-TTS: A Denoising Diffusion Model for Text-to-Speech

1 code implementation3 Apr 2021 Myeonghun Jeong, Hyeongju Kim, Sung Jun Cheon, Byoung Jin Choi, Nam Soo Kim

Although neural text-to-speech (TTS) models have attracted a lot of attention and succeeded in generating human-like speech, there is still room for improvement in naturalness and architectural efficiency.

Denoising Speech Synthesis
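Diff-TTS is built on the denoising diffusion framework, whose forward process can be sketched in a few lines: a linear beta schedule defines cumulative noise levels, and a clean sample is progressively mixed with Gaussian noise. The schedule constants and the "mel frame" below are toy values, not the paper's configuration.

```python
import math
import random

def make_schedule(T=50, beta_start=1e-4, beta_end=0.05):
    # Linear beta schedule; alpha_bar[t] is the running product of (1 - beta).
    betas = [beta_start + (beta_end - beta_start) * t / (T - 1)
             for t in range(T)]
    alpha_bar, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        alpha_bar.append(prod)
    return alpha_bar

def diffuse(x0, t, alpha_bar, eps):
    # q(x_t | x_0): scale the clean sample down and mix in Gaussian noise.
    a = alpha_bar[t]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * e
            for x, e in zip(x0, eps)]

random.seed(0)
alpha_bar = make_schedule()
x0 = [0.5, -0.3, 0.8]                     # stand-in for a mel frame
eps = [random.gauss(0, 1) for _ in x0]
x_noisy = diffuse(x0, 40, alpha_bar, eps)
```

Training then fits a network to predict `eps` from `x_noisy` and `t`; synthesis runs the learned denoising steps in reverse.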

Kosp2e: Korean Speech to English Translation Corpus

1 code implementation6 Jul 2021 Won Ik Cho, Seok Min Kim, Hyunchang Cho, Nam Soo Kim

Most speech-to-text (S2T) translation studies use English speech as a source, which makes it difficult for non-English speakers to take advantage of the S2T technologies.

speech-recognition Speech Recognition +1

Oracle Teacher: Leveraging Target Information for Better Knowledge Distillation of CTC Models

no code implementations5 Nov 2021 Ji Won Yoon, Hyung Yong Kim, Hyeonseung Lee, Sunghwan Ahn, Nam Soo Kim

Extending this supervised scheme further, we introduce a new type of teacher model for connectionist temporal classification (CTC)-based sequence models, namely Oracle Teacher, that leverages both the source inputs and the output labels as the teacher model's input.

Knowledge Distillation Machine Translation +5

Bootstrap Equilibrium and Probabilistic Speaker Representation Learning for Self-supervised Speaker Verification

no code implementations16 Dec 2021 Sung Hwan Mun, Min Hyun Han, Dongjune Lee, JiHwan Kim, Nam Soo Kim

In this paper, we propose self-supervised speaker representation learning strategies, which comprise bootstrap equilibrium speaker representation learning in the front-end and uncertainty-aware probabilistic speaker embedding training in the back-end.

Contrastive Learning Representation Learning +1

Frequency and Multi-Scale Selective Kernel Attention for Speaker Verification

1 code implementation3 Apr 2022 Sung Hwan Mun, Jee-weon Jung, Min Hyun Han, Nam Soo Kim

The SKA mechanism allows each convolutional layer to adaptively select the kernel size in a data-driven fashion.

Speaker Verification
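The data-driven kernel selection the excerpt describes can be illustrated with a toy selective-kernel fusion: branches convolve the input with different kernel widths, and softmax attention weights decide how much each branch contributes. Uniform (average) kernels and a mean-activation descriptor stand in for the learned filters and MLP of the actual SKA module.

```python
import math

def conv1d_same(x, k):
    # 'same'-padded 1-D convolution with a uniform (average) kernel,
    # a stand-in for a learned filter of width k.
    pad = k // 2
    xp = [0.0] * pad + x + [0.0] * pad
    return [sum(xp[i:i + k]) / k for i in range(len(x))]

def selective_kernel(x, kernel_sizes=(3, 5)):
    # Compute branch outputs at each kernel size, then fuse them with
    # softmax attention derived from a global descriptor (here the mean
    # activation of each branch, a toy stand-in for the learned MLP).
    branches = [conv1d_same(x, k) for k in kernel_sizes]
    scores = [sum(b) / len(b) for b in branches]
    exps = [math.exp(s) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    fused = [sum(w * b[i] for w, b in zip(weights, branches))
             for i in range(len(x))]
    return fused, weights

fused, weights = selective_kernel([0.1, 0.4, 0.9, 0.4, 0.1])
```

Because the weights depend on the input, each layer effectively picks its own receptive field per example.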

HuBERT-EE: Early Exiting HuBERT for Efficient Speech Recognition

no code implementations13 Apr 2022 Ji Won Yoon, Beom Jun Woo, Nam Soo Kim

Pre-training with self-supervised models, such as Hidden-unit BERT (HuBERT) and wav2vec 2.0, has brought significant improvements in automatic speech recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
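The early-exit idea behind HuBERT-EE can be sketched as follows: intermediate classifiers are evaluated layer by layer, and inference stops at the first layer whose prediction is confident enough (here, prediction entropy below a threshold; the exit criterion, threshold, and logits are illustrative, not the paper's).

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)

def early_exit(layer_logits, threshold=0.5):
    # Walk the intermediate classifiers in depth order and stop at the
    # first one whose prediction entropy falls below the threshold.
    # Expects at least one layer; falls through to the final layer.
    for depth, logits in enumerate(layer_logits, start=1):
        probs = softmax(logits)
        if entropy(probs) < threshold:
            break
    return depth, probs.index(max(probs))

# Toy logits from three successive layers: confidence grows with depth,
# so inference can stop before reaching the final layer.
layers = [[0.2, 0.1, 0.15], [3.0, 0.2, 0.1], [4.0, 0.3, 0.2]]
depth, label = early_exit(layers)
```

The saving comes from skipping the remaining Transformer layers whenever an intermediate prediction is already confident.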

Disentangled Speaker Representation Learning via Mutual Information Minimization

no code implementations17 Aug 2022 Sung Hwan Mun, Min Hyun Han, Minchan Kim, Dongjune Lee, Nam Soo Kim

The experimental results show that fine-tuning an existing pre-trained model with a disentanglement framework is valid and can further improve performance.

Disentanglement Speaker Recognition +2

Fully Unsupervised Training of Few-shot Keyword Spotting

no code implementations6 Oct 2022 Dongjune Lee, Minchan Kim, Sung Hwan Mun, Min Hyun Han, Nam Soo Kim

For training a few-shot keyword spotting (FS-KWS) model, a large labeled dataset containing a massive number of target keywords has been known to be essential for generalizing to arbitrary target keywords with only a few enrollment samples.

Keyword Spotting Metric Learning +1

Adversarial Speaker-Consistency Learning Using Untranscribed Speech Data for Zero-Shot Multi-Speaker Text-to-Speech

no code implementations12 Oct 2022 Byoung Jin Choi, Myeonghun Jeong, Minchan Kim, Sung Hwan Mun, Nam Soo Kim

Several recently proposed text-to-speech (TTS) models can generate speech samples with human-level quality in single-speaker and multi-speaker scenarios with a set of pre-defined speakers.

Inter-KD: Intermediate Knowledge Distillation for CTC-Based Automatic Speech Recognition

no code implementations28 Nov 2022 Ji Won Yoon, Beom Jun Woo, Sunghwan Ahn, Hyeonseung Lee, Nam Soo Kim

Recently, the advance in deep learning has brought a considerable improvement in the end-to-end speech recognition field, simplifying the traditional pipeline while producing promising results.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speaker text-to-speech

no code implementations30 Nov 2022 Byoung Jin Choi, Myeonghun Jeong, Joun Yeop Lee, Nam Soo Kim

Zero-shot multi-speaker text-to-speech (ZSM-TTS) models aim to generate a speech sample with the voice characteristic of an unseen speaker.

Speech Synthesis
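SNAC modifies the affine coupling layer of a flow-based TTS architecture. The generic coupling mechanism it builds on is easy to sketch: half of the variables pass through unchanged, while the other half is scaled and shifted, and the transform inverts exactly. Fixed toy parameters stand in for the learned (and, in SNAC, speaker-normalized) shift and scale networks.

```python
import math

def coupling_forward(x, shift, log_scale):
    # Affine coupling: the first half passes through unchanged; the
    # second half is scaled and shifted. In a real flow, shift and
    # log_scale are produced by a network conditioned on the first half.
    h = len(x) // 2
    x1, x2 = x[:h], x[h:]
    y2 = [a * math.exp(s) + b for a, s, b in zip(x2, log_scale, shift)]
    return x1 + y2

def coupling_inverse(y, shift, log_scale):
    # Exact inverse: undo the shift, then the scale.
    h = len(y) // 2
    y1, y2 = y[:h], y[h:]
    x2 = [(a - b) * math.exp(-s) for a, s, b in zip(y2, log_scale, shift)]
    return y1 + x2

x = [0.5, -1.2, 0.3, 2.0]
shift, log_scale = [0.1, -0.4], [0.7, 0.2]
y = coupling_forward(x, shift, log_scale)
x_rec = coupling_inverse(y, shift, log_scale)
```

This exact invertibility is what lets flow-based TTS models train by maximum likelihood and synthesize by running the flow in reverse.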

When Crowd Meets Persona: Creating a Large-Scale Open-Domain Persona Dialogue Corpus

no code implementations1 Apr 2023 Won Ik Cho, Yoon Kyung Lee, Seoyeon Bae, JiHwan Kim, Sangah Park, Moosung Kim, Sowon Hahn, Nam Soo Kim

Building a natural language dataset requires caution, since word semantics is vulnerable to subtle text changes or to the definition of the annotated concept.

Dialogue Generation Question Answering +2

Towards single integrated spoofing-aware speaker verification embeddings

1 code implementation30 May 2023 Sung Hwan Mun, Hye-jin Shim, Hemlata Tak, Xin Wang, Xuechen Liu, Md Sahidullah, Myeonghun Jeong, Min Hyun Han, Massimiliano Todisco, Kong Aik Lee, Junichi Yamagishi, Nicholas Evans, Tomi Kinnunen, Nam Soo Kim, Jee-weon Jung

Second, competitive performance should be demonstrated compared to the fusion of automatic speaker verification (ASV) and countermeasure (CM) embeddings, which outperformed single embedding solutions by a large margin in the SASV2022 challenge.

Speaker Verification

EM-Network: Oracle Guided Self-distillation for Sequence Learning

no code implementations14 Jun 2023 Ji Won Yoon, Sunghwan Ahn, Hyeonseung Lee, Minchan Kim, Seok Min Kim, Nam Soo Kim

We introduce EM-Network, a novel self-distillation approach that effectively leverages target information for supervised sequence-to-sequence (seq2seq) learning.

Machine Translation speech-recognition +1

EEND-DEMUX: End-to-End Neural Speaker Diarization via Demultiplexed Speaker Embeddings

no code implementations11 Dec 2023 Sung Hwan Mun, Min Hyun Han, Canyeong Moon, Nam Soo Kim

In recent years, there have been studies to further improve the end-to-end neural speaker diarization (EEND) systems.

speaker-diarization Speaker Diarization

Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction

no code implementations3 Jan 2024 Minchan Kim, Myeonghun Jeong, Byoung Jin Choi, Semin Kim, Joun Yeop Lee, Nam Soo Kim

We also delve into the inference speed and prosody control capabilities of our approach, highlighting the potential of neural transducers in TTS frameworks.

OpenKorPOS: Democratizing Korean Tokenization with Voting-Based Open Corpus Annotation

no code implementations LREC 2022 Sangwhan Moon, Won Ik Cho, Hye Joo Han, Naoaki Okazaki, Nam Soo Kim

As this problem originates from the conventional scheme used when creating a POS tagging corpus, we propose an improvement to the existing scheme, which makes it friendlier to generative tasks.

POS POS Tagging +1