Search Results for author: Hee-Soo Heo

Found 24 papers, 5 papers with code

Encoder-decoder multimodal speaker change detection

no code implementations • 1 Jun 2023 • Jee-weon Jung, Soonshin Seo, Hee-Soo Heo, Geonmin Kim, You Jin Kim, Young-ki Kwon, Minjae Lee, Bong-Jin Lee

The task of speaker change detection (SCD), which detects points where speakers change in an input, is essential for several applications.

Automatic Speech Recognition, Change Detection, +3

Absolute decision corrupts absolutely: conservative online speaker diarisation

no code implementations • 9 Nov 2022 • Youngki Kwon, Hee-Soo Heo, Bong-Jin Lee, You Jin Kim, Jee-weon Jung

Our focus lies in developing an online speaker diarisation framework which demonstrates robust performance across diverse domains.

In search of strong embedding extractors for speaker diarisation

no code implementations • 26 Oct 2022 • Jee-weon Jung, Hee-Soo Heo, Bong-Jin Lee, Jaesung Huh, Andrew Brown, Youngki Kwon, Shinji Watanabe, Joon Son Chung

First, the evaluation is not straightforward because the features required for better performance differ between speaker verification and diarisation.

Data Augmentation, Speaker Verification

SASV 2022: The First Spoofing-Aware Speaker Verification Challenge

no code implementations • 28 Mar 2022 • Jee-weon Jung, Hemlata Tak, Hye-jin Shim, Hee-Soo Heo, Bong-Jin Lee, Soo-Whan Chung, Ha-Jin Yu, Nicholas Evans, Tomi Kinnunen

Pre-trained spoofing detection and speaker verification models are provided as open source and are used in two baseline SASV solutions.

Speaker Verification

Pushing the limits of raw waveform speaker recognition

2 code implementations • 16 Mar 2022 • Jee-weon Jung, You Jin Kim, Hee-Soo Heo, Bong-Jin Lee, Youngki Kwon, Joon Son Chung

Our best model achieves an equal error rate of 0.89%, which is competitive with the state-of-the-art models based on handcrafted features, and outperforms the best model based on raw waveform inputs by a large margin.

Self-Supervised Learning, Speaker Recognition, +1

Look Who's Talking: Active Speaker Detection in the Wild

1 code implementation • 17 Aug 2021 • You Jin Kim, Hee-Soo Heo, Soyeon Choe, Soo-Whan Chung, Yoohwan Kwon, Bong-Jin Lee, Youngki Kwon, Joon Son Chung

Face tracks are extracted from the videos and active segments are annotated based on the timestamps of VoxConverse in a semi-automatic way.

Active Speaker Detection

Graph Attention Networks for Speaker Verification

no code implementations • 22 Oct 2020 • Jee-weon Jung, Hee-Soo Heo, Ha-Jin Yu, Joon Son Chung

The proposed framework inputs segment-wise speaker embeddings from an enrollment and a test utterance and directly outputs a similarity score.

Graph Attention, Speaker Verification

Self-supervised pre-training with acoustic configurations for replay spoofing detection

no code implementations • 22 Oct 2019 • Hye-jin Shim, Hee-Soo Heo, Jee-weon Jung, Ha-Jin Yu

Constructing a dataset for replay spoofing detection requires a physical process of playing an utterance and re-recording it, presenting a challenge to the collection of large-scale datasets.

Speaker Verification

Cosine similarity-based adversarial process

no code implementations • 1 Jul 2019 • Hee-Soo Heo, Jee-weon Jung, Hye-jin Shim, IL-Ho Yang, Ha-Jin Yu

In particular, the adversarial process degrades the performance of the subsidiary model by eliminating the subsidiary information in the input, which is assumed to degrade the performance of the primary model.

Speaker Identification

Replay attack detection with complementary high-resolution information using end-to-end DNN for the ASVspoof 2019 Challenge

1 code implementation • 23 Apr 2019 • Jee-weon Jung, Hye-jin Shim, Hee-Soo Heo, Ha-Jin Yu

To detect unrevealed characteristics that reside in a replayed speech, we directly input spectrograms into an end-to-end DNN without knowledge-based intervention.

RawNet: Advanced end-to-end deep neural network using raw waveforms for text-independent speaker verification

4 code implementations • 17 Apr 2019 • Jee-weon Jung, Hee-Soo Heo, Ju-ho Kim, Hye-jin Shim, Ha-Jin Yu

In this study, we explore end-to-end deep neural networks that input raw waveforms to improve various aspects: front-end speaker embedding extraction including model architecture, pre-training scheme, additional objective functions, and back-end classification.

Classification, Data Augmentation, +2

Short utterance compensation in speaker verification via cosine-based teacher-student learning of speaker embeddings

no code implementations • 25 Oct 2018 • Jee-weon Jung, Hee-Soo Heo, Hye-jin Shim, Ha-Jin Yu

The short duration of an input utterance is one of the most critical threats that degrade the performance of speaker verification systems.

Text-Independent Speaker Verification

Replay spoofing detection system for automatic speaker verification using multi-task learning of noise classes

no code implementations • 29 Aug 2018 • Hye-jin Shim, Jee-weon Jung, Hee-Soo Heo, Sung-Hyun Yoon, Ha-Jin Yu

We explore the effectiveness of training a deep neural network simultaneously for replay attack spoofing detection and replay noise classification.

General Classification, Multi-Task Learning, +1
