no code implementations • 31 Oct 2023 • Sungjin Cheong, Wonho Jung, Yoon Seop Lim, Yong-Hwa Park
However, a significant domain gap exists between synthetic TIR and real TIR images.
no code implementations • 20 Jun 2023 • Deokki Min, Hyeonuk Nam, Yong-Hwa Park
In this work, we utilized the spectro-temporal receptive field (STRF) as the kernel of the first convolutional layer in an SED model to extract neural-like responses from the input sound, making the SED model more similar to the human auditory system.
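The fixed-STRF front end described above can be sketched as follows. This is a toy NumPy illustration only: the Gabor-style kernel, its size, and its modulation rates are assumptions for demonstration, not the paper's actual STRF parameters.

```python
import numpy as np

def strf_kernel(n_freq=9, n_time=9, spec_mod=0.25, temp_mod=0.25):
    """Illustrative spectro-temporal Gabor kernel (a toy stand-in for an
    STRF). The modulation rates are hypothetical, not the paper's values."""
    f = np.arange(n_freq) - n_freq // 2
    t = np.arange(n_time) - n_time // 2
    F, T = np.meshgrid(f, t, indexing="ij")
    envelope = np.exp(-(F**2 + T**2) / (2 * (n_freq / 3) ** 2))
    carrier = np.cos(2 * np.pi * (spec_mod * F + temp_mod * T))
    k = envelope * carrier
    return k - k.mean()  # zero-mean, i.e. band-pass-like

def conv2d_valid(x, k):
    """Naive 'valid'-mode 2D correlation of spectrogram x with kernel k,
    standing in for the first convolutional layer."""
    kh, kw = k.shape
    H, W = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(x[i:i+kh, j:j+kw] * k)
    return out

# Apply the fixed STRF-like kernel over a random (freq, time) spectrogram.
rng = np.random.default_rng(0)
spec = rng.standard_normal((64, 128))
response = conv2d_valid(spec, strf_kernel())
print(response.shape)  # (56, 120)
```

In the actual model this response map would feed the remaining trainable layers of the SED network.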
1 code implementation • 8 Jun 2023 • Junhyeok Lee, Hyeonuk Nam, Yong-Hwa Park
Different from TTS models which generate short pronunciation from phonemes and speaker identity, the category-to-sound problem requires generating diverse sounds just from a category index.
no code implementations • 28 Feb 2023 • Sung-Hyun Lee, Wook-Hyeon Kwon, Yoon-Seop Lim, Yong-Hwa Park
In this paper, an automatic calibration algorithm is proposed to reduce the depth error caused by internal stray light in amplitude-modulated continuous wave (AMCW) coaxial scanning light detection and ranging (LiDAR).
no code implementations • 24 Jun 2022 • Byeong-Yun Ko, Hyeonuk Nam, Seong-Hu Kim, Deokki Min, Seung-Deok Choi, Yong-Hwa Park
Performance of sound event localization and detection (SELD) in real scenes is limited by the small size of SELD datasets, owing to the difficulty of obtaining a sufficient amount of realistic multi-channel audio recordings with accurate labels.
1 code implementation • 23 Jun 2022 • Hyeonuk Nam, Seong-Hu Kim, Deokki Min, Byeong-Yun Ko, Seung-Deok Choi, Yong-Hwa Park
While many deep learning methods from other domains have been applied to sound event detection (SED), the differences between those methods' original domains and SED have not been appropriately considered so far.
1 code implementation • 29 Mar 2022 • Hyeonuk Nam, Seong-Hu Kim, Byeong-Yun Ko, Yong-Hwa Park
2D convolution is widely used in sound event detection (SED) to recognize two-dimensional time-frequency patterns of sound events.
Ranked #3 on Sound Event Detection on DESED
1 code implementation • 29 Mar 2022 • Seong-Hu Kim, Hyeonuk Nam, Yong-Hwa Park
To extract accurate speaker information for text-independent speaker verification, temporal dynamic CNNs (TDY-CNNs), which adapt their kernels to each time bin, were proposed.
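The idea of a kernel that adapts to each time bin can be sketched as an attention-weighted sum of basis kernels, one combination per time frame. This is a simplified illustration under assumed shapes; in TDY-CNNs the attention weights are computed from the input itself, whereas here they are supplied directly.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_dynamic_conv(x, basis, attn):
    """Toy temporal dynamic convolution: at each time bin t the effective
    kernel is an attention-weighted sum of K basis kernels, so the filter
    varies along the time axis."""
    F, T = x.shape
    K, kh, kw = basis.shape
    pf, pt = kh // 2, kw // 2
    xp = np.pad(x, ((pf, pf), (pt, pt)))  # 'same' padding
    out = np.zeros((F, T))
    for t in range(T):
        k_t = np.tensordot(attn[t], basis, axes=1)  # (kh, kw) kernel for bin t
        for f in range(F):
            out[f, t] = np.sum(xp[f:f+kh, t:t+kw] * k_t)
    return out

rng = np.random.default_rng(1)
K = 4                                           # number of basis kernels (assumed)
basis = rng.standard_normal((K, 3, 3))
x = rng.standard_normal((40, 100))              # (freq, time) spectrogram
attn = softmax(rng.standard_normal((100, K)))   # per-time-bin kernel attention
y = temporal_dynamic_conv(x, basis, attn)
print(y.shape)  # (40, 100)
```

A static CNN would use one shared kernel for all time bins; varying the combination per bin is what lets the filter track phoneme-dependent structure.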
1 code implementation • 7 Mar 2022 • Gyeong-Tae Lee, Sang-Min Choi, Byeong-Yun Ko, Yong-Hwa Park
The origin transfer function at the head center was obtained with the proposed measurement scheme using a 0-degree on-axis microphone to ensure accurate spectral cue (SC) patterns of the HRTFs. In the previous measurements with a 90-degree off-axis microphone, by contrast, the magnitude response of the origin transfer function fluctuated and decreased with increasing frequency, causing erroneous SCs of the HRTFs.
no code implementations • 16 Dec 2021 • Sung-Hyun Lee, Wook-Hyeon Kwon, Yoon-Seop Lim, Yong-Hwa Park
In this paper, a novel amplitude-modulated continuous wave (AMCW) time-of-flight (ToF) scanning sensor based on digital-parallel demodulation is proposed and demonstrated in terms of distance measurement precision.
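The underlying AMCW ToF principle can be sketched with the classic 4-phase demodulation scheme: sample the return at 0/90/180/270 degrees of the modulation period, recover the phase delay, and convert it to distance via d = c·φ/(4π·f_mod). This is the generic textbook scheme, not the paper's digital-parallel implementation, and the modulation frequency below is an assumed value.

```python
import numpy as np

C = 299_792_458.0  # speed of light, m/s
F_MOD = 20e6       # hypothetical modulation frequency (Hz)

def four_bucket_phase(samples):
    """Classic 4-phase (0/90/180/270 deg) AMCW demodulation.
    Returns the modulation phase delay in radians."""
    a0, a1, a2, a3 = samples
    return np.arctan2(a1 - a3, a0 - a2) % (2 * np.pi)

def phase_to_distance(phi, f_mod=F_MOD):
    # Round trip: phi = 4*pi*f*d/c  =>  d = c*phi / (4*pi*f)
    return C * phi / (4 * np.pi * f_mod)

# Simulate a target at 3 m: delay the return by the expected phase,
# sample it at the four quarter-period points, then invert.
d_true = 3.0
phi_true = 4 * np.pi * F_MOD * d_true / C
t = np.array([0.0, 0.25, 0.5, 0.75])  # fractions of the modulation period
samples = 1.0 + 0.5 * np.cos(2 * np.pi * t - phi_true)
d_est = phase_to_distance(four_bucket_phase(samples))
print(round(d_est, 3))  # 3.0
```

The phase wraps at 2π, so this scheme has an unambiguous range of c/(2·f_mod) (about 7.5 m at 20 MHz), which is why modulation frequency trades off precision against range.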
1 code implementation • 7 Oct 2021 • Seong-Hu Kim, Hyeonuk Nam, Yong-Hwa Park
The temporal dynamic model adapts itself to phonemes without explicitly given phoneme information during training, and results show the necessity to consider phoneme variation within utterances for more accurate and robust text-independent speaker verification.
1 code implementation • 7 Oct 2021 • Hyeonuk Nam, Seong-Hu Kim, Yong-Hwa Park
Thus, training acoustic models for audio and speech tasks requires regularization across various acoustic environments in order to achieve robust performance in real-life applications.
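One way to regularize over acoustic environments is a filter-style spectrogram augmentation: split the frequency axis into random bands and apply a random gain to each, mimicking the differing frequency responses of rooms and devices. The sketch below is a simplified illustration of that idea; the band counts and gain range are hypothetical, not the paper's settings.

```python
import numpy as np

def filter_augment(spec, n_bands=(2, 5), gain_db=6.0, rng=None):
    """Toy acoustic-environment regularizer: random per-band gains on the
    frequency axis of a (freq, time) spectrogram. Parameters are assumed
    for illustration."""
    if rng is None:
        rng = np.random.default_rng()
    F, T = spec.shape
    n = int(rng.integers(n_bands[0], n_bands[1] + 1))  # number of bands
    edges = np.sort(rng.choice(np.arange(1, F), size=n - 1, replace=False))
    edges = np.concatenate(([0], edges, [F]))
    out = spec.copy()
    for lo, hi in zip(edges[:-1], edges[1:]):
        gain = 10 ** (rng.uniform(-gain_db, gain_db) / 20)  # dB -> linear
        out[lo:hi] *= gain
    return out

rng = np.random.default_rng(0)
spec = np.ones((64, 100))
aug = filter_augment(spec, rng=rng)
print(aug.shape)  # (64, 100)
```

Applied afresh on every training example, this exposes the model to a different simulated frequency response each step, rather than the single response of the recording setup.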
no code implementations • 28 Jul 2021 • Gyeong-Tae Lee, Hyeonuk Nam, Seong-Hu Kim, Sang-Min Choi, Youngkey Kim, Yong-Hwa Park
Finally, a test F1 score of 91.9% (test accuracy of 97.2%) was achieved by G-net with the MFCC-V-A feature (named Spectroflow), an acoustic feature effective for cough detection.
1 code implementation • 8 Jul 2021 • Hyeonuk Nam, Byeong-Yun Ko, Gyeong-Tae Lee, Seong-Hu Kim, Won-Ho Jung, Sang-Min Choi, Yong-Hwa Park
In this work, we used two main approaches to overcome the lack of strongly labeled data.
Ranked #6 on Sound Event Detection on DESED