no code implementations • 12 Dec 2023 • Jiyoung Kim, Kyuhong Shim, Insu Lee, Byonghyo Shim
In this paper, we propose a novel USS framework called Expand-and-Quantize Unsupervised Semantic Segmentation (EQUSS), which combines the benefits of high-dimensional spaces for better clustering with those of product quantization for effective information compression (a sketch of product quantization follows this entry).
Ranked #3 on Unsupervised Semantic Segmentation on Potsdam-3
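Product quantization, the compression half of EQUSS, splits a high-dimensional feature into sub-vectors and snaps each one to its nearest codeword in a small per-subspace codebook. Below is a minimal NumPy sketch of the general technique; the dimensions, codebook sizes, and function names are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def product_quantize(features, codebooks):
    """Quantize each feature by splitting it into sub-vectors and
    snapping each sub-vector to its nearest codeword.

    features:  (N, D) array of high-dimensional features
    codebooks: list of M arrays, each (K, D // M) -- one per sub-space
    Returns the (N, M) codeword indices and the (N, D) reconstruction.
    """
    n, d = features.shape
    m = len(codebooks)
    sub_dim = d // m
    indices = np.empty((n, m), dtype=np.int64)
    recon = np.empty_like(features)
    for j, cb in enumerate(codebooks):
        sub = features[:, j * sub_dim:(j + 1) * sub_dim]          # (N, D/M)
        dist = ((sub[:, None, :] - cb[None, :, :]) ** 2).sum(-1)  # (N, K)
        idx = dist.argmin(axis=1)                                 # nearest codeword
        indices[:, j] = idx
        recon[:, j * sub_dim:(j + 1) * sub_dim] = cb[idx]
    return indices, recon

# toy usage: 512-dim features, 8 sub-spaces, 256 codewords each (illustrative)
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 512)).astype(np.float32)
books = [rng.normal(size=(256, 64)).astype(np.float32) for _ in range(8)]
codes, approx = product_quantize(feats, books)
print(codes.shape, approx.shape)  # (4, 8) (4, 512)
```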
no code implementations • 31 Aug 2023 • Kyuhong Shim, Jinkyu Lee, Simyung Chang, Kyuwoong Hwang
Streaming automatic speech recognition (ASR) models are restricted from accessing future context, which results in worse performance than non-streaming models (a sketch of such masking follows this entry).
Automatic Speech Recognition (ASR) +2
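The future-context restriction is typically enforced with an attention mask that bounds each frame's receptive field. A minimal sketch under that assumption (window sizes and names are illustrative, not the paper's setup):

```python
import numpy as np

def streaming_attention_mask(num_frames, lookahead=0, lookback=None):
    """Boolean mask where mask[i, j] is True if frame i may attend to frame j.

    lookahead=0 gives a strictly causal (fully streaming) model;
    a small positive lookahead trades latency for accuracy.
    """
    i = np.arange(num_frames)[:, None]
    j = np.arange(num_frames)[None, :]
    allowed = j <= i + lookahead
    if lookback is not None:          # optionally bound the history window too
        allowed &= j >= i - lookback
    return allowed

print(streaming_attention_mask(5, lookahead=1).astype(int))
```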
no code implementations • 31 Aug 2023 • Seunghan Yang, Byeonggeun Kim, Kyuhong Shim, Simyung Chang
Few-shot keyword spotting (FS-KWS) models usually require large-scale annotated datasets to generalize to unseen target keywords.
no code implementations • 25 Apr 2023 • Kyuhong Shim, Jiyoung Kim, Gusang Lee, Byonghyo Shim
Monocular depth estimation is very challenging because clues to the exact depth are incomplete in a single RGB image.
no code implementations • 10 Mar 2023 • Sunwoo Kim, Kyuhong Shim, Luong Trung Nguyen, Byonghyo Shim
Image-text retrieval is the task of searching for proper textual descriptions of the visual world, and vice versa.
1 code implementation • 23 Feb 2023 • Minsoo Kim, Kyuhong Shim, Seongmin Park, Wonyong Sung, Jungwook Choi
Pre-trained Transformer models such as BERT have shown great success in a wide range of applications, but at the cost of substantial increases in model complexity.
no code implementations • 2 Feb 2023 • Jiseob Kim, Kyuhong Shim, Junhan Kim, Byonghyo Shim
In AAM, the correlation between each patch feature and the synthetic image attribute is used as the importance weight for each patch.
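Read literally, this weighting is a normalized correlation between patch features and an attribute embedding. A hedged NumPy sketch follows; the softmax normalization and names are assumptions, not the paper's exact formulation.

```python
import numpy as np

def attribute_importance_weights(patch_feats, attr_embed):
    """Weight each image patch by its correlation with an attribute vector.

    patch_feats: (P, D) patch features; attr_embed: (D,) attribute embedding.
    Returns (P,) weights summing to 1 (softmax over correlations -- an
    assumption; the paper may normalize differently).
    """
    corr = patch_feats @ attr_embed      # (P,) dot-product correlation
    corr = corr - corr.max()             # subtract max for numerical stability
    w = np.exp(corr)
    return w / w.sum()

rng = np.random.default_rng(1)
weights = attribute_importance_weights(rng.normal(size=(49, 256)),
                                       rng.normal(size=256))
print(weights.shape, weights.sum())  # (49,) ~1.0
```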
no code implementations • 29 Jan 2023 • Kyuhong Shim, Jungwook Choi, Wonyong Sung
In this paper, we provide a comprehensive study on attention map reuse focusing on its ability to accelerate inference.
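Attention map reuse computes the query-key attention weights once and shares them across several consecutive layers, so only the value and output projections are recomputed. A minimal single-head NumPy sketch; the layer grouping and parameter names are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_with_reuse(x, layers, reuse_group=2):
    """Run a stack of single-head SA layers, recomputing the attention map
    only every `reuse_group` layers and reusing it in between.

    x: (T, D) input; layers: list of dicts with 'Wq', 'Wk', 'Wv', 'Wo' (D, D).
    """
    attn = None
    for i, p in enumerate(layers):
        if i % reuse_group == 0:          # recompute the attention map
            q, k = x @ p["Wq"], x @ p["Wk"]
            attn = softmax(q @ k.T / np.sqrt(x.shape[1]))
        v = x @ p["Wv"]                   # values are always recomputed
        x = x + (attn @ v) @ p["Wo"]      # residual connection
    return x

rng = np.random.default_rng(2)
D = 64
stack = [{w: rng.normal(scale=D**-0.5, size=(D, D))
          for w in ("Wq", "Wk", "Wv", "Wo")} for _ in range(4)]
print(attention_with_reuse(rng.normal(size=(10, D)), stack).shape)  # (10, 64)
```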
no code implementations • 1 Oct 2022 • Kyuhong Shim, Wonyong Sung
Our analyses show that Transformer and Conformer models benefit from the long-range access that self-attention provides across input frames.
no code implementations • 6 Sep 2022 • Yongjun Ahn, Jinhong Kim, Seungnyun Kim, Kyuhong Shim, Jiyoung Kim, Sangtae Kim, Byonghyo Shim
The beamforming technique realized by multiple-input multiple-output (MIMO) antenna arrays has been widely used to compensate for the severe path loss in the millimeter-wave (mmWave) bands.
no code implementations • 19 Mar 2022 • Kyuhong Shim, Wonyong Sung
In particular, SA heads in lower layers capture various phonetic characteristics through the query-key dot product, which computes the pairwise relationship between frames.
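For reference, the query-key dot product in question is the standard scaled pairwise score between frames, as in the sketch below (shapes and names are illustrative):

```python
import numpy as np

def pairwise_attention_scores(frames, Wq, Wk):
    """Scaled query-key dot product: score[i, j] measures how strongly
    frame i attends to frame j.

    frames: (T, D) acoustic frame features; Wq, Wk: (D, Dh) projections.
    """
    q, k = frames @ Wq, frames @ Wk        # (T, Dh) each
    return q @ k.T / np.sqrt(Wq.shape[1])  # (T, T) pairwise scores

rng = np.random.default_rng(3)
scores = pairwise_attention_scores(rng.normal(size=(20, 80)),
                                   rng.normal(size=(80, 64)),
                                   rng.normal(size=(80, 64)))
print(scores.shape)  # (20, 20)
```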
no code implementations • 22 Feb 2022 • Kyuhong Shim, Hyewon Bae, Wonyong Sung
Although the common approach is to use the same tokenization method for the external LM as for the ASR model, we show that it may not be the best choice for Korean.
Automatic Speech Recognition (ASR) +2
no code implementations • 29 Dec 2021 • Junhan Kim, Kyuhong Shim, Byonghyo Shim
The key idea of the proposed approach, henceforth referred to as semantic feature extraction-based GZSL (SE-GZSL), is to use semantic features containing only attribute-related information when learning the relationship between the image and the attribute.
1 code implementation • 2021 IEEE Workshop on Signal Processing Systems (SiPS) • Seokhyeon Choi, Kyuhong Shim, Jungwook Choi, Wonyong Sung, Byonghyo Shim
We propose TernGEMM, a special GEMM library using SIMD instructions for Deep Neural Network (DNN) inference with ternary weights and sub-8-bit activations.
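With ternary weights in {-1, 0, +1}, every multiply in the GEMM collapses to an add, a subtract, or a no-op. The NumPy sketch below shows this add/subtract decomposition in scalar form; the actual library vectorizes it with packed SIMD instructions, which this sketch does not attempt:

```python
import numpy as np

def ternary_gemm(activations, weights):
    """Matrix multiply with ternary weights: every product is +a, -a, or 0,
    so the GEMM reduces to masked additions and subtractions.

    activations: (M, K) low-bit activations; weights: (K, N) in {-1, 0, +1}.
    """
    pos = (weights == 1)   # +1 positions contribute an add
    neg = (weights == -1)  # -1 positions contribute a subtract
    return (activations @ pos.astype(np.int32)
            - activations @ neg.astype(np.int32))

rng = np.random.default_rng(4)
a = rng.integers(0, 16, size=(2, 8)).astype(np.int32)   # e.g. 4-bit activations
w = rng.integers(-1, 2, size=(8, 3)).astype(np.int32)   # ternary weights
assert np.array_equal(ternary_gemm(a, w), a @ w)        # matches a plain GEMM
print(ternary_gemm(a, w))
```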
1 code implementation • 2021 18th International SoC Design Conference (ISOCC) • Kyuhong Shim, Iksoo Choi, Wonyong Sung, Jungwook Choi
While Transformer-based models have shown impressive language modeling performance, the large computation cost is often prohibitive for practical use.
no code implementations • ICLR 2022 • Kyuhong Shim, Jungwook Choi, Wonyong Sung
Self-attention (SA) is a critical component of Transformer neural networks that have succeeded in automatic speech recognition (ASR).
Automatic Speech Recognition (ASR) +1
no code implementations • NeurIPS 2017 • Kyuhong Shim, Minjae Lee, Iksoo Choi, Yoonho Boo, Wonyong Sung
The approximate probability of each word can be estimated with only a small part of the weight matrix by using a few large singular values and the corresponding elements for most of the words.
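That is the SVD-softmax idea: factor the output weight matrix as W = U diag(S) V^T, score every word cheaply with only the top singular dimensions, then rescore only the best candidates exactly. A hedged NumPy sketch (the preview and candidate window sizes are illustrative):

```python
import numpy as np

def svd_softmax(h, U, S, Vt, preview_dim=16, refine_top=8):
    """Approximate softmax over a large vocabulary.

    Output weights are factored as W = U @ diag(S) @ Vt. The preview pass
    scores every word with only the `preview_dim` largest singular values;
    the `refine_top` best candidates are then rescored exactly.
    h: (D,) hidden state; U: (V, D); S: (D,); Vt: (D, D).
    """
    h_rot = S * (Vt @ h)                               # hidden state in SVD basis
    logits = U[:, :preview_dim] @ h_rot[:preview_dim]  # cheap preview, all words
    top = np.argsort(logits)[-refine_top:]             # promising candidates
    logits[top] = U[top] @ h_rot                       # exact logits for those only
    e = np.exp(logits - logits.max())
    return e / e.sum()

rng = np.random.default_rng(5)
V, D = 1000, 64
W = rng.normal(size=(V, D))
U, S, Vt = np.linalg.svd(W, full_matrices=False)  # U: (V, D), S: (D,), Vt: (D, D)
p = svd_softmax(rng.normal(size=D), U, S, Vt)
print(p.shape, p.argmax())
```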