no code implementations • 18 Jun 2024 • Dongwon Jo, Taesu Kim, Yulhwa Kim, Jae-Joon Kim
Binarization, which converts weight parameters to binary values, has emerged as an effective strategy to reduce the size of large language models (LLMs).
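As background, the canonical form of weight binarization approximates a real-valued tensor by a single scale times its sign. The snippet below is a minimal sketch of that general technique, not necessarily the exact scheme used in this paper:

```python
import numpy as np

def binarize_weights(w: np.ndarray):
    """Replace real-valued weights with {-1, +1} times a per-tensor scale.

    The scale alpha = mean(|w|) minimizes the L2 error of the
    approximation w ~ alpha * sign(w) (XNOR-Net-style baseline).
    """
    alpha = float(np.abs(w).mean())
    w_bin = np.sign(w)
    w_bin[w_bin == 0] = 1.0          # map exact zeros to +1 so values stay binary
    return w_bin, alpha

# usage: a 1-bit approximation of a weight matrix
w = np.random.randn(4, 8).astype(np.float32)
w_bin, alpha = binarize_weights(w)
w_approx = alpha * w_bin             # dense reconstruction for comparison
print(np.mean((w - w_approx) ** 2))  # approximation error
```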
1 code implementation • 14 Feb 2024 • Jiwon Song, Kyungseok Oh, Taesu Kim, HyungJun Kim, Yulhwa Kim, Jae-Joon Kim
In this paper, we introduce SLEB, a novel approach designed to streamline LLMs by eliminating redundant transformer blocks.
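The excerpt does not state SLEB's redundancy criterion; the sketch below only illustrates the general recipe of scoring whole transformer blocks and removing the most redundant ones, using a hypothetical cosine-similarity score between each block's input and output hidden states (an illustrative metric, not the paper's):

```python
import torch

def block_redundancy(block, hidden: torch.Tensor) -> float:
    """Score how little a transformer block changes its input.

    High cosine similarity between input and output hidden states marks
    the block as a removal candidate. (Illustrative metric only.)
    """
    with torch.no_grad():
        out = block(hidden)          # `block` is a hypothetical callable taking hidden states
    sim = torch.nn.functional.cosine_similarity(
        hidden.flatten(1), out.flatten(1), dim=-1
    )
    return sim.mean().item()

def prune_blocks(blocks, hidden, num_remove: int):
    """Drop the `num_remove` most redundant blocks from a block list."""
    scores = []
    for b in blocks:
        scores.append(block_redundancy(b, hidden))
        with torch.no_grad():
            hidden = b(hidden)       # propagate so later scores see realistic inputs
    drop = set(sorted(range(len(blocks)), key=lambda i: -scores[i])[:num_remove])
    return [blk for i, blk in enumerate(blocks) if i not in drop]
```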
no code implementations • 7 Feb 2024 • Hyesung Jeon, Yulhwa Kim, Jae-Joon Kim
This has led to active research on quantization-aware parameter-efficient fine-tuning (PEFT) techniques, which aim to produce models with high accuracy and low memory overhead.
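One common quantization-aware PEFT pattern, given here as general background rather than as this paper's method, freezes a low-bit base weight and trains a small low-rank adapter in higher precision:

```python
import torch
import torch.nn as nn

class LoRAOverQuantizedLinear(nn.Module):
    """Frozen (fake-)quantized base weight plus a trainable low-rank adapter.

    Illustrative only: real pipelines store the base in true low-bit formats
    and use dedicated kernels; here quantization is simulated with
    round-to-nearest on a symmetric per-tensor grid.
    """
    def __init__(self, weight: torch.Tensor, rank: int = 8, n_bits: int = 4):
        super().__init__()
        scale = weight.abs().max() / (2 ** (n_bits - 1) - 1)
        q = torch.clamp(torch.round(weight / scale),
                        -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1)
        self.register_buffer("w_q", q * scale)      # frozen base, not a Parameter
        out_f, in_f = weight.shape
        self.lora_a = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_f, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        base = x @ self.w_q.T                       # frozen quantized path
        update = x @ self.lora_a.T @ self.lora_b.T  # trainable low-rank path
        return base + update
```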
no code implementations • 3 Jul 2023 • Jiwoong Choi, Minkyu Kim, Daehyun Ahn, Taesu Kim, Yulhwa Kim, Dongwon Jo, Hyesung Jeon, Jae-Joon Kim, HyungJun Kim
The emergence of diffusion models has greatly broadened the scope of high-fidelity image synthesis, resulting in notable advancements in both practical implementation and academic research.
no code implementations • ICCV 2023 • Changhun Lee, HyungJun Kim, Eunhyeok Park, Jae-Joon Kim
Binary Neural Networks (BNNs) have emerged as a promising solution for reducing the memory footprint and compute costs of deep neural networks, but they suffer from quality degradation because activations and weights are constrained to binary values, which limits their representational freedom.
no code implementations • CVPR 2021 • HyungJun Kim, Jihoon Park, Changhun Lee, Jae-Joon Kim
We also show that adjusting the threshold values of binary activation functions produces an unbalanced distribution of binary activations, which improves the accuracy of BNN models.
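A minimal sketch of a threshold-shifted binary activation, illustrating why moving the threshold away from zero skews the +1/-1 ratio (the paper's exact formulation may differ):

```python
import numpy as np

def binary_activation(x: np.ndarray, threshold: float = 0.0) -> np.ndarray:
    """Binarize activations to {-1, +1} around a tunable threshold.

    threshold = 0 gives a roughly balanced output for zero-centred inputs;
    shifting it makes the +1/-1 distribution unbalanced.
    """
    return np.where(x >= threshold, 1.0, -1.0)

x = np.random.randn(100_000)
for t in (0.0, 0.5):
    frac_pos = (binary_activation(x, t) > 0).mean()
    print(f"threshold={t}: fraction of +1 outputs = {frac_pos:.2f}")
```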
1 code implementation • NeurIPS 2020 • Jinseok Kim, Kyung-Su Kim, Jae-Joon Kim
For gradient computation across the time domain in Spiking Neural Network (SNN) training, two different approaches have been studied independently.
1 code implementation • ICLR 2020 • Hyungjun Kim, Kyung-Su Kim, Jinseok Kim, Jae-Joon Kim
Binary Neural Networks (BNNs) have been garnering interest thanks to their reduced compute cost and memory savings.
no code implementations • 24 Jul 2019 • Hyungjun Kim, Malte Rasch, Tayfun Gokmen, Takashi Ando, Hiroyuki Miyazoe, Jae-Joon Kim, John Rozen, Seyoung Kim
By using this zero-shifting method, we show that network performance dramatically improves for imbalanced synapse devices.
no code implementations • ICLR 2019 • Daehyun Ahn, Dongsoo Lee, Taesu Kim, Jae-Joon Kim
In this paper, we propose a new sparse matrix format in order to enable a highly parallel decoding process of the entire sparse matrix.
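For reference, a conventional compressed-sparse-row (CSR) layout, shown below purely as a baseline rather than the format proposed in the paper, stores nonzeros row by row, so decoding and the matrix-vector product proceed one row at a time:

```python
import numpy as np

def dense_to_csr(m: np.ndarray):
    """Convert a dense matrix to CSR arrays (values, column indices, row pointers)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in m:
        nz = np.nonzero(row)[0]
        values.extend(row[nz])
        col_idx.extend(nz)
        row_ptr.append(len(values))
    return np.array(values), np.array(col_idx), np.array(row_ptr)

def csr_matvec(values, col_idx, row_ptr, x):
    """Sparse matrix-vector product; rows are independent, but elements
    within a row are decoded sequentially."""
    y = np.zeros(len(row_ptr) - 1)
    for i in range(len(y)):
        s, e = row_ptr[i], row_ptr[i + 1]
        y[i] = values[s:e] @ x[col_idx[s:e]]
    return y
```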
no code implementations • 23 Mar 2019 • Hyungjun Kim, Yulhwa Kim, Sungju Ryu, Jae-Joon Kim
We demonstrate that the BitSplit versions of LeNet-5, VGG-9, AlexNet, and ResNet-18 can be trained to classification accuracy similar to that of conventional low-precision (<= 4-bit) multi-bit networks at a lower computational cost.
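The name suggests computation over binary slices of multi-bit values; the sketch below shows a generic bit-plane decomposition of an integer matrix-vector product, given as an assumption for illustration rather than the paper's exact BitSplit formulation:

```python
import numpy as np

def bit_split_matvec(x_int: np.ndarray, w: np.ndarray, n_bits: int = 4) -> np.ndarray:
    """Compute w @ x for unsigned n-bit integer activations one bit-plane at a time.

    Each plane involves only binary activations; weighting the partial
    results by powers of two recovers the full-precision product.
    (Generic bit-splitting idea, not the paper's exact method.)
    """
    y = np.zeros(w.shape[0])
    for b in range(n_bits):
        plane = (x_int >> b) & 1          # binary slice of the activations
        y += (w @ plane) * (1 << b)       # weight each plane by its bit significance
    return y

x_int = np.random.randint(0, 16, size=8)          # 4-bit activations
w = np.random.randn(3, 8)
assert np.allclose(bit_split_matvec(x_int, w), w @ x_int)
```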
1 code implementation • 6 Nov 2018 • Yulhwa Kim, HyungJun Kim, Jae-Joon Kim
Recently, RRAM-based Binary Neural Network (BNN) hardware has been gaining interest because it requires only 1-bit sense amplifiers and eliminates the need for high-resolution ADCs and DACs.
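For context, the arithmetic such hardware accelerates is the XNOR-and-popcount form of a binary dot product, which is why a 1-bit sense amplifier suffices. A plain-software illustration, assuming {-1, +1} elements packed as bits with bit value 1 encoding +1:

```python
def xnor_popcount_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two {-1, +1} vectors packed as n-bit integers.

    With bit 1 encoding +1 and bit 0 encoding -1:
    dot = 2 * popcount(~(a ^ w) & mask) - n.
    """
    mask = (1 << n) - 1
    matches = bin(~(a_bits ^ w_bits) & mask).count("1")  # popcount of the XNOR
    return 2 * matches - n

# usage: two 4-element vectors that agree in 2 of 4 positions -> dot product 0
print(xnor_popcount_dot(0b1011, 0b1101, 4))
```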
no code implementations • ICLR 2018 • Dongsoo Lee, Daehyun Ahn, Taesu Kim, Pierce I. Chuang, Jae-Joon Kim
Hence, pruning is usually restricted to inference with a batch size of one, for which an efficient parallel matrix-vector multiplication method exists.
no code implementations • 30 Mar 2017 • Hyungjun Kim, Taesu Kim, Jinseok Kim, Jae-Joon Kim
Artificial Neural Network computation relies on intensive vector-matrix multiplications.