Search Results for author: Jae-Joon Kim

Found 14 papers, 4 papers with code

Mixture of Scales: Memory-Efficient Token-Adaptive Binarization for Large Language Models

no code implementations 18 Jun 2024 Dongwon Jo, Taesu Kim, Yulhwa Kim, Jae-Joon Kim

Binarization, which converts weight parameters to binary values, has emerged as an effective strategy to reduce the size of large language models (LLMs).

Binarization · Quantization
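
As a rough illustration of the weight binarization the abstract refers to, the sketch below converts a floating-point weight matrix to {-1, +1} values with a simple per-row scale. The paper's token-adaptive, memory-efficient scaling scheme ("mixture of scales") is not reproduced here; all names are illustrative.

```python
import torch

def binarize_weights(w: torch.Tensor) -> torch.Tensor:
    # Per-output-channel scale keeps the binarized weights roughly on the
    # same magnitude as the original floating-point weights.
    scale = w.abs().mean(dim=1, keepdim=True)
    return torch.sign(w) * scale

w = torch.randn(4, 8)          # toy weight matrix
print(binarize_weights(w))     # entries are +-scale (0 where w was exactly 0)
```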

SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks

1 code implementation 14 Feb 2024 Jiwon Song, Kyungseok Oh, Taesu Kim, HyungJun Kim, Yulhwa Kim, Jae-Joon Kim

In this paper, we introduce SLEB, a novel approach designed to streamline LLMs by eliminating redundant transformer blocks.
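
A minimal sketch of what block-level streamlining could look like in code, assuming a HuggingFace-style decoder whose transformer blocks live in `model.model.layers`. SLEB's actual redundancy-verification metric is not reproduced, so `blocks_to_drop` stands in for the indices the method would select.

```python
import torch.nn as nn

def drop_transformer_blocks(model, blocks_to_drop):
    # Keep only the transformer blocks whose index is not marked redundant.
    keep = [layer for i, layer in enumerate(model.model.layers)
            if i not in set(blocks_to_drop)]
    model.model.layers = nn.ModuleList(keep)
    return model
```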

L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large Language Models

no code implementations 7 Feb 2024 Hyesung Jeon, Yulhwa Kim, Jae-Joon Kim

This has led to active research on quantization-aware PEFT techniques, which aim to create models with high accuracy and low memory overhead.

In-Context Learning · Model Compression · +2

Squeezing Large-Scale Diffusion Models for Mobile

no code implementations 3 Jul 2023 Jiwoong Choi, Minkyu Kim, Daehyun Ahn, Taesu Kim, Yulhwa Kim, Dongwon Jo, Hyesung Jeon, Jae-Joon Kim, HyungJun Kim

The emergence of diffusion models has greatly broadened the scope of high-fidelity image synthesis, resulting in notable advancements in both practical implementation and academic research.

Image Generation

INSTA-BNN: Binary Neural Network with INSTAnce-aware Threshold

no code implementations ICCV 2023 Changhun Lee, HyungJun Kim, Eunhyeok Park, Jae-Joon Kim

Binary Neural Networks (BNNs) have emerged as a promising solution for reducing the memory footprint and compute costs of deep neural networks, but they suffer from quality degradation because activations and weights are constrained to binary values, which limits representational freedom.

Quantization

Improving Accuracy of Binary Neural Networks using Unbalanced Activation Distribution

no code implementations CVPR 2021 HyungJun Kim, Jihoon Park, Changhun Lee, Jae-Joon Kim

We also show that adjusting the threshold values of binary activation functions results in an unbalanced distribution of the binary activations, which increases the accuracy of BNN models.

Binarization
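
A small sketch of the effect the abstract describes: shifting the threshold of a binary activation away from zero skews the +1/-1 output distribution. The threshold values below are arbitrary, chosen only to make the skew visible.

```python
import torch

def binary_activation(x: torch.Tensor, threshold: float = 0.0) -> torch.Tensor:
    # Outputs +1 where the pre-activation exceeds the threshold, -1 elsewhere.
    return torch.where(x > threshold, torch.ones_like(x), -torch.ones_like(x))

x = torch.randn(10_000)
for t in (0.0, 0.2, 0.5):   # illustrative thresholds
    frac_pos = (binary_activation(x, t) > 0).float().mean().item()
    print(f"threshold={t:.1f}  fraction of +1 outputs = {frac_pos:.2f}")
```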

Unifying Activation- and Timing-based Learning Rules for Spiking Neural Networks

1 code implementation NeurIPS 2020 Jinseok Kim, Kyung-Su Kim, Jae-Joon Kim

For gradient computation across the time domain in Spiking Neural Network (SNN) training, two different approaches have been studied independently.

BinaryDuo: Reducing Gradient Mismatch in Binary Activation Network by Coupling Binary Activations

1 code implementation ICLR 2020 Hyungjun Kim, Kyung-Su Kim, Jinseok Kim, Jae-Joon Kim

Binary Neural Networks (BNNs) have been garnering interest thanks to their reduced compute cost and memory savings.

Zero-shifting Technique for Deep Neural Network Training on Resistive Cross-point Arrays

no code implementations 24 Jul 2019 Hyungjun Kim, Malte Rasch, Tayfun Gokmen, Takashi Ando, Hiroyuki Miyazoe, Jae-Joon Kim, John Rozen, Seyoung Kim

By using this zero-shifting method, we show that network performance dramatically improves for imbalanced synapse devices.

BitSplit-Net: Multi-bit Deep Neural Network with Bitwise Activation Function

no code implementations 23 Mar 2019 Hyungjun Kim, Yulhwa Kim, Sungju Ryu, Jae-Joon Kim

We demonstrate that BitSplit versions of LeNet-5, VGG-9, AlexNet, and ResNet-18 can be trained to achieve similar classification accuracy at a lower computational cost than conventional multi-bit networks with low bit precision (≤ 4-bit).
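
The following sketch only illustrates the general idea of expressing a multi-bit value as a weighted sum of 1-bit planes; BitSplit-Net's actual bitwise activation function and training procedure are not shown, and the 4-bit setting is just an example.

```python
import torch

def bit_split(x_int: torch.Tensor, n_bits: int = 4):
    # Decompose an unsigned n_bits integer tensor into binary bit-planes
    # and reconstruct it as a weighted sum of those planes.
    planes = [((x_int >> b) & 1) for b in range(n_bits)]
    recon = sum(p * (1 << b) for b, p in enumerate(planes))
    return planes, recon

x = torch.randint(0, 16, (6,))
planes, recon = bit_split(x, n_bits=4)
assert torch.equal(recon, x)   # lossless decomposition for 4-bit inputs
```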

Neural Network-Hardware Co-design for Scalable RRAM-based BNN Accelerators

1 code implementation 6 Nov 2018 Yulhwa Kim, HyungJun Kim, Jae-Joon Kim

Recently, RRAM-based Binary Neural Network (BNN) hardware has been gaining interest because it requires only a 1-bit sense amplifier and eliminates the need for high-resolution ADCs and DACs.

Neural Network simulation

Viterbi-based Pruning for Sparse Matrix with Fixed and High Index Compression Ratio

no code implementations ICLR 2018 Dongsoo Lee, Daehyun Ahn, Taesu Kim, Pierce I. Chuang, Jae-Joon Kim

Hence, pruning is usually restricted to inference with a batch size of one, for which an efficient parallel matrix-vector multiplication method exists.

Deep Neural Network Optimized to Resistive Memory with Nonlinear Current-Voltage Characteristics

no code implementations 30 Mar 2017 Hyungjun Kim, Taesu Kim, Jinseok Kim, Jae-Joon Kim

Artificial Neural Network computation relies on intensive vector-matrix multiplications.
