Search Results for author: HyungJun Kim

Found 14 papers, 5 papers with code

QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference

1 code implementation 15 Feb 2024 Taesu Kim, Jongho Lee, Daehyun Ahn, Sarang Kim, Jiwoong Choi, Minkyu Kim, HyungJun Kim

We introduce QUICK, a group of novel optimized CUDA kernels for the efficient inference of quantized Large Language Models (LLMs).

Quantization
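
The QUICK kernels themselves are CUDA code; as a rough, hedged illustration of what weight-only quantized inference computes (not the paper's kernel, weight layout, or packing scheme), here is a minimal symmetric 4-bit group-quantization round trip in NumPy. The group size and the on-the-fly dequantize-then-multiply pattern are assumptions for the example.

```python
import numpy as np

def quantize_4bit(w, group_size=128):
    """Symmetric per-group 4-bit quantization of a weight matrix (illustrative only)."""
    groups = w.reshape(-1, group_size)
    scale = np.abs(groups).max(axis=1, keepdims=True) / 7.0       # map each group into [-7, 7]
    q = np.clip(np.round(groups / scale), -8, 7).astype(np.int8)  # 4-bit integer codes
    return q, scale

def dequant_matmul(x, q, scale, out_shape):
    """Dequantize on the fly and multiply, as a weight-only-quantized linear layer would."""
    w_hat = (q.astype(np.float32) * scale).reshape(out_shape)
    return x @ w_hat

w = np.random.randn(1024, 1024).astype(np.float32)
q, s = quantize_4bit(w)
x = np.random.randn(1, 1024).astype(np.float32)
y = dequant_matmul(x, q, s, w.shape)
```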

SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks

1 code implementation 14 Feb 2024 Jiwon Song, Kyungseok Oh, Taesu Kim, HyungJun Kim, Yulhwa Kim, Jae-Joon Kim

In this paper, we introduce SLEB, a novel approach designed to streamline LLMs by eliminating redundant transformer blocks.
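
As a hedged sketch of block-level streamlining in the spirit of the abstract, one can score each transformer block by how little it changes its input on calibration data and greedily drop the most redundant blocks. The cosine-similarity criterion and greedy loop below are illustrative assumptions, not necessarily SLEB's redundancy metric.

```python
import numpy as np

def block_redundancy(block, x):
    """Score a block by the cosine similarity between its input and output:
    close to 1.0 means the block barely changes the hidden states (illustrative criterion)."""
    y = block(x)
    return np.sum(x * y) / (np.linalg.norm(x) * np.linalg.norm(y) + 1e-8)

def prune_blocks(blocks, calib_x, num_remove):
    """Greedily drop the blocks whose outputs are most similar to their inputs."""
    kept = list(blocks)
    for _ in range(num_remove):
        x = calib_x
        scores = []
        for blk in kept:
            scores.append(block_redundancy(blk, x))
            x = blk(x)                      # feed calibration activations forward
        kept.pop(int(np.argmax(scores)))    # remove the most redundant block
    return kept

# toy example: "blocks" are near-identity linear maps
rng = np.random.default_rng(0)
blocks = [lambda x, W=(np.eye(16) + 0.01 * rng.standard_normal((16, 16))): x @ W
          for _ in range(8)]
pruned = prune_blocks(blocks, rng.standard_normal((4, 16)), num_remove=2)
```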

Squeezing Large-Scale Diffusion Models for Mobile

no code implementations 3 Jul 2023 Jiwoong Choi, Minkyu Kim, Daehyun Ahn, Taesu Kim, Yulhwa Kim, Dongwon Jo, Hyesung Jeon, Jae-Joon Kim, HyungJun Kim

The emergence of diffusion models has greatly broadened the scope of high-fidelity image synthesis, resulting in notable advancements in both practical implementation and academic research.

Image Generation

OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models

2 code implementations 4 Jun 2023 Changhun Lee, Jungyu Jin, Taesu Kim, HyungJun Kim, Eunhyeok Park

Large language models (LLMs) with hundreds of billions of parameters require powerful server-grade GPUs for inference, limiting their practical deployment.

Quantization
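
The sentence above is motivational; as a heavily hedged sketch of the mixed-precision idea suggested by the title, a few weight rows tied to outlier activation channels can be kept in full precision while the rest are quantized to low precision. The activation-magnitude selection rule below is an assumption for illustration, not OWQ's actual sensitivity metric.

```python
import numpy as np

def mixed_precision_quantize(w, x_calib, n_keep=8, n_bits=3):
    """w has shape (in_features, out_features) and is used as y = x @ w.
    Keep the weight rows tied to the largest calibration activations in full precision,
    quantize the rest to low precision (illustrative selection rule, not OWQ's metric)."""
    chan_score = np.abs(x_calib).mean(axis=0)          # per-input-channel activation magnitude
    keep_rows = np.argsort(chan_score)[-n_keep:]       # rows multiplied by outlier activations
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    w_q = np.clip(np.round(w / scale), -qmax - 1, qmax) * scale
    w_q[keep_rows, :] = w[keep_rows, :]                # restore sensitive rows in full precision
    return w_q, keep_rows

w = np.random.randn(512, 512)
x_calib = np.random.randn(64, 512)
w_q, kept = mixed_precision_quantize(w, x_calib)
```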

Temporal Dynamic Quantization for Diffusion Models

no code implementations NeurIPS 2023 Junhyuk So, Jungwon Lee, Daehyun Ahn, HyungJun Kim, Eunhyeok Park

The diffusion model has gained popularity in vision applications due to its remarkable generative performance and versatility.

Quantization

INSTA-BNN: Binary Neural Network with INSTAnce-aware Threshold

no code implementations ICCV 2023 Changhun Lee, HyungJun Kim, Eunhyeok Park, Jae-Joon Kim

Binary Neural Networks (BNNs) have emerged as a promising solution for reducing the memory footprint and compute cost of deep neural networks, but they suffer from quality degradation because constraining activations and weights to binary values leaves little representational freedom.

Quantization
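
A minimal sketch of an instance-aware binary activation as suggested by the title: instead of a single fixed threshold, the threshold is shifted by a statistic computed from the current input. The mean-based shift and the scale factor below are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def instance_aware_binarize(x, base_threshold=0.0, alpha=0.1):
    """Binarize activations with a per-instance threshold: the threshold is offset
    by a statistic of this specific input (here, its mean)."""
    threshold = base_threshold + alpha * x.mean(axis=-1, keepdims=True)
    return np.where(x >= threshold, 1.0, -1.0)

x = np.random.randn(4, 256)            # a batch of 4 instances
a = instance_aware_binarize(x)
```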

Improving Accuracy of Binary Neural Networks using Unbalanced Activation Distribution

no code implementations CVPR 2021 HyungJun Kim, Jihoon Park, Changhun Lee, Jae-Joon Kim

We also show that adjusting the threshold values of binary activation functions results in an unbalanced distribution of the binary activations, which increases the accuracy of BNN models.

Binarization
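
The claim in the abstract is easy to visualize: shifting the threshold of a sign-style binary activation away from zero skews the +1/-1 ratio. The short sketch below just measures that imbalance for a few arbitrary threshold values; it does not reproduce the paper's training recipe.

```python
import numpy as np

def binarize(x, threshold=0.0):
    """Sign-style binary activation with an adjustable threshold."""
    return np.where(x >= threshold, 1.0, -1.0)

x = np.random.randn(100_000)
for t in (0.0, 0.2, 0.5):              # shifting the threshold skews the +1/-1 distribution
    a = binarize(x, threshold=t)
    print(f"threshold={t:+.1f}  fraction of +1: {np.mean(a > 0):.3f}")
```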

SUMBT+LaRL: Effective Multi-domain End-to-end Neural Task-oriented Dialog System

no code implementations 22 Sep 2020 Hwaran Lee, Seokhwan Jo, HyungJun Kim, SangKeun Jung, Tae-Yoon Kim

To the best of our knowledge, our work is the first comprehensive study of a modularized E2E multi-domain dialog system in which learning ranges from each individual component to the entire dialog policy for task success.

Reinforcement Learning (RL)

Empirical Strategy for Stretching Probability Distribution in Neural-network-based Regression

no code implementations 8 Sep 2020 Eunho Koo, HyungJun Kim

In this study, we considered the distribution error, i.e., the inconsistency between two distributions (those of the predicted values and the labels), as the prediction error, and proposed weighted empirical stretching (WES) as a novel loss function to increase the overlap area of the two distributions.

regression
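
The exact WES formula is not given in this snippet. Purely as an illustrative stand-in for the idea of re-weighting errors so that the predicted distribution stretches to cover the label distribution, one could weight squared errors by how far each label sits in the tail of the empirical label distribution; the weighting below is an assumption, not the paper's loss.

```python
import numpy as np

def weighted_stretching_loss(pred, label):
    """Illustrative weighted regression loss: samples whose labels sit in the tails of the
    label distribution get larger weights, pushing predictions to cover those tails.
    NOT the paper's WES formula, just a stand-in for the re-weighting idea."""
    z = (label - label.mean()) / (label.std() + 1e-8)
    weights = 1.0 + np.abs(z)                     # heavier weight on tail labels
    return np.mean(weights * (pred - label) ** 2)

pred = np.random.randn(1000)
label = np.random.randn(1000) * 2.0
loss = weighted_stretching_loss(pred, label)
```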

BinaryDuo: Reducing Gradient Mismatch in Binary Activation Network by Coupling Binary Activations

1 code implementation ICLR 2020 Hyungjun Kim, Kyung-Su Kim, Jinseok Kim, Jae-Joon Kim

Binary Neural Networks (BNNs) have been garnering interest thanks to their compute cost reduction and memory savings.
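
The snippet above is introductory; as a hedged sketch of the coupling idea named in the title, two binary activations with different thresholds can be summed into a single ternary-valued activation. The thresholds below and the {-2, 0, +2} coding are illustrative choices, and the paper's pretraining/decoupling procedure is not reproduced here.

```python
import numpy as np

def binary_act(x, threshold=0.0):
    """Standard sign-style binary activation with a threshold."""
    return np.where(x >= threshold, 1.0, -1.0)

def coupled_binary_act(x):
    """Couple two binary activations with different thresholds into one ternary-valued
    activation in {-2, 0, +2} (hedged sketch of the coupling idea only)."""
    return binary_act(x, threshold=-0.5) + binary_act(x, threshold=0.5)

x = np.linspace(-2, 2, 9)
print(coupled_binary_act(x))
```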

Zero-shifting Technique for Deep Neural Network Training on Resistive Cross-point Arrays

no code implementations 24 Jul 2019 Hyungjun Kim, Malte Rasch, Tayfun Gokmen, Takashi Ando, Hiroyuki Miyazoe, Jae-Joon Kim, John Rozen, Seyoung Kim

By using this zero-shifting method, we show that network performance dramatically improves for imbalanced synapse devices.

BitSplit-Net: Multi-bit Deep Neural Network with Bitwise Activation Function

no code implementations 23 Mar 2019 Hyungjun Kim, Yulhwa Kim, Sungju Ryu, Jae-Joon Kim

We demonstrate that the BitSplit version of LeNet-5, VGG-9, AlexNet, and ResNet-18 can be trained to have similar classification accuracy at a lower computational cost compared to conventional multi-bit networks with low bit precision (<= 4-bit).
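
A minimal sketch of a bitwise activation in the spirit of the title: a low-precision activation is expressed as a weighted sum of binary (0/1) bit-planes, so each bit can be handled with binary operations. The unsigned 2-bit encoding below is an assumption for illustration, not the paper's exact activation function.

```python
import numpy as np

def bitwise_activation(x, n_bits=2):
    """Quantize activations to n_bits and return the binary bit-planes plus their weights,
    so that sum_k (2**k) * bit_plane[k] reconstructs the quantized activation."""
    levels = 2 ** n_bits - 1
    q = np.clip(np.round(np.clip(x, 0.0, 1.0) * levels), 0, levels).astype(np.int32)
    planes = [((q >> k) & 1).astype(np.float32) for k in range(n_bits)]
    weights = [float(2 ** k) for k in range(n_bits)]
    return planes, weights

x = np.random.rand(5)
planes, weights = bitwise_activation(x)
reconstructed = sum(w * p for w, p in zip(planes, weights)) / 3.0   # back to [0, 1] (levels = 3)
```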

Neural Network-Hardware Co-design for Scalable RRAM-based BNN Accelerators

1 code implementation 6 Nov 2018 Yulhwa Kim, HyungJun Kim, Jae-Joon Kim

Recently, RRAM-based Binary Neural Network (BNN) hardware has been gaining interest because it requires only 1-bit sense amplifiers and eliminates the need for high-resolution ADCs and DACs.

Neural Network simulation
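
The abstract concerns the hardware (1-bit sense amplifiers instead of high-resolution ADCs/DACs); the computation such a BNN accelerator maps onto a crossbar column is the standard XNOR/popcount binary dot product, sketched below in plain Python. The crossbar mapping details of the paper are not reproduced.

```python
import numpy as np

def xnor_popcount_dot(a_bits, w_bits):
    """Binary dot product between {-1,+1} vectors encoded as {0,1} bits:
    dot = 2 * popcount(XNOR(a, w)) - n, the per-column operation a BNN accelerator implements."""
    n = a_bits.size
    xnor = ~(a_bits ^ w_bits) & 1          # 1 where the two bits agree
    return 2 * int(xnor.sum()) - n

a = np.random.randint(0, 2, size=64)       # activations as bits (1 -> +1, 0 -> -1)
w = np.random.randint(0, 2, size=64)       # weights as bits
assert xnor_popcount_dot(a, w) == int((2 * a - 1) @ (2 * w - 1))
```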

Deep Neural Network Optimized to Resistive Memory with Nonlinear Current-Voltage Characteristics

no code implementations 30 Mar 2017 Hyungjun Kim, Taesu Kim, Jinseok Kim, Jae-Joon Kim

Artificial Neural Network computation relies on intensive vector-matrix multiplications.
