Search Results for author: Youngsok Kim

Found 11 papers, 5 papers with code

Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System

1 code implementation • 11 Mar 2024 • Hongsun Jang, Jaeyong Song, Jaewon Jung, Jaeyoung Park, Youngsok Kim, Jinho Lee

Our work, Smart-Infinity, addresses the storage bandwidth bottleneck of storage-offloaded LLM training using near-storage processing devices on a real system.

Language Modelling • Large Language Model

GraNNDis: Efficient Unified Distributed Training Framework for Deep GNNs on Large Clusters

no code implementations • 12 Nov 2023 • Jaeyong Song, Hongsun Jang, Jaewon Jung, Youngsok Kim, Jinho Lee

With the growth of the datasets and model sizes used for GNNs, an important problem is that it becomes nearly impossible to keep the whole network in GPU memory.

Pipe-BD: Pipelined Parallel Blockwise Distillation

1 code implementation • 29 Jan 2023 • Hongsun Jang, Jaewon Jung, Jaeyong Song, Joonsang Yu, Youngsok Kim, Jinho Lee

However, this results in high overhead from redundant teacher execution, low GPU utilization, and extra data loading.

SGCN: Exploiting Compressed-Sparse Features in Deep Graph Convolutional Network Accelerators

1 code implementation • 25 Jan 2023 • Mingi Yoo, Jaeyong Song, Jounghoo Lee, Namhyung Kim, Youngsok Kim, Jinho Lee

A GCN takes as input an arbitrarily structured graph and executes a series of layers which exploit the graph's structure to calculate their output features.

Feature Compression
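For context on the layer computation the SGCN entry describes, a single GCN layer multiplies the node features by a learned weight matrix and then aggregates over the (normalized) graph structure. The sketch below is a minimal, generic illustration of that computation in PyTorch; it is not code from the paper, and the names (`adj_norm`, `feats`) are illustrative assumptions.

```python
# Minimal sketch of one GCN layer, H' = sigma(A_hat @ H @ W).
# Generic illustration only, not the SGCN accelerator's implementation.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, adj_norm: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        # adj_norm: normalized adjacency matrix, shape [N, N] (possibly sparse)
        # feats:    node feature matrix, shape [N, in_dim]
        support = self.weight(feats)                      # H @ W
        if adj_norm.is_sparse:
            out = torch.sparse.mm(adj_norm, support)      # A_hat @ (H @ W)
        else:
            out = adj_norm @ support
        return torch.relu(out)                            # per-layer nonlinearity
```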

Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression

no code implementations • 24 Jan 2023 • Jaeyong Song, Jinkyu Yim, Jaewon Jung, Hongsun Jang, Hyung-Jin Kim, Youngsok Kim, Jinho Lee

Compressing the communication is one way to mitigate the overhead by reducing the inter-node traffic volume; however, existing compression techniques have critical limitations when applied to NLP models trained with 3D parallelism: 1) they target only the data-parallelism traffic, and 2) the existing compression schemes already harm the model quality too much.
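For readers unfamiliar with "compressing the data-parallelism traffic": it typically means sparsifying or quantizing the gradients exchanged between data-parallel replicas before the all-reduce. The following top-k sparsification sketch is a generic example of such a scheme under that assumption; it is not the compression method proposed in Optimus-CC.

```python
# Generic top-k gradient sparsification, a common way to shrink data-parallel
# (all-reduce) traffic. Illustrative only; not the Optimus-CC scheme.
import math
import torch

def topk_compress(grad: torch.Tensor, ratio: float = 0.01):
    """Keep only the largest-magnitude `ratio` fraction of gradient entries."""
    flat = grad.flatten()
    k = max(1, int(flat.numel() * ratio))
    _, indices = torch.topk(flat.abs(), k)
    # Send (indices, values) instead of the full dense gradient tensor.
    return indices, flat[indices]

def topk_decompress(indices: torch.Tensor, values: torch.Tensor, shape) -> torch.Tensor:
    """Rebuild a dense gradient from the compressed (indices, values) pair."""
    flat = torch.zeros(math.prod(shape), device=values.device, dtype=values.dtype)
    flat[indices] = values
    return flat.view(shape)
```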

Slice-and-Forge: Making Better Use of Caches for Graph Convolutional Network Accelerators

no code implementations • 24 Jan 2023 • Mingi Yoo, Jaeyong Song, Hyeyoon Lee, Jounghoo Lee, Namhyung Kim, Youngsok Kim, Jinho Lee

Graph convolutional networks (GCNs) are becoming increasingly popular as they can process a wide variety of data formats that prior deep neural networks cannot easily support.

Enabling Hard Constraints in Differentiable Neural Network and Accelerator Co-Exploration

no code implementations • 23 Jan 2023 • Deokki Hong, Kanghyun Choi, Hye Yoon Lee, Joonsang Yu, Noseong Park, Youngsok Kim, Jinho Lee

Co-exploration of an optimal neural architecture and its hardware accelerator is an approach of rising interest which addresses the computational cost problem, especially in low-profile systems.

Neural Architecture Search

ConCoDE: Hard-constrained Differentiable Co-Exploration Method for Neural Architectures and Hardware Accelerators

no code implementations • 29 Sep 2021 • Deokki Hong, Kanghyun Choi, Hye Yoon Lee, Joonsang Yu, Youngsok Kim, Noseong Park, Jinho Lee

To handle the hard constraint problem of differentiable co-exploration, we propose ConCoDE, which searches for hard-constrained solutions without compromising the global design objectives.

Neural Architecture Search

DANCE: Differentiable Accelerator/Network Co-Exploration

no code implementations • 14 Sep 2020 • Kanghyun Choi, Deokki Hong, Hojae Yoon, Joonsang Yu, Youngsok Kim, Jinho Lee

In such circumstances, this work presents DANCE, a differentiable approach towards the co-exploration of the hardware accelerator and network architecture design.

Neural Architecture Search
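As background for the co-exploration entries above (DANCE, ConCoDE, and the hard-constraints paper), "differentiable co-exploration" generally means that relaxed architecture choices and accelerator design choices both receive gradients from a joint objective that combines the task loss with a differentiable hardware-cost estimate. The sketch below illustrates that idea only; it is not the DANCE or ConCoDE implementation, and `cost_model` stands in for an assumed differentiable latency/energy surrogate.

```python
# Generic sketch of a joint objective for differentiable network/accelerator
# co-exploration. Illustrative only; not the authors' method.
import torch

def joint_loss(task_loss: torch.Tensor,
               alpha: torch.Tensor,   # relaxed architecture choices (e.g. logits over candidate ops)
               beta: torch.Tensor,    # relaxed accelerator design choices (e.g. logits over HW configs)
               cost_model,            # assumed differentiable surrogate: (alpha, beta) -> predicted cost
               lam: float = 0.1) -> torch.Tensor:
    # Softmax turns the logits into soft selections so the cost estimate stays differentiable.
    hw_cost = cost_model(alpha.softmax(dim=-1), beta.softmax(dim=-1))
    # Gradients of the combined objective flow to both alpha and beta.
    return task_loss + lam * hw_cost
```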
