Search Results for author: Nam Sung Kim

Found 13 papers, 3 papers with code

SPADE: Sparse Pillar-based 3D Object Detection Accelerator for Autonomous Driving

no code implementations · 12 May 2023 · Minjae Lee, Seongmin Park, Hyungmin Kim, Minyong Yoon, Janghwan Lee, Jun Won Choi, Nam Sung Kim, Mingu Kang, Jungwook Choi

3D object detection using point cloud (PC) data is essential for perception pipelines of autonomous driving, where efficient encoding is key to meeting stringent resource and latency requirements.

3D Object Detection · Autonomous Driving +2
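
The snippet below is a minimal, generic sketch of pillar-based point-cloud encoding (in the style of PointPillars-like encoders): points are binned into vertical columns on a bird's-eye-view grid, and most pillars end up empty, which is the sparsity a pillar-based accelerator can exploit. The grid extent, resolution, and synthetic points are assumptions for illustration, not values from the SPADE paper.

```python
# Generic pillar-encoding sketch: bin points into vertical columns ("pillars")
# on a bird's-eye-view grid and count how many pillars are actually occupied.
# Grid extent, resolution, and the synthetic point cloud are assumed values.
import numpy as np

rng = np.random.default_rng(0)
points = rng.uniform(low=[-40.0, -40.0, -3.0], high=[40.0, 40.0, 1.0], size=(20_000, 3))

x_range, y_range, pillar_size = (-40.0, 40.0), (-40.0, 40.0), 0.25
nx = int((x_range[1] - x_range[0]) / pillar_size)
ny = int((y_range[1] - y_range[0]) / pillar_size)

# Map each point to its pillar index on the BEV grid.
ix = np.clip(((points[:, 0] - x_range[0]) / pillar_size).astype(int), 0, nx - 1)
iy = np.clip(((points[:, 1] - y_range[0]) / pillar_size).astype(int), 0, ny - 1)
occupied = np.unique(ix * ny + iy)

print(f"non-empty pillars: {occupied.size} / {nx * ny} "
      f"({100 * occupied.size / (nx * ny):.1f}% occupancy)")
```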

Defensive ML: Defending Architectural Side-channels with Adversarial Obfuscation

no code implementations · 3 Feb 2023 · Hyoungwook Nam, Raghavendra Pradyumna Pothukuchi, Bo Li, Nam Sung Kim, Josep Torrellas

To address this problem, this paper explores using Adversarial Machine Learning (AML) methods as a defense at the computer architecture layer to obfuscate side channels.

Computer Security
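
As a rough illustration of adversarial obfuscation (not the paper's actual defense mechanism), the sketch below perturbs a synthetic side-channel trace with an FGSM-style step against a hypothetical linear surrogate classifier that tries to recover a secret bit; the trace, the surrogate model, and the perturbation budget are all assumptions.

```python
# FGSM-style obfuscation sketch against a hypothetical linear surrogate attacker.
# The trace, surrogate weights, and epsilon are made up for illustration; a real
# defense would target the attacker's actual model of the architectural side channel.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
trace = rng.normal(size=128)          # e.g. a cache-timing trace (synthetic)
secret = 1                            # the bit the attacker tries to infer
w, b = rng.normal(size=128), 0.0      # surrogate logistic-regression attacker

# Gradient of the attacker's cross-entropy loss w.r.t. the input trace:
# dL/dx = (sigmoid(w.x + b) - y) * w; we ascend this gradient to hurt the attacker.
grad = (sigmoid(w @ trace + b) - secret) * w
epsilon = 0.1                         # obfuscation budget (assumed)
obfuscated = trace + epsilon * np.sign(grad)

print("attacker confidence before:", sigmoid(w @ trace + b))
print("attacker confidence after :", sigmoid(w @ obfuscated + b))
```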

Harmony: Overcoming the Hurdles of GPU Memory Capacity to Train Massive DNN Models on Commodity Servers

1 code implementation · 2 Feb 2022 · Youjie Li, Amar Phanishayee, Derek Murray, Jakub Tarnawski, Nam Sung Kim

Deep neural networks (DNNs) have grown exponentially in size over the past decade, leaving only those who have massive datacenter-based resources with the ability to develop and train such models.

BDS-GCN: Efficient Full-Graph Training of Graph Convolutional Nets with Partition-Parallelism and Boundary Sampling

no code implementations · 1 Jan 2021 · Cheng Wan, Youjie Li, Nam Sung Kim, Yingyan Lin

While it is natural to leverage graph partitioning and distributed training to tackle this challenge, this direction has only been lightly explored, owing to the unique challenges posed by GCN structures: in particular, the excessive number of boundary nodes in each partitioned subgraph, which can easily blow up the memory and communication required for distributed GCN training.
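
To make the boundary-node problem concrete, the toy sketch below partitions a small random graph and counts, for each partition, how many out-of-partition 1-hop neighbors a naive partition-parallel scheme would have to fetch. This only illustrates the memory/communication blow-up the abstract refers to; it is not BDS-GCN's boundary-sampling algorithm, and the graph size, degree, and partition count are assumptions.

```python
# Toy illustration of why boundary nodes dominate in partition-parallel GCN training:
# each partition must also hold the 1-hop neighbors owned by other partitions.
import numpy as np

rng = np.random.default_rng(0)
num_nodes, avg_degree, num_parts = 10_000, 10, 4

# Random undirected edges (Erdos-Renyi-like, for illustration only).
src = rng.integers(0, num_nodes, size=num_nodes * avg_degree)
dst = rng.integers(0, num_nodes, size=num_nodes * avg_degree)
part = rng.integers(0, num_parts, size=num_nodes)   # random node-to-partition map

for p in range(num_parts):
    owned = np.flatnonzero(part == p)
    # Neighbors of owned nodes that live in other partitions = boundary nodes.
    mask = (part[src] == p) | (part[dst] == p)
    neigh = np.unique(np.concatenate([src[mask], dst[mask]]))
    boundary = neigh[part[neigh] != p]
    print(f"partition {p}: owns {owned.size} nodes, "
          f"needs {boundary.size} boundary nodes ({boundary.size / owned.size:.1f}x)")
```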

Intra-layer Neural Architecture Search

no code implementations · 1 Jan 2021 · Dong Kai Wang, Nam Sung Kim

This work addresses the challenges of NAS in a search space of intra-layer weight connections, in particular the vastly larger number of architecture variations compared to a high-level search space with predetermined layer types.

Multi-Task Learning · Neural Architecture Search
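
A quick back-of-the-envelope comparison of the two search-space sizes the abstract alludes to; the layer width and the number of candidate layer types below are made-up numbers, not values from the paper.

```python
# Rough comparison of search-space sizes: choosing a binary connection mask inside
# one m x n layer vs. choosing one of k predefined layer types for that layer.
from math import log10

m, n, k = 64, 64, 8            # assumed layer width and number of candidate layer types

intra_layer_choices_log10 = (m * n) * log10(2)   # 2^(m*n) binary connection masks
layer_type_choices_log10 = log10(k)              # k high-level choices

print(f"intra-layer masks per layer  : ~10^{intra_layer_choices_log10:.0f}")
print(f"predetermined types per layer: ~10^{layer_type_choices_log10:.1f}")
```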

IOCA: High-Speed I/O-Aware LLC Management for Network-Centric Multi-Tenant Platform

no code implementations · 9 Jul 2020 · Yifan Yuan, Mohammad Alian, Yipeng Wang, Ilia Kurakin, Ren Wang, Charlie Tai, Nam Sung Kim

In this paper, we argue that, besides CPU cores, high-speed network I/O is also important to consider in LLC management.

Hardware Architecture · Operating Systems

Bit-Parallel Vector Composability for Neural Acceleration

no code implementations · 11 Apr 2020 · Soroush Ghodrati, Hardik Sharma, Cliff Young, Nam Sung Kim, Hadi Esmaeilzadeh

This paper explores a different design style in which each unit is responsible only for a slice of the bit-level operations, interleaving and combining the benefits of bit-level parallelism with the abundant data-level parallelism in deep neural networks.
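
A scalar illustration of composing a wide multiplication from bit-level slices (the 4-bit slice width and the operands are assumptions, and this shows only the arithmetic identity, not the accelerator's datapath):

```python
# Compose an 8-bit x 8-bit multiply from 4-bit slices: each "unit" handles one
# pair of low/high nibbles, and the partial products are combined with shifts.
def bit_sliced_mul(a: int, b: int, slice_bits: int = 4) -> int:
    mask = (1 << slice_bits) - 1
    a_lo, a_hi = a & mask, a >> slice_bits
    b_lo, b_hi = b & mask, b >> slice_bits
    # Four narrow partial products, shifted into place and summed.
    return (a_lo * b_lo
            + ((a_lo * b_hi + a_hi * b_lo) << slice_bits)
            + ((a_hi * b_hi) << (2 * slice_bits)))

# The same composition applies element-wise across a dot product, which is where
# the abundant data-level parallelism of DNNs comes in.
xs, ws = [23, 200, 7, 150], [91, 3, 255, 17]
assert sum(bit_sliced_mul(x, w) for x, w in zip(xs, ws)) == sum(x * w for x, w in zip(xs, ws))
print(bit_sliced_mul(173, 91), 173 * 91)   # both print 15743
```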

Mixed-Signal Charge-Domain Acceleration of Deep Neural Networks through Interleaved Bit-Partitioned Arithmetic

no code implementations · 27 Jun 2019 · Soroush Ghodrati, Hardik Sharma, Sean Kinzer, Amir Yazdanbakhsh, Kambiz Samadi, Nam Sung Kim, Doug Burger, Hadi Esmaeilzadeh

The low-power potential of mixed-signal design makes it an alluring option for accelerating Deep Neural Networks (DNNs).

Hardware Architecture

Pipe-SGD: A Decentralized Pipelined SGD Framework for Distributed Deep Net Training

no code implementations · NeurIPS 2018 · Youjie Li, Mingchao Yu, Songze Li, Salman Avestimehr, Nam Sung Kim, Alexander Schwing

Distributed training of deep nets is an important technique for addressing some of today's computing challenges, such as memory consumption and computational demands.
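
The sketch below shows the generic idea behind pipelining distributed SGD: gradient communication for step t overlaps with computation of step t+1, so updates are applied with one step of staleness. This is a schematic illustration only; the decentralized gradient exchange, the synchronization details, and the toy objective are simplified assumptions, not Pipe-SGD's exact algorithm.

```python
# Schematic one-step-stale pipelined SGD on a 1-D quadratic toy problem.
# In a real system the gradient exchange (all-reduce) would run in the background
# while the next forward/backward pass executes; here we only model the resulting
# staleness: the update at step t uses the gradient computed at step t-1.
def grad(w):                        # gradient of f(w) = (w - 3)^2 / 2
    return w - 3.0

w, lr, in_flight = 10.0, 0.1, None  # in_flight = gradient still being communicated
for step in range(50):
    g = grad(w)                     # compute this step's gradient
    if in_flight is not None:       # apply the gradient that just finished communicating
        w -= lr * in_flight
    in_flight = g                   # hand the fresh gradient to the communication stage

print(f"converged weight ~ {w:.3f} (optimum is 3.0)")
```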

GANAX: A Unified MIMD-SIMD Acceleration for Generative Adversarial Networks

no code implementations · 10 May 2018 · Amir Yazdanbakhsh, Hajar Falahati, Philip J. Wolfe, Kambiz Samadi, Nam Sung Kim, Hadi Esmaeilzadeh

Even though this operator (transposed convolution) includes a convolution stage, the inserted zeros lead to underutilization of the compute resources when a conventional convolution accelerator is employed.
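
The point about inserted zeros can be seen directly: a strided transposed convolution is commonly implemented by upsampling the input with zeros between elements and then running an ordinary convolution, so on a dense accelerator a large share of multiply operands are zero. A small illustrative count (feature-map size and stride are arbitrary assumptions):

```python
# Zero-insertion view of a stride-2 transposed convolution: upsample the input by
# inserting zeros, then run an ordinary convolution. On a dense convolution
# accelerator, every multiply with an inserted zero is wasted work.
import numpy as np

h = w = 16
x = np.random.default_rng(0).normal(size=(h, w))   # generator feature map (synthetic)

stride = 2
up = np.zeros((h * stride, w * stride))
up[::stride, ::stride] = x                          # zero-inserted (upsampled) input

zero_fraction = 1.0 - (x.size / up.size)
print(f"upsampled size: {up.shape}, inserted zeros: {zero_fraction:.0%} of operand values")
# -> 75% of the values feeding the convolution are zeros for stride 2.
```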
