Search Results for author: Nam Sung Kim

Found 13 papers, 3 papers with code

SPADE: Sparse Pillar-based 3D Object Detection Accelerator for Autonomous Driving

no code implementations · 12 May 2023 · Minjae Lee, Seongmin Park, Hyungmin Kim, Minyong Yoon, Janghwan Lee, Jun Won Choi, Nam Sung Kim, Mingu Kang, Jungwook Choi

3D object detection using point cloud (PC) data is essential for perception pipelines of autonomous driving, where efficient encoding is key to meeting stringent resource and latency requirements.

3D Object Detection · Autonomous Driving +2
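
The snippet below is a minimal, generic sketch of pillar-based point-cloud encoding (in the style of PointPillars-like encoders): points are binned into vertical columns on a bird's-eye-view grid, and most pillars end up empty, which is the sparsity a pillar-based accelerator can exploit. The grid extent, resolution, and synthetic points are assumptions for illustration, not values from the SPADE paper.

```python
# Generic pillar-encoding sketch: bin points into vertical columns ("pillars")
# on a bird's-eye-view grid and count how many pillars are actually occupied.
# Grid extent, resolution, and the synthetic point cloud are assumed values.
import numpy as np

rng = np.random.default_rng(0)
points = rng.uniform(low=[-40.0, -40.0, -3.0], high=[40.0, 40.0, 1.0], size=(20_000, 3))

x_range, y_range, pillar_size = (-40.0, 40.0), (-40.0, 40.0), 0.25
nx = int((x_range[1] - x_range[0]) / pillar_size)
ny = int((y_range[1] - y_range[0]) / pillar_size)

# Map each point to its pillar index on the BEV grid.
ix = np.clip(((points[:, 0] - x_range[0]) / pillar_size).astype(int), 0, nx - 1)
iy = np.clip(((points[:, 1] - y_range[0]) / pillar_size).astype(int), 0, ny - 1)
occupied = np.unique(ix * ny + iy)

print(f"non-empty pillars: {occupied.size} / {nx * ny} "
      f"({100 * occupied.size / (nx * ny):.1f}% occupancy)")
```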

Defensive ML: Defending Architectural Side-channels with Adversarial Obfuscation

no code implementations · 3 Feb 2023 · Hyoungwook Nam, Raghavendra Pradyumna Pothukuchi, Bo Li, Nam Sung Kim, Josep Torrellas

To address this problem, this paper explores using Adversarial Machine Learning (AML) methods as a defense at the computer architecture layer to obfuscate side channels.

Computer Security
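
As a rough illustration of adversarial obfuscation (not the paper's actual defense mechanism), the sketch below perturbs a synthetic side-channel trace with an FGSM-style step against a hypothetical linear surrogate classifier that tries to recover a secret bit; the trace, the surrogate model, and the perturbation budget are all assumptions.

```python
# FGSM-style obfuscation sketch against a hypothetical linear surrogate attacker.
# The trace, surrogate weights, and epsilon are made up for illustration; a real
# defense would target the attacker's actual model of the architectural side channel.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
trace = rng.normal(size=128)          # e.g. a cache-timing trace (synthetic)
secret = 1                            # the bit the attacker tries to infer
w, b = rng.normal(size=128), 0.0      # surrogate logistic-regression attacker

# Gradient of the attacker's cross-entropy loss w.r.t. the input trace:
# dL/dx = (sigmoid(w.x + b) - y) * w; we ascend this gradient to hurt the attacker.
grad = (sigmoid(w @ trace + b) - secret) * w
epsilon = 0.1                         # obfuscation budget (assumed)
obfuscated = trace + epsilon * np.sign(grad)

print("attacker confidence before:", sigmoid(w @ trace + b))
print("attacker confidence after :", sigmoid(w @ obfuscated + b))
```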

Harmony: Overcoming the Hurdles of GPU Memory Capacity to Train Massive DNN Models on Commodity Servers

1 code implementation · 2 Feb 2022 · Youjie Li, Amar Phanishayee, Derek Murray, Jakub Tarnawski, Nam Sung Kim

Deep neural networks (DNNs) have grown exponentially in size over the past decade, leaving only those who have massive datacenter-based resources with the ability to develop and train such models.

BDS-GCN: Efficient Full-Graph Training of Graph Convolutional Nets with Partition-Parallelism and Boundary Sampling

no code implementations · 1 Jan 2021 · Cheng Wan, Youjie Li, Nam Sung Kim, Yingyan Lin

While it is natural to leverage graph partitioning and distributed training to tackle this challenge, this direction has only been lightly explored, owing to the unique challenges posed by GCN structures: in particular, the excessive number of boundary nodes in each partitioned subgraph, which can easily blow up the memory and communication required for distributed GCN training.
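
To make the boundary-node problem concrete, the toy sketch below partitions a small random graph and counts, for each partition, how many out-of-partition 1-hop neighbors a naive partition-parallel scheme would have to fetch. This only illustrates the memory/communication blow-up the abstract refers to; it is not BDS-GCN's boundary-sampling algorithm, and the graph size, degree, and partition count are assumptions.

```python
# Toy illustration of why boundary nodes dominate in partition-parallel GCN training:
# each partition must also hold the 1-hop neighbors owned by other partitions.
import numpy as np

rng = np.random.default_rng(0)
num_nodes, avg_degree, num_parts = 10_000, 10, 4

# Random undirected edges (Erdos-Renyi-like, for illustration only).
src = rng.integers(0, num_nodes, size=num_nodes * avg_degree)
dst = rng.integers(0, num_nodes, size=num_nodes * avg_degree)
part = rng.integers(0, num_parts, size=num_nodes)   # random node-to-partition map

for p in range(num_parts):
    owned = np.flatnonzero(part == p)
    # Neighbors of owned nodes that live in other partitions = boundary nodes.
    mask = (part[src] == p) | (part[dst] == p)
    neigh = np.unique(np.concatenate([src[mask], dst[mask]]))
    boundary = neigh[part[neigh] != p]
    print(f"partition {p}: owns {owned.size} nodes, "
          f"needs {boundary.size} boundary nodes ({boundary.size / owned.size:.1f}x)")
```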

Intra-layer Neural Architecture Search

no code implementations · 1 Jan 2021 · Dong Kai Wang, Nam Sung Kim

This work addresses the challenges of NAS in a search space of intra-layer weight connections, in particular the vastly larger number of architecture variations compared to a high-level search space with predetermined layer types.

Multi-Task Learning · Neural Architecture Search
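
A quick back-of-the-envelope comparison of the two search-space sizes the abstract alludes to; the layer width and the number of candidate layer types below are made-up numbers, not values from the paper.

```python
# Rough comparison of search-space sizes: choosing a binary connection mask inside
# one m x n layer vs. choosing one of k predefined layer types for that layer.
from math import log10

m, n, k = 64, 64, 8            # assumed layer width and number of candidate layer types

intra_layer_choices_log10 = (m * n) * log10(2)   # 2^(m*n) binary connection masks
layer_type_choices_log10 = log10(k)              # k high-level choices

print(f"intra-layer masks per layer  : ~10^{intra_layer_choices_log10:.0f}")
print(f"predetermined types per layer: ~10^{layer_type_choices_log10:.1f}")
```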

IOCA: High-Speed I/O-Aware LLC Management for Network-Centric Multi-Tenant Platform

no code implementations · 9 Jul 2020 · Yifan Yuan, Mohammad Alian, Yipeng Wang, Ilia Kurakin, Ren Wang, Charlie Tai, Nam Sung Kim

In this paper, we argue that, besides CPU cores, high-speed network I/O is also important to consider in LLC management.

Hardware Architecture · Operating Systems

Bit-Parallel Vector Composability for Neural Acceleration

no code implementations · 11 Apr 2020 · Soroush Ghodrati, Hardik Sharma, Cliff Young, Nam Sung Kim, Hadi Esmaeilzadeh

This paper explores a different design style in which each unit is responsible only for a slice of the bit-level operations, interleaving and combining the benefits of bit-level parallelism with the abundant data-level parallelism in deep neural networks.
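
A scalar illustration of composing a wide multiplication from bit-level slices (the 4-bit slice width and the operands are assumptions, and this shows only the arithmetic identity, not the accelerator's datapath):

```python
# Compose an 8-bit x 8-bit multiply from 4-bit slices: each "unit" handles one
# pair of low/high nibbles, and the partial products are combined with shifts.
def bit_sliced_mul(a: int, b: int, slice_bits: int = 4) -> int:
    mask = (1 << slice_bits) - 1
    a_lo, a_hi = a & mask, a >> slice_bits
    b_lo, b_hi = b & mask, b >> slice_bits
    # Four narrow partial products, shifted into place and summed.
    return (a_lo * b_lo
            + ((a_lo * b_hi + a_hi * b_lo) << slice_bits)
            + ((a_hi * b_hi) << (2 * slice_bits)))

# The same composition applies element-wise across a dot product, which is where
# the abundant data-level parallelism of DNNs comes in.
xs, ws = [23, 200, 7, 150], [91, 3, 255, 17]
assert sum(bit_sliced_mul(x, w) for x, w in zip(xs, ws)) == sum(x * w for x, w in zip(xs, ws))
print(bit_sliced_mul(173, 91), 173 * 91)   # both print 15743
```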

Mixed-Signal Charge-Domain Acceleration of Deep Neural Networks through Interleaved Bit-Partitioned Arithmetic

no code implementations · 27 Jun 2019 · Soroush Ghodrati, Hardik Sharma, Sean Kinzer, Amir Yazdanbakhsh, Kambiz Samadi, Nam Sung Kim, Doug Burger, Hadi Esmaeilzadeh

The low-power potential of mixed-signal design makes it an alluring option for accelerating Deep Neural Networks (DNNs).

Hardware Architecture

Pipe-SGD: A Decentralized Pipelined SGD Framework for Distributed Deep Net Training

no code implementations · NeurIPS 2018 · Youjie Li, Mingchao Yu, Songze Li, Salman Avestimehr, Nam Sung Kim, Alexander Schwing

Distributed training of deep nets is an important technique for addressing some of today's computing challenges, such as memory consumption and computational demands.
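
The sketch below shows the generic idea behind pipelining distributed SGD: gradient communication for step t overlaps with computation of step t+1, so updates are applied with one step of staleness. This is a schematic illustration only; the decentralized gradient exchange, the synchronization details, and the toy objective are simplified assumptions, not Pipe-SGD's exact algorithm.

```python
# Schematic one-step-stale pipelined SGD on a 1-D quadratic toy problem.
# In a real system the gradient exchange (all-reduce) would run in the background
# while the next forward/backward pass executes; here we only model the resulting
# staleness: the update at step t uses the gradient computed at step t-1.
def grad(w):                        # gradient of f(w) = (w - 3)^2 / 2
    return w - 3.0

w, lr, in_flight = 10.0, 0.1, None  # in_flight = gradient still being communicated
for step in range(50):
    g = grad(w)                     # compute this step's gradient
    if in_flight is not None:       # apply the gradient that just finished communicating
        w -= lr * in_flight
    in_flight = g                   # hand the fresh gradient to the communication stage

print(f"converged weight ~ {w:.3f} (optimum is 3.0)")
```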

GANAX: A Unified MIMD-SIMD Acceleration for Generative Adversarial Networks

no code implementations · 10 May 2018 · Amir Yazdanbakhsh, Hajar Falahati, Philip J. Wolfe, Kambiz Samadi, Nam Sung Kim, Hadi Esmaeilzadeh

Even though this operator (transposed convolution) includes a convolution stage, the inserted zeros lead to underutilization of the compute resources when a conventional convolution accelerator is employed.
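
The point about inserted zeros can be seen directly: a strided transposed convolution is commonly implemented by upsampling the input with zeros between elements and then running an ordinary convolution, so on a dense accelerator a large share of multiply operands are zero. A small illustrative count (feature-map size and stride are arbitrary assumptions):

```python
# Zero-insertion view of a stride-2 transposed convolution: upsample the input by
# inserting zeros, then run an ordinary convolution. On a dense convolution
# accelerator, every multiply with an inserted zero is wasted work.
import numpy as np

h = w = 16
x = np.random.default_rng(0).normal(size=(h, w))   # generator feature map (synthetic)

stride = 2
up = np.zeros((h * stride, w * stride))
up[::stride, ::stride] = x                          # zero-inserted (upsampled) input

zero_fraction = 1.0 - (x.size / up.size)
print(f"upsampled size: {up.shape}, inserted zeros: {zero_fraction:.0%} of operand values")
# -> 75% of the values feeding the convolution are zeros for stride 2.
```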
