no code implementations • 17 Apr 2024 • Rachid Karami, Hemanth Kota, Sheng-Chun Kao, Hyoukjun Kwon
Therefore, significant effort has been put into studying and optimizing GEMM operators in order to speed up the execution of ML models.
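As a quick illustration of what the GEMM operator computes, here is a minimal NumPy sketch of C ← αAB + βC; the shapes and names are illustrative, not taken from the paper.

```python
import numpy as np

def gemm(A: np.ndarray, B: np.ndarray, C: np.ndarray,
         alpha: float = 1.0, beta: float = 1.0) -> np.ndarray:
    """General matrix multiply: C <- alpha * (A @ B) + beta * C.

    A is (M, K), B is (K, N), C is (M, N). Dense layers and attention
    projections in ML models commonly lower to this form.
    """
    return alpha * (A @ B) + beta * C

# Illustrative shapes only: an (M, K) x (K, N) product.
M, K, N = 128, 256, 64
A = np.random.rand(M, K).astype(np.float32)
B = np.random.rand(K, N).astype(np.float32)
C = np.zeros((M, N), dtype=np.float32)
out = gemm(A, B, C, alpha=1.0, beta=0.0)
```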
no code implementations • 14 Dec 2023 • Mohanad Odema, Hyoukjun Kwon, Mohammad Abdullah Al Faruque
To address the increasing compute demand from recent multi-model workloads with heavy models such as large language models, we propose deploying heterogeneous chiplet-based multi-chip module (MCM) accelerators.
no code implementations • 7 Dec 2022 • Seah Kim, Hyoukjun Kwon, Jinook Song, Jihyuck Jo, Yu-Hsin Chen, Liangzhen Lai, Vikas Chandra
Such dynamic behaviors introduce new challenges to ML system software, since the overall system load, unlike that of traditional ML workloads, is not fully predictable.
no code implementations • 16 Nov 2022 • Hyoukjun Kwon, Krishnakumar Nair, Jamin Seo, Jason Yik, Debabrata Mohapatra, Dongyuan Zhan, Jinook Song, Peter Capak, Peizhao Zhang, Peter Vajda, Colby Banbury, Mark Mazumder, Liangzhen Lai, Ashish Sirasao, Tushar Krishna, Harshit Khaitan, Vikas Chandra, Vijay Janapa Reddi
We hope that our work will stimulate research and lead to the development of a new generation of ML systems for XR use cases.
1 code implementation • CVPR 2022 • Jiaqi Gu, Hyoukjun Kwon, Dilin Wang, Wei Ye, Meng Li, Yu-Hsin Chen, Liangzhen Lai, Vikas Chandra, David Z. Pan
Therefore, we propose HRViT, which enhances ViTs to learn semantically rich and spatially precise multi-scale representations by integrating high-resolution multi-branch architectures with ViTs.
Ranked #24 on Semantic Segmentation on Cityscapes val
no code implementations • 19 Jun 2021 • Gordon E. Moon, Hyoukjun Kwon, Geonhwa Jeong, Prasanth Chatarasi, Sivasankaran Rajamanickam, Tushar Krishna
There is a growing interest in custom spatial accelerators for machine learning applications.
no code implementations • 18 Feb 2020 • Prasanth Chatarasi, Hyoukjun Kwon, Natesh Raina, Saurabh Malik, Vaisakh Haridas, Angshuman Parashar, Michael Pellauer, Tushar Krishna, Vivek Sarkar
Searching for the optimal mapping is challenging because the space of mappings is vast, and the challenge is exacerbated by new operators and diverse accelerator configurations. To address this, we propose a decoupled off-chip/on-chip approach that decomposes the mapping space into off-chip and on-chip subspaces and optimizes the off-chip subspace first, followed by the on-chip subspace.
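A toy sketch of the decoupled idea, assuming a simple GEMM mapping space and made-up cost proxies (the paper's actual analytical cost models are not reproduced here): the off-chip subspace (tiling) is searched first, and the on-chip subspace (loop order) is searched afterward, given that tiling choice.

```python
from itertools import permutations, product

# Illustrative GEMM problem dims and candidate tile sizes.
M, N, K = 512, 512, 512
TILE_CANDIDATES = [32, 64, 128]

def offchip_traffic(tm, tn, tk):
    # Placeholder proxy for DRAM traffic: operand bytes moved per tile
    # times the number of tiles. Not the paper's cost model.
    tiles = (M // tm) * (N // tn) * (K // tk)
    return tiles * (tm * tk + tk * tn + tm * tn)

def onchip_cost(order, tm, tn, tk):
    # Placeholder on-chip cost: pretends later loop positions reuse
    # buffers better. Purely illustrative.
    weights = {"m": tm, "n": tn, "k": tk}
    return sum(weights[d] * (i + 1) for i, d in enumerate(order))

# Phase 1: optimize the off-chip subspace in isolation.
best_tile = min(product(TILE_CANDIDATES, repeat=3),
                key=lambda t: offchip_traffic(*t))

# Phase 2: optimize the on-chip subspace given the off-chip choice.
orders = ["".join(p) for p in permutations("mnk")]
best_order = min(orders, key=lambda o: onchip_cost(o, *best_tile))

print("off-chip tiles:", best_tile, "on-chip order:", best_order)
```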
no code implementations • 10 Feb 2020 • Lei Yang, Zheyu Yan, Meng Li, Hyoukjun Kwon, Liangzhen Lai, Tushar Krishna, Vikas Chandra, Weiwen Jiang, Yiyu Shi
Neural Architecture Search (NAS) has demonstrated its power on various AI acceleration platforms such as Field Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs).
no code implementations • 13 Sep 2019 • Hyoukjun Kwon, Liangzhen Lai, Tushar Krishna, Vikas Chandra
The results suggest that HDAs form an alternative class of Pareto-optimal accelerators to RDAs, with particular strength in energy efficiency, which can make them a better choice than RDAs depending on the use case.
no code implementations • 4 May 2018 • Hyoukjun Kwon, Prasanth Chatarasi, Michael Pellauer, Angshuman Parashar, Vivek Sarkar, Tushar Krishna
The data partitioning and scheduling strategies used by DNN accelerators to leverage reuse and perform staging are known as dataflows, and they directly impact the performance and energy efficiency of DNN accelerator designs.
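To make the notion of a dataflow concrete, the sketch below contrasts two loop orders for the same GEMM, an output-stationary and a weight-stationary ordering; this is an illustrative software reduction, not the paper's formal dataflow specification (real accelerators also express spatial unrolling and explicit buffer staging).

```python
import numpy as np

M, N, K = 4, 4, 4
A = np.random.rand(M, K)
B = np.random.rand(K, N)

def output_stationary(A, B):
    # Each C[m, n] is fully accumulated before moving on: the partial
    # sum (the output) is the value that stays "stationary".
    C = np.zeros((M, N))
    for m in range(M):
        for n in range(N):
            for k in range(K):
                C[m, n] += A[m, k] * B[k, n]
    return C

def weight_stationary(A, B):
    # Each B[k, n] (the "weight") is held fixed while it is reused
    # across all rows of A; partial sums are revisited instead.
    C = np.zeros((M, N))
    for k in range(K):
        for n in range(N):
            for m in range(M):
                C[m, n] += A[m, k] * B[k, n]
    return C

# Both orderings compute the same result; they differ only in which
# operand is reused in the inner loops, i.e., in their dataflow.
assert np.allclose(output_stationary(A, B), weight_stationary(A, B))
```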