Search Results for author: Hyesoon Kim

Found 8 papers, 0 papers with code

Hydro: Adaptive Query Processing of ML Queries

no code implementations • 22 Mar 2024 • Gaurav Tarlok Kakkar, Jiashen Cao, Aubhro Sengupta, Joy Arulraj, Hyesoon Kim

Second, the optimal query plan for ML queries is data-dependent, requiring DBMSs to adapt the query plan on the fly during execution.
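
The data-dependent adaptation described here can be pictured with a small sketch (hypothetical Python, not Hydro's actual code): filters are reordered mid-stream based on the cost and selectivity observed so far, so cheap, highly selective predicates run first.

    import time

    def adaptive_filter(rows, predicates, reorder_every=1000):
        """Apply predicates to each row, periodically reordering them by
        observed time spent per dropped row (lower is better)."""
        stats = [{"time": 1e-9, "dropped": 0} for _ in predicates]
        order = list(range(len(predicates)))
        for i, row in enumerate(rows):
            keep = True
            for j in order:
                t0 = time.perf_counter()
                ok = predicates[j](row)
                stats[j]["time"] += time.perf_counter() - t0
                if not ok:
                    stats[j]["dropped"] += 1
                    keep = False
                    break
            if keep:
                yield row
            if (i + 1) % reorder_every == 0:
                # re-rank predicates: cheapest cost per dropped row first
                order.sort(key=lambda j: stats[j]["time"] / max(stats[j]["dropped"], 1))

Ranking by time spent per dropped row is the classic adaptive-reordering heuristic; it is shown only to illustrate the kind of on-the-fly plan change the abstract refers to.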

VEGETA: Vertically-Integrated Extensions for Sparse/Dense GEMM Tile Acceleration on CPUs

no code implementations • 17 Feb 2023 • Geonhwa Jeong, Sana Damani, Abhimanyu Rajeshkumar Bambhaniya, Eric Qin, Christopher J. Hughes, Sreenivas Subramoney, Hyesoon Kim, Tushar Krishna

Therefore, as DL workloads embrace sparsity to reduce the computation and memory footprint of models, it is also imperative for CPUs to add support for sparsity to avoid under-utilizing the dense matrix engine and making inefficient use of the caches and registers.
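
To see why skipping zeros matters, here is a rough back-of-the-envelope comparison (a sketch assuming NumPy and SciPy; VEGETA's tile extensions are hardware features, not library calls):

    import numpy as np
    from scipy.sparse import random as sparse_random

    n = 512
    A = sparse_random(n, n, density=0.1, format="csr")  # 90% of entries are zero
    B = np.random.rand(n, n)

    dense_macs = n * n * n      # a dense engine pays for every zero
    sparse_macs = A.nnz * n     # a sparse kernel touches only the nonzeros
    C = A @ B                   # SciPy dispatches a CSR kernel here
    print(f"dense MACs: {dense_macs:,}  sparse MACs: {sparse_macs:,}")

At 10% density the sparse path does roughly a tenth of the multiply-accumulates, which is the under-utilization gap the paper targets in the matrix engine itself.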

RASA: Efficient Register-Aware Systolic Array Matrix Engine for CPU

no code implementations • 5 Oct 2021 • Geonhwa Jeong, Eric Qin, Ananda Samajdar, Christopher J. Hughes, Sreenivas Subramoney, Hyesoon Kim, Tushar Krishna

As AI-based applications become pervasive, CPU vendors are starting to incorporate matrix engines within the datapath to boost efficiency.
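
A matrix engine of this kind is typically a systolic array. The toy simulation below is illustrative only (it models none of RASA's register-aware pipelining): operands are skewed in time so that each processing element accumulates exactly one output element as data streams past.

    import numpy as np

    def systolic_matmul(A, B):
        """Output-stationary systolic array: at cycle t, PE (i, j) consumes
        A[i, k] from the left and B[k, j] from above, where k = t - i - j."""
        n = A.shape[0]
        C = np.zeros((n, n))
        for t in range(3 * n - 2):          # total cycles with skewed injection
            for i in range(n):
                for j in range(n):
                    k = t - i - j
                    if 0 <= k < n:
                        C[i, j] += A[i, k] * B[k, j]
        return C

    A = np.random.rand(4, 4); B = np.random.rand(4, 4)
    assert np.allclose(systolic_matmul(A, B), A @ B)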

Reducing Inference Latency with Concurrent Architectures for Image Recognition

no code implementations • 13 Nov 2020 • Ramyad Hadidi, Jiashen Cao, Michael S. Ryoo, Hyesoon Kim

Satisfying the high computation demands of modern deep learning architectures while achieving low inference latency is challenging.

Neural Architecture Search
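
Concurrent architectures expose branch-level parallelism that sequential models lack. A minimal sketch of that idea, using plain NumPy and threads rather than the paper's method:

    import numpy as np
    from concurrent.futures import ThreadPoolExecutor

    def branch(W, x):
        return np.maximum(W @ x, 0.0)   # one ReLU layer standing in for a branch

    rng = np.random.default_rng(0)
    x = rng.standard_normal(512)
    weights = [rng.standard_normal((512, 512)) for _ in range(4)]

    # independent branches overlap in time; only the merge point synchronizes
    with ThreadPoolExecutor(max_workers=4) as pool:
        outputs = list(pool.map(lambda W: branch(W, x), weights))
    y = np.concatenate(outputs)         # merge point

NumPy's matrix multiply releases the GIL, so the four branches genuinely run concurrently; latency is set by the slowest branch rather than the sum of all of them.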

LCP: A Low-Communication Parallelization Method for Fast Neural Network Inference in Image Recognition

no code implementations • 13 Mar 2020 • Ramyad Hadidi, Bahar Asgari, Jiashen Cao, Younmin Bae, Da Eun Shim, Hyojong Kim, Sung-Kyu Lim, Michael S. Ryoo, Hyesoon Kim

To benefit from the available compute resources, we propose the first DNN parallelization method designed to reduce communication overhead in a distributed system.

Quantization
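
One way to picture low-communication parallelization (a simplified sketch, not LCP's actual partitioning scheme): split a layer by output channels so each device computes its slice independently, and only the small output slices, never the weights, cross the network.

    import numpy as np

    def partitioned_matmul(W, x, n_devices):
        """Each 'device' holds a row-slice of W and computes its share of
        W @ x; the only traffic is gathering the output slices."""
        slices = np.array_split(W, n_devices, axis=0)
        partial = [w @ x for w in slices]   # runs independently per device
        return np.concatenate(partial)

    W = np.random.rand(1024, 512)
    x = np.random.rand(512)
    assert np.allclose(partitioned_matmul(W, x, 4), W @ x)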

A Case Study: Exploiting Neural Machine Translation to Translate CUDA to OpenCL

no code implementations • 18 May 2019 • Yonghae Kim, Hyesoon Kim

The sequence-to-sequence (seq2seq) model for neural machine translation has significantly improved the accuracy of language translation.

Machine Translation • Translation
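
For flavor, these are the kinds of token correspondences such a translator has to capture. The lookup table below is hand-written purely for illustration, whereas the paper learns the mapping from data with a seq2seq model:

    # Hand-written CUDA -> OpenCL correspondences, shown only to illustrate
    # what a seq2seq model must learn; real translation is far less local.
    CUDA_TO_OPENCL = {
        "__global__": "__kernel",
        "__shared__": "__local",
        "__syncthreads()": "barrier(CLK_LOCAL_MEM_FENCE)",
        "threadIdx.x": "get_local_id(0)",
        "blockIdx.x": "get_group_id(0)",
        "blockDim.x": "get_local_size(0)",
    }

    def translate(cuda_src):
        # longest-first replacement so longer tokens win over their prefixes
        for k in sorted(CUDA_TO_OPENCL, key=len, reverse=True):
            cuda_src = cuda_src.replace(k, CUDA_TO_OPENCL[k])
        return cuda_src

    print(translate("__global__ void add(float *a) { a[threadIdx.x] += 1; }"))
    # -> __kernel void add(float *a) { a[get_local_id(0)] += 1; }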

Collaborative Execution of Deep Neural Networks on Internet of Things Devices

no code implementations • 8 Jan 2019 • Ramyad Hadidi, Jiashen Cao, Michael S. Ryoo, Hyesoon Kim

In this paper, we propose an approach that utilizes the aggregated computing power of existing Internet of Things (IoT) devices in an environment by creating a collaborative network.
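
A minimal sketch of the underlying idea, layer-wise partitioning of a DNN across devices (the paper's actual distribution strategy is more elaborate):

    import numpy as np

    class Device:
        """Stand-in for one IoT node hosting a contiguous slice of layers."""
        def __init__(self, layers):
            self.layers = layers
        def forward(self, x):
            for W in self.layers:
                x = np.maximum(W @ x, 0.0)
            return x  # only this activation crosses the network

    rng = np.random.default_rng(0)
    layers = [rng.standard_normal((64, 64)) for _ in range(6)]
    devices = [Device(layers[0:2]), Device(layers[2:4]), Device(layers[4:6])]

    x = rng.standard_normal(64)
    for d in devices:       # each hop ships a 64-float activation,
        x = d.forward(x)    # never the layer weights themselves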
