Search Results for author: Christos Kozyrakis

Found 9 papers, 2 papers with code

cedar: Composable and Optimized Machine Learning Input Data Pipelines

no code implementations • 17 Jan 2024 • Mark Zhao, Emanuel Adamiak, Christos Kozyrakis

The input data pipeline is an essential component of each machine learning (ML) training job.

Efficiently Programming Large Language Models using SGLang

1 code implementation • 12 Dec 2023 • Lianmin Zheng, Liangsheng Yin, Zhiqiang Xie, Jeff Huang, Chuyue Sun, Cody Hao Yu, Shiyi Cao, Christos Kozyrakis, Ion Stoica, Joseph E. Gonzalez, Clark Barrett, Ying Sheng

SGLang is designed for the efficient programming of LLMs and incorporates primitives for common LLM programming patterns.

FlexShard: Flexible Sharding for Industry-Scale Sequence Recommendation Models

no code implementations • 8 Jan 2023 • Geet Sethi, Pallab Bhattacharya, Dhruv Choudhary, Carole-Jean Wu, Christos Kozyrakis

Sequence-based deep learning recommendation models (DLRMs) are an emerging class of DLRMs that capture users' long-term interests far better than their earlier sum-pooling-based counterparts.

RecD: Deduplication for End-to-End Deep Learning Recommendation Model Training Infrastructure

no code implementations • 9 Nov 2022 • Mark Zhao, Dhruv Choudhary, Devashish Tyagi, Ajay Somani, Max Kaplan, Sung-Han Lin, Sarunya Pumma, Jongsoo Park, Aarti Basant, Niket Agarwal, Carole-Jean Wu, Christos Kozyrakis

RecD addresses immense storage, preprocessing, and training overheads caused by feature duplication inherent in industry-scale DLRM training datasets.

RecShard: Statistical Feature-Based Memory Optimization for Industry-Scale Neural Recommendation

no code implementations • 25 Jan 2022 • Geet Sethi, Bilge Acun, Niket Agarwal, Christos Kozyrakis, Caroline Trippel, Carole-Jean Wu

Embedding tables (EMBs) exhibit distinct memory characteristics, providing performance optimization opportunities for intelligent EMB partitioning and placement across a tiered memory hierarchy.

INFaaS: A Model-less and Managed Inference Serving System

1 code implementation • 30 May 2019 • Francisco Romero, Qian Li, Neeraja J. Yadwadkar, Christos Kozyrakis

This paper introduces INFaaS, a managed and model-less system for distributed inference serving, where developers simply specify the performance and accuracy requirements for their applications without needing to choose a model variant for each query.

Model Selection

DNN Dataflow Choice Is Overrated

no code implementations • 10 Sep 2018 • Xuan Yang, Mingyu Gao, Jing Pu, Ankita Nayak, Qiaoyi Liu, Steven Emberton Bell, Jeff Ou Setter, Kaidi Cao, Heonjae Ha, Christos Kozyrakis, Mark Horowitz

Many DNN accelerators have been proposed and built using different microarchitectures and program mappings.

Distributed, Parallel, and Cluster Computing
