Search Results for author: Christos Kozyrakis

Found 13 papers, 3 papers with code

AI Metropolis: Scaling Large Language Model-based Multi-Agent Simulation with Out-of-order Execution

no code implementations • 5 Nov 2024 • Zhiqiang Xie, Hao Kang, Ying Sheng, Tushar Krishna, Kayvon Fatahalian, Christos Kozyrakis

With more advanced natural language understanding and reasoning capabilities, large language model (LLM)-powered agents are increasingly developed in simulated environments to perform complex tasks, interact with other agents, and exhibit emergent behaviors relevant to social science and gaming.

Language Modeling, Language Modelling, +3 more

Cloud Atlas: Efficient Fault Localization for Cloud Systems using Language Models and Causal Insight

no code implementations • 11 Jul 2024 • Zhiqiang Xie, Yujia Zheng, Lizi Ottens, Kun Zhang, Christos Kozyrakis, Jonathan Mace

We evaluate Atlas across a range of fault localization scenarios and demonstrate that Atlas is capable of generating causal graphs in a scalable and generalizable manner, with performance that far surpasses that of data-driven algorithms and is commensurate to the ground-truth baseline.

Causal Discovery, Fault Localization

ReCycle: Resilient Training of Large DNNs using Pipeline Adaptation

no code implementations • 22 May 2024 • Swapnil Gandhi, Mark Zhao, Athinagoras Skiadopoulos, Christos Kozyrakis

We describe a prototype for ReCycle and show that it achieves high training throughput under multiple failures, outperforming recent proposals for fault-tolerant training such as Oobleck and Bamboo by up to $1.46\times$ and $1.64\times$, respectively.

cedar: Optimized and Unified Machine Learning Input Data Pipelines

1 code implementation • 17 Jan 2024 • Mark Zhao, Emanuel Adamiak, Christos Kozyrakis

Across eight pipelines, cedar improves performance by up to 1.87x to 10.65x compared to state-of-the-art input data systems.

FlexShard: Flexible Sharding for Industry-Scale Sequence Recommendation Models

no code implementations • 8 Jan 2023 • Geet Sethi, Pallab Bhattacharya, Dhruv Choudhary, Carole-Jean Wu, Christos Kozyrakis

Sequence-based deep learning recommendation models (DLRMs) are an emerging class of DLRMs showing great improvements over their prior sum-pooling-based counterparts at capturing users' long-term interests.

RecD: Deduplication for End-to-End Deep Learning Recommendation Model Training Infrastructure

no code implementations • 9 Nov 2022 • Mark Zhao, Dhruv Choudhary, Devashish Tyagi, Ajay Somani, Max Kaplan, Sung-Han Lin, Sarunya Pumma, Jongsoo Park, Aarti Basant, Niket Agarwal, Carole-Jean Wu, Christos Kozyrakis

RecD addresses immense storage, preprocessing, and training overheads caused by feature duplication inherent in industry-scale DLRM training datasets.

RecShard: Statistical Feature-Based Memory Optimization for Industry-Scale Neural Recommendation

no code implementations • 25 Jan 2022 • Geet Sethi, Bilge Acun, Niket Agarwal, Christos Kozyrakis, Caroline Trippel, Carole-Jean Wu

Embedding tables (EMBs) exhibit distinct memory characteristics, providing performance optimization opportunities for intelligent EMB partitioning and placement across a tiered memory hierarchy.

INFaaS: A Model-less and Managed Inference Serving System

1 code implementation • 30 May 2019 • Francisco Romero, Qian Li, Neeraja J. Yadwadkar, Christos Kozyrakis

This paper introduces INFaaS, a managed and model-less system for distributed inference serving, where developers simply specify the performance and accuracy requirements for their applications without needing to specify a specific model-variant for each query.

Model Selection

DNN Dataflow Choice Is Overrated

no code implementations • 10 Sep 2018 • Xuan Yang, Mingyu Gao, Jing Pu, Ankita Nayak, Qiaoyi Liu, Steven Emberton Bell, Jeff Ou Setter, Kaidi Cao, Heonjae Ha, Christos Kozyrakis, Mark Horowitz

Many DNN accelerators have been proposed and built using different microarchitectures and program mappings.

Distributed, Parallel, and Cluster Computing
