Search Results for author: Arvind Krishnamurthy

Found 15 papers, 5 papers with code

Fast Video Classification via Adaptive Cascading of Deep Models

no code implementations • CVPR 2017 • Haichen Shen, Seungyeop Han, Matthai Philipose, Arvind Krishnamurthy

Recent advances have enabled "oracle" classifiers that can classify across many classes and input distributions with high accuracy without retraining.

Classification Decision Making +2

TVM: An Automated End-to-End Optimizing Compiler for Deep Learning

1 code implementation • 12 Feb 2018 • Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy

Experimental results show that TVM delivers performance across hardware back-ends that are competitive with state-of-the-art, hand-tuned libraries for low-power CPU, mobile GPU, and server-class GPUs.

Learning to Optimize Tensor Programs

no code implementations • NeurIPS 2018 • Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy

Efficient implementations of tensor operators, such as matrix multiplication and high dimensional convolution, are key enablers of effective deep learning systems.

Parameter Hub: a Rack-Scale Parameter Server for Distributed Deep Neural Network Training

no code implementations • 21 May 2018 • Liang Luo, Jacob Nelson, Luis Ceze, Amar Phanishayee, Arvind Krishnamurthy

Distributed deep neural network (DDNN) training constitutes an increasingly important workload that frequently runs in the cloud.

A Hardware-Software Blueprint for Flexible Deep Learning Specialization

no code implementations • 11 Jul 2018 • Thierry Moreau, Tianqi Chen, Luis Vega, Jared Roesch, Eddie Yan, Lianmin Zheng, Josh Fromm, Ziheng Jiang, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy

Specialized Deep Learning (DL) acceleration stacks, designed for a specific set of frameworks, model architectures, operators, and data types, offer the allure of high performance while sacrificing flexibility.

Code Generation Style Transfer

ADARES: Adaptive Resource Management for Virtual Machines

no code implementations • 5 Dec 2018 • Ignacio Cano, Lequn Chen, Pedro Fonseca, Tianqi Chen, Chern Cheah, Karan Gupta, Ramesh Chandra, Arvind Krishnamurthy

Our large-scale analysis confirms that VMs are often misconfigured, either overprovisioned or underprovisioned, and that this problem is pervasive across a wide range of private clusters.

Management Multi-Armed Bandits +1

Scaling Distributed Machine Learning with In-Network Aggregation

2 code implementations • 22 Feb 2019 • Amedeo Sapio, Marco Canini, Chen-Yu Ho, Jacob Nelson, Panos Kalnis, Changhoon Kim, Arvind Krishnamurthy, Masoud Moshref, Dan R. K. Ports, Peter Richtárik

Training machine learning models in parallel is an increasingly important workload.

BIG-bench Machine Learning

AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly

1 code implementation • ICLR 2021 • Yuchen Jin, Tianyi Zhou, Liangyu Zhao, Yibo Zhu, Chuanxiong Guo, Marco Canini, Arvind Krishnamurthy

This mutual-training process between BO and the loss-prediction model allows us to limit the training steps invested in the BO search.

Image Classification Machine Translation +1

Cloud Collectives: Towards Cloud-aware Collectives for ML Workloads with Rank Reordering

no code implementations • 28 May 2021 • Liang Luo, Jacob Nelson, Arvind Krishnamurthy, Luis Ceze

ML workloads are becoming increasingly popular in the cloud.

Efficient Direct-Connect Topologies for Collective Communications

no code implementations • 7 Feb 2022 • Liangyu Zhao, Siddharth Pal, Tapan Chugh, Weiyang Wang, Jason Fantl, Prithwish Basu, Joud Khoury, Arvind Krishnamurthy

Our algorithms start from small, optimal base topologies and associated communication schedules and use a set of techniques that can be iteratively applied to derive much larger topologies and schedules.

Bandwidth Optimal Pipeline Schedule for Collective Communication

no code implementations • 29 May 2023 • Liangyu Zhao, Arvind Krishnamurthy

We present a strongly polynomial-time algorithm to generate bandwidth optimal allgather/reduce-scatter on any network topology, with or without switches.

Symphony: Optimized DNN Model Serving using Deferred Batch Scheduling

no code implementations • 14 Aug 2023 • Lequn Chen, Weixin Deng, Anirudh Canumalla, Yu Xin, Danyang Zhuo, Matthai Philipose, Arvind Krishnamurthy

However, existing model serving systems cannot achieve adequate batch sizes while meeting latency objectives as these systems eagerly dispatch requests to accelerators to minimize the accelerator idle time.

Scheduling

Punica: Multi-Tenant LoRA Serving

1 code implementation • 28 Oct 2023 • Lequn Chen, Zihao Ye, Yongji Wu, Danyang Zhuo, Luis Ceze, Arvind Krishnamurthy

Our scheduler consolidates multi-tenant LoRA serving workloads in a shared GPU cluster.

Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

1 code implementation • 29 Oct 2023 • Yilong Zhao, Chien-Yu Lin, Kan Zhu, Zihao Ye, Lequn Chen, Size Zheng, Luis Ceze, Arvind Krishnamurthy, Tianqi Chen, Baris Kasikci

To maximize LLMs' serving throughput, we introduce Atom, a low-bit quantization method that achieves high throughput improvements with negligible accuracy loss.

Quantization Sentiment Analysis

ForestColl: Efficient Collective Communications on Heterogeneous Network Fabrics

no code implementations • 9 Feb 2024 • Liangyu Zhao, Saeed Maleki, Ziyue Yang, Hossein Pourreza, Aashaka Shah, Changho Hwang, Arvind Krishnamurthy

ForestColl also outperforms other state-of-the-art schedule generation techniques with both up to 61% more efficient generated schedules and orders of magnitude faster schedule generation speed.
