Search Results for author: Lequn Chen

Found 7 papers, 2 papers with code

Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

1 code implementation · 29 Oct 2023 · Yilong Zhao, Chien-Yu Lin, Kan Zhu, Zihao Ye, Lequn Chen, Size Zheng, Luis Ceze, Arvind Krishnamurthy, Tianqi Chen, Baris Kasikci

To maximize LLMs' serving throughput, we introduce Atom, a low-bit quantization method that achieves high throughput improvements with negligible accuracy loss.

Quantization, Sentiment Analysis

Punica: Multi-Tenant LoRA Serving

1 code implementation · 28 Oct 2023 · Lequn Chen, Zihao Ye, Yongji Wu, Danyang Zhuo, Luis Ceze, Arvind Krishnamurthy

Our scheduler consolidates multi-tenant LoRA serving workloads in a shared GPU cluster.

Symphony: Optimized DNN Model Serving using Deferred Batch Scheduling

no code implementations · 14 Aug 2023 · Lequn Chen, Weixin Deng, Anirudh Canumalla, Yu Xin, Danyang Zhuo, Matthai Philipose, Arvind Krishnamurthy

However, existing model serving systems cannot achieve adequate batch sizes while meeting latency objectives, because these systems eagerly dispatch requests to accelerators to minimize accelerator idle time.

Scheduling

Multimodal sensor fusion for real-time location-dependent defect detection in laser-directed energy deposition

no code implementations · 23 May 2023 · Lequn Chen, Xiling Yao, Wenhe Feng, Youxiang Chew, Seung Ki Moon

The traditional in-situ monitoring approach uses a single sensor (i.e., an acoustic, visual, or thermal sensor) to capture the complex dynamic behavior of the process, which is insufficient for defect detection with high accuracy and robustness.

Defect Detection, Sensor Fusion

Multisensor fusion-based digital twin in additive manufacturing for in-situ quality monitoring and defect correction

no code implementations · 12 Apr 2023 · Lequn Chen, Xiling Yao, Kui Liu, Chaolin Tan, Seung Ki Moon

Early detection and correction of defects are critical in additive manufacturing (AM) to avoid build failures.

ADARES: Adaptive Resource Management for Virtual Machines

no code implementations · 5 Dec 2018 · Ignacio Cano, Lequn Chen, Pedro Fonseca, Tianqi Chen, Chern Cheah, Karan Gupta, Ramesh Chandra, Arvind Krishnamurthy

Our large-scale analysis confirms that VMs are often misconfigured, either overprovisioned or underprovisioned, and that this problem is pervasive across a wide range of private clusters.

Management Multi-Armed Bandits +1
