Search Results for author: Jiarong Xing

Found 4 papers, 1 paper with code

BlendServe: Optimizing Offline Inference for Auto-regressive Large Models with Resource-aware Batching

no code implementations • 25 Nov 2024 • Yilong Zhao, Shuo Yang, Kan Zhu, Lianmin Zheng, Baris Kasikci, Yang Zhou, Jiarong Xing, Ion Stoica

Offline batch inference, which leverages the flexibility of request batching to achieve higher throughput and lower costs, is becoming more popular for latency-insensitive applications.

Symbolic Distillation for Learned TCP Congestion Control

1 code implementation • 24 Oct 2022 • S P Sharan, Wenqing Zheng, Kuo-Feng Hsu, Jiarong Xing, Ang Chen, Zhangyang Wang

At the core of our proposal is a novel symbolic branching algorithm that enables the rule to be aware of the context in terms of various network conditions, eventually converting the NN policy into a symbolic tree.

Deep Reinforcement Learning, Reinforcement Learning (RL)

Bolt: Bridging the Gap between Auto-tuners and Hardware-native Performance

no code implementations • 25 Oct 2021 • Jiarong Xing, Leyuan Wang, Shang Zhang, Jack Chen, Ang Chen, Yibo Zhu

Today's auto-tuners (e.g., AutoTVM, Ansor) generate efficient tensor programs by navigating a large search space to identify effective implementations, but they do so with opaque hardware details.
