Search Results for author: Matthew Lentz

Found 3 papers, 0 papers with code

Adaptive Skeleton Graph Decoding

no code implementations • 19 Feb 2024 • Shuowei Jin, Yongji Wu, Haizhong Zheng, Qingzhao Zhang, Matthew Lentz, Z. Morley Mao, Atul Prakash, Feng Qian, Danyang Zhuo

Large language models (LLMs) have seen significant adoption for natural language tasks, owing their success to massive numbers of model parameters (e.g., 70B+); however, LLM inference incurs significant computation and memory costs.

Computing in the Era of Large Generative Models: From Cloud-Native to AI-Native

no code implementations • 17 Jan 2024 • Yao Lu, Song Bian, Lequn Chen, Yongjun He, Yulong Hui, Matthew Lentz, Beibin Li, Fei Liu, Jialin Li, Qi Liu, Rui Liu, Xiaoxuan Liu, Lin Ma, Kexin Rong, Jianguo Wang, Yingjun Wu, Yongji Wu, Huanchen Zhang, Minjia Zhang, Qizhen Zhang, Tianyi Zhou, Danyang Zhuo

In this paper, we investigate the intersection of large generative AI models and cloud-native computing architectures.

Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures

no code implementations • 10 May 2022 • Yongji Wu, Matthew Lentz, Danyang Zhuo, Yao Lu

With the advent of ubiquitous deployment of smart devices and the Internet of Things, data sources for machine learning inference have increasingly moved to the edge of the network.

Tasks: AutoML, BIG-bench Machine Learning (+5 more)
