Search Results for author: Jiayi Yao

Found 4 papers, 4 papers with code

White-box Compiler Fuzzing Empowered by Large Language Models

1 code implementation · 24 Oct 2023 · Chenyuan Yang, Yinlin Deng, Runyu Lu, Jiayi Yao, Jiawei Liu, Reyhaneh Jabbarvand, Lingming Zhang

Nonetheless, prompting LLMs with compiler source-code information remains a missing piece of research in compiler testing.

Code Generation · Compiler Optimization

CacheGen: KV Cache Compression and Streaming for Fast Language Model Serving

1 code implementation · 11 Oct 2023 · YuHan Liu, Hanchen Li, Yihua Cheng, Siddhant Ray, YuYang Huang, Qizheng Zhang, Kuntai Du, Jiayi Yao, Shan Lu, Ganesh Ananthanarayanan, Michael Maire, Henry Hoffmann, Ari Holtzman, Junchen Jiang

Compared to the recent systems that reuse the KV cache, CacheGen reduces the KV cache size by 3.7-4.3x and the total delay in fetching and processing contexts by 2.7-3.2x while having negligible impact on the LLM response quality in accuracy or perplexity.

Language Modelling · Quantization
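The size reduction above comes from lossy compression of the KV cache. As a rough intuition for how quantization alone shrinks such a tensor (this is a hypothetical illustration, not CacheGen's actual codec, which is more sophisticated), consider uniformly quantizing a float32 KV-cache-shaped array down to 8-bit codes:

```python
import numpy as np

# Hypothetical sketch: uniform 8-bit quantization of a KV-cache-shaped
# tensor. CacheGen's real encoding is more elaborate; this only shows
# where a ~4x size reduction from float32 -> uint8 comes from.
def quantize(x: np.ndarray, bits: int = 8):
    """Uniformly quantize x to `bits` bits; return codes and dequant params."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / (2**bits - 1)
    codes = np.round((x - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def dequantize(codes: np.ndarray, lo: float, scale: float) -> np.ndarray:
    return codes.astype(np.float32) * scale + lo

# A toy "KV cache": (layers, tokens, heads * head_dim)
kv = np.random.randn(2, 128, 256).astype(np.float32)
codes, lo, scale = quantize(kv)

ratio = kv.nbytes / codes.nbytes  # 4x, since uint8 is a quarter of float32
err = float(np.abs(dequantize(codes, lo, scale) - kv).max())
print(f"compression ratio: {ratio:.1f}x, max abs error: {err:.4f}")
```

The reconstruction error is bounded by half a quantization step, which is why aggressive bit-width reduction can leave response quality largely intact.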

A pruning method based on the dissimilarity of angle among channels and filters

1 code implementation · 29 Oct 2022 · Jiayi Yao, Ping Li, Xiatao Kang, Yuzhe Wang

First, we train a sparse model with a GL penalty and impose an angle-dissimilarity constraint on the channels and filters of the convolutional network to obtain a sparser structure.

Network Pruning
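The intuition behind angle-based dissimilarity can be sketched as follows (a hypothetical illustration of the general idea, with details assumed, not the paper's exact formulation): flatten each convolutional filter to a vector and compute pairwise angles; filters that point in nearly the same direction are largely redundant and are natural pruning candidates.

```python
import numpy as np

# Hypothetical sketch: pairwise angles between convolutional filters.
# Small off-diagonal angles flag near-duplicate (redundant) filters.
def filter_angles(weights: np.ndarray) -> np.ndarray:
    """weights: (num_filters, in_ch, kh, kw) -> pairwise angles in radians."""
    flat = weights.reshape(weights.shape[0], -1)
    unit = flat / np.linalg.norm(flat, axis=1, keepdims=True)
    cos = np.clip(unit @ unit.T, -1.0, 1.0)  # clip guards arccos domain
    return np.arccos(cos)

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 16, 3, 3))
w[1] = w[0] + 0.01 * rng.standard_normal((16, 3, 3))  # near-duplicate filter

angles = filter_angles(w)
# Mask the zero diagonal, then find the most similar (smallest-angle) pair.
i, j = divmod(int(np.argmin(angles + np.eye(8) * np.pi)), 8)
print(sorted((i, j)))  # the planted near-duplicate pair
```

Unrelated random filters in high dimensions sit near 90 degrees apart, so the planted near-duplicate pair stands out with a near-zero angle.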

Neural Network Panning: Screening the Optimal Sparse Network Before Training

1 code implementation · 27 Sep 2022 · Xiatao Kang, Ping Li, Jiayi Yao, Chengxi Li

Pruning neural networks before training not only compresses the original models but also accelerates the training phase, which has substantial application value.

Network Pruning · Scheduling
