Search Results for author: Haimeng Ren

Found 2 papers, 0 papers with code

COMET: Towards Partical W4A4KV4 LLMs Serving

no code implementations • 16 Oct 2024 • Lian Liu, Haimeng Ren, Long Cheng, Zhaohui Xu, Yudong Pan, Mengdi Wang, Xiaowei Li, Yinhe Han, Ying Wang

We integrate the optimized W4Ax kernel into our inference framework, COMET, and provide efficient management to support popular LLMs such as LLaMA-3-70B.

Quantization, Scheduling

ChipGPT: How far are we from natural language hardware design

no code implementations • 23 May 2023 • Kaiyan Chang, Ying Wang, Haimeng Ren, Mengdi Wang, Shengwen Liang, Yinhe Han, Huawei Li, Xiaowei Li

As large language models (LLMs) like ChatGPT have exhibited unprecedented machine intelligence, they also show great performance in assisting hardware engineers to realize higher-efficiency logic design via natural language interaction.
