Search Results for author: Jiebin Zhang

Found 3 papers, 2 papers with code

More Tokens, Lower Precision: Towards the Optimal Token-Precision Trade-off in KV Cache Compression

no code implementations 17 Dec 2024 Jiebin Zhang, Dawei Zhu, YiFan Song, Wenhao Wu, Chuqiao Kuang, Xiaoguang Li, Lifeng Shang, Qun Liu, Sujian Li

As large language models (LLMs) process increasing context windows, the memory usage of KV cache has become a critical bottleneck during inference.

Quantization

WIKIGENBENCH: Exploring Full-length Wikipedia Generation under Real-World Scenario

1 code implementation 28 Feb 2024 Jiebin Zhang, Eugene J. Yu, Qinyu Chen, Chenhao Xiong, Dawei Zhu, Han Qian, Mingbo Song, Weimin Xiong, Xiaoguang Li, Qun Liu, Sujian Li

It presents significant challenges to generate comprehensive and accurate Wikipedia articles for newly emerging events under a real-world scenario.

RAG, Retrieval

ConFiguRe: Exploring Discourse-level Chinese Figures of Speech

1 code implementation COLING 2022 Dawei Zhu, Qiusi Zhan, Zhejian Zhou, YiFan Song, Jiebin Zhang, Sujian Li

Different from previous token-level or sentence-level counterparts, ConFiguRe aims at extracting a figurative unit from discourse-level context, and classifying the figurative unit into the right figure type.

Natural Language Understanding, Sentence
