2 code implementations • 10 Apr 2024 • Jie Ou, Yueming Chen, Wenhong Tian
While Large Language Models (LLMs) have shown remarkable abilities, they are hindered by significant resource consumption and considerable latency due to autoregressive processing.