1 code implementation • 8 Mar 2024 • Hao Kang, Qingru Zhang, Souvik Kundu, Geonhwa Jeong, Zaoxing Liu, Tushar Krishna, Tuo Zhao
Key-value (KV) caching has become the de facto technique for accelerating generation in large language model (LLM) inference.
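The idea behind KV caching: in autoregressive decoding, each step's attention needs the keys and values of all previous tokens, so storing them once avoids recomputing them at every step. A minimal single-head sketch (all names here are illustrative, not from the paper):

```python
import math

def attend(query, keys, values):
    # Scaled dot-product attention of one query over all cached positions.
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(len(query))
              for key in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def decode_with_cache(steps):
    # steps: list of (query, key, value) vectors, one per decoding step.
    kv_cache = {"keys": [], "values": []}
    outputs = []
    for query, key, value in steps:
        # Append this step's key/value once; reuse all earlier entries
        # instead of recomputing them from scratch each step.
        kv_cache["keys"].append(key)
        kv_cache["values"].append(value)
        outputs.append(attend(query, kv_cache["keys"], kv_cache["values"]))
    return outputs
```

The cache grows linearly with sequence length, which is why its memory footprint becomes a bottleneck for long contexts and motivates compression work such as the paper above.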
no code implementations • 5 Nov 2019 • Zaoxing Liu, Tian Li, Virginia Smith, Vyas Sekar
Federated learning methods run training tasks directly on user devices and do not share the raw user data with third parties.
no code implementations • 3 Nov 2019 • Tian Li, Zaoxing Liu, Vyas Sekar, Virginia Smith
Many existing works treat these concerns separately.