no code implementations • 24 Apr 2024 • Haifeng Qian, Sujan Kumar Gonugondla, Sungsoo Ha, Mingyue Shang, Sanjay Krishna Gouda, Ramesh Nallapati, Sudipta Sengupta, Xiaofei Ma, Anoop Deoras
Speculative decoding has emerged as a powerful method to improve latency and throughput in hosting large language models.
no code implementations • 6 Aug 2018 • Sungsoo Ha, Klaus Mueller
We tested the proposed method with two clinical datasets that were both obtained during spine surgery.