Search Results for author: Victor Ruhle

Found 2 papers, 0 papers with code

Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing

no code implementations · 22 Apr 2024 · Dujian Ding, Ankur Mallick, Chi Wang, Robert Sim, Subhabrata Mukherjee, Victor Ruhle, Laks V. S. Lakshmanan, Ahmed Hassan Awadallah

Large language models (LLMs) excel at most NLP tasks but require expensive cloud servers for deployment due to their size, while smaller models that can be deployed on lower-cost (e.g., edge) devices tend to lag behind in response quality.
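The abstract suggests routing each query to either a small or a large model based on predicted answer quality. A minimal sketch of such a router is below; the `HybridRouter` class, the quality predictor, and the threshold value are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of quality-aware query routing between a small
# (edge) model and a large (cloud) model. All names and the routing
# rule here are assumptions for illustration only.
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass
class HybridRouter:
    # Predicts the probability that the small model's answer is acceptable.
    quality_predictor: Callable[[str], float]
    small_model: Callable[[str], str]
    large_model: Callable[[str], str]
    threshold: float = 0.8  # fall back to the large model below this score

    def route(self, query: str) -> Tuple[str, str]:
        # Route to the cheap model when it is predicted to be good enough,
        # otherwise pay for the large model.
        if self.quality_predictor(query) >= self.threshold:
            return "small", self.small_model(query)
        return "large", self.large_model(query)

# Toy usage: a stand-in predictor that treats short queries as "easy".
router = HybridRouter(
    quality_predictor=lambda q: 1.0 if len(q.split()) < 8 else 0.3,
    small_model=lambda q: f"[edge] {q}",
    large_model=lambda q: f"[cloud] {q}",
)
print(router.route("What is 2 + 2?")[0])  # routed to the small model
print(router.route("Summarize the trade-offs of deploying LLMs on edge devices.")[0])
```

In a real system the stand-in predictor would be a learned model trading off cost against expected quality, which is the core idea the paper's title describes.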
