Search Results for author: Yechen Xu

Found 3 papers, 2 papers with code

Conveyor: Efficient Tool-aware LLM Serving with Tool Partial Execution

1 code implementation · 29 May 2024 · Yechen Xu, Xinhao Kong, Tingjun Chen, Danyang Zhuo

In this paper, we identify a new opportunity for efficient LLM serving for requests that trigger tools: tool partial execution alongside LLM decoding.

Language Modelling · Large Language Model
