Search Results for author: Ajay Nayak

Found 1 paper, 0 papers with code

vAttention: Dynamic Memory Management for Serving LLMs without PagedAttention

No code implementations · 7 May 2024 · Ramya Prabhu, Ajay Nayak, Jayashree Mohan, Ramachandran Ramjee, Ashish Panwar

Thus, vAttention unburdens the attention kernel developer from having to explicitly support paging and avoids re-implementation of memory management in the serving framework.
