Search Results for author: Youngmock Cho

Found 1 paper, 0 papers with code

MoNDE: Mixture of Near-Data Experts for Large-Scale Sparse Models

no code implementations • 29 May 2024 • Taehyun Kim, Kwanseok Choi, Youngmock Cho, Jaehoon Cho, Hyuk-Jae Lee, Jaewoong Sim

Mixture-of-Experts (MoE) large language models (LLMs) have memory requirements that often exceed GPU memory capacity, requiring costly parameter movement from secondary memory to the GPU for expert computation.
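The abstract refers to the conventional GPU-centric baseline, in which expert parameters are copied from host or secondary memory to the GPU whenever an expert is activated. The sketch below is a minimal, hypothetical PyTorch illustration of that baseline, not of the MoNDE design itself; the names (OffloadedExpert, moe_forward) and dimensions are illustrative assumptions. It shows why the parameter movement is costly: every expert activation triggers a host-to-device weight copy.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class OffloadedExpert(nn.Module):
    """Toy feed-forward expert whose weights live in host (CPU) memory
    and are copied to the compute device on every call (illustrative,
    not the MoNDE mechanism)."""

    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        # Parameters are created on, and stay on, the CPU.
        self.w1 = nn.Parameter(torch.randn(d_ff, d_model) * 0.02)
        self.b1 = nn.Parameter(torch.zeros(d_ff))
        self.w2 = nn.Parameter(torch.randn(d_model, d_ff) * 0.02)
        self.b2 = nn.Parameter(torch.zeros(d_model))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Host-to-GPU weight copy on every activation of this expert;
        # this per-call transfer is the "costly parameter movement"
        # described in the abstract.
        w1, b1 = self.w1.to(x.device), self.b1.to(x.device)
        w2, b2 = self.w2.to(x.device), self.b2.to(x.device)
        return F.gelu(x @ w1.T + b1) @ w2.T + b2


def moe_forward(x: torch.Tensor, experts, router: nn.Linear) -> torch.Tensor:
    """Top-1 routing: each token is processed by its highest-scoring expert."""
    top1 = router(x).argmax(dim=-1)          # [num_tokens]
    out = torch.zeros_like(x)
    for eid, expert in enumerate(experts):
        mask = top1 == eid
        if mask.any():
            out[mask] = expert(x[mask])      # triggers a weight transfer
    return out


if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    d_model, d_ff, num_experts = 64, 256, 4
    experts = [OffloadedExpert(d_model, d_ff) for _ in range(num_experts)]
    router = nn.Linear(d_model, num_experts).to(device)
    tokens = torch.randn(32, d_model, device=device)
    print(moe_forward(tokens, experts, router).shape)  # torch.Size([32, 64])
```

Near-data expert computation, as the paper's title suggests, targets exactly this per-activation transfer by computing where the expert parameters reside instead of moving them to the GPU.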

Decoder
