1 code implementation • 7 Jun 2023 • Nikhil Kandpal, Brian Lester, Mohammed Muqeeth, Anisha Mascarenhas, Monty Evans, Vishal Baskaran, Tenghao Huang, Haokun Liu, Colin Raffel
Currently, most machine learning models are trained by centralized teams and are rarely updated.
no code implementations • 6 Jun 2023 • Mohammed Muqeeth, Haokun Liu, Colin Raffel
To address this issue, we introduce Soft Merging of Experts with Adaptive Routing (SMEAR), which avoids discrete routing by using a single "merged" expert constructed via a weighted average of all of the experts' parameters.
2 code implementations • 11 May 2022 • Haokun Liu, Derek Tam, Mohammed Muqeeth, Jay Mohta, Tenghao Huang, Mohit Bansal, Colin Raffel
ICL incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
Ranked #1 on
Few-Shot Text Classification
on RAFT