Search Results for author: Mohammed Muqeeth

Found 5 papers, 3 papers with code

A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning

no code implementations • 13 Aug 2024 • Prateek Yadav, Colin Raffel, Mohammed Muqeeth, Lucas Caccia, Haokun Liu, Tianlong Chen, Mohit Bansal, Leshem Choshen, Alessandro Sordoni

The availability of performant pre-trained models has led to a proliferation of fine-tuned expert models that are specialized to a particular domain or task.

Mixture-of-Experts • Survey

Learning to Route Among Specialized Experts for Zero-Shot Generalization

3 code implementations • 8 Feb 2024 • Mohammed Muqeeth, Haokun Liu, Yufan Liu, Colin Raffel

Unlike past methods that learn to route among specialized models, PHATGOOSE explores the possibility that zero-shot generalization improves when a different expert can be adaptively chosen for each token and at each layer of the model.
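The per-token, per-layer routing described in the snippet can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function name, the use of a learned gate vector per expert, and top-1 (argmax) selection are assumptions for the sketch.

```python
import numpy as np

def route_per_token(hidden, gate_vectors):
    """Pick one expert per token at a given layer: score each token's
    hidden state against each expert's gate vector and take the argmax.

    hidden:       (num_tokens, dim) token representations at one layer.
    gate_vectors: (num_experts, dim) one gating vector per expert.
    """
    scores = hidden @ gate_vectors.T      # (num_tokens, num_experts)
    return scores.argmax(axis=1)          # chosen expert index per token

# Two experts whose gates point along different axes; each token is
# routed to whichever expert's gate it aligns with best.
gates = np.array([[1.0, 0.0], [0.0, 1.0]])
tokens = np.array([[0.9, 0.1], [0.2, 0.8]])
choice = route_per_token(tokens, gates)   # → [0, 1]
```

Running the same routing independently at every layer is what makes the choice adaptive per token *and* per layer, rather than committing to one expert model for the whole input.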

parameter-efficient fine-tuning • Zero-shot Generalization

Soft Merging of Experts with Adaptive Routing

no code implementations • 6 Jun 2023 • Mohammed Muqeeth, Haokun Liu, Colin Raffel

To address this issue, we introduce Soft Merging of Experts with Adaptive Routing (SMEAR), which avoids discrete routing by using a single "merged" expert constructed via a weighted average of all of the experts' parameters.
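The "merged" expert described in the snippet amounts to a routing-weighted average of parameter tensors. A minimal sketch of that operation, assuming each expert's parameters are a same-shaped array and the router has already produced a probability per expert (names and shapes are illustrative, not from the paper's code):

```python
import numpy as np

def smear_merge(expert_params, routing_probs):
    """Collapse a list of expert parameter arrays (all the same shape)
    into a single merged expert via a routing-weighted average,
    avoiding any discrete (hard) expert selection."""
    stacked = np.stack(expert_params)                    # (num_experts, ...)
    probs = np.asarray(routing_probs)
    probs = probs.reshape(-1, *([1] * (stacked.ndim - 1)))
    return (probs * stacked).sum(axis=0)                 # one merged tensor

# With uniform routing, the merge reduces to a plain mean of the experts.
experts = [np.full((2, 2), float(i)) for i in range(3)]
merged = smear_merge(experts, [1/3, 1/3, 1/3])           # → all entries 1.0
```

Because the average is differentiable in the routing probabilities, the router can be trained with ordinary gradient descent, which discrete routing does not allow.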

Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning

2 code implementations • 11 May 2022 • Haokun Liu, Derek Tam, Mohammed Muqeeth, Jay Mohta, Tenghao Huang, Mohit Bansal, Colin Raffel

ICL incurs substantial computational, memory, and storage costs because it requires processing all of the training examples every time a prediction is made.

Few-Shot Text Classification • In-Context Learning • +1
