1 code implementation • 24 Oct 2023 • Jialing Pan, Adrien Sadé, Jin Kim, Eric Soriano, Guillem Sole, Sylvain Flamant
Specifically, we use the Low-Rank Adaptation (LoRA) technique, limiting each expert's size to only 0.06% of the number of StarCoder's parameters.
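To make the size claim concrete, here is a minimal sketch of how the parameter footprint of a LoRA adapter can be estimated relative to a frozen base model. It relies on the standard LoRA formulation, in which a frozen weight of shape `d_out x d_in` receives a trainable update `B @ A` with `A` of shape `rank x d_in` and `B` of shape `d_out x rank`. The layer shapes, the rank, and the helper names `lora_param_count` and `lora_fraction` are hypothetical and chosen for illustration only; they are not the configuration used by the authors.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one LoRA adapter: A (rank x d_in) plus B (d_out x rank)."""
    return rank * (d_in + d_out)


def lora_fraction(layer_shapes, rank: int, base_params: int) -> float:
    """Adapter parameters as a fraction of the frozen base model's parameter count."""
    adapter_params = sum(
        lora_param_count(d_in, d_out, rank) for d_in, d_out in layer_shapes
    )
    return adapter_params / base_params


# Hypothetical toy model (an assumption, not StarCoder): four adapted 1024 x 1024 layers.
toy_shapes = [(1024, 1024)] * 4
toy_base_params = sum(d_in * d_out for d_in, d_out in toy_shapes)
toy_fraction = lora_fraction(toy_shapes, rank=8, base_params=toy_base_params)
print(f"adapter size: {toy_fraction:.4%} of base parameters")
```

The fraction grows linearly with the rank and with the number of adapted layers, so a very small per-expert budget such as the 0.06% quoted above would correspond to a low rank, a small set of adapted layers, or both.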