no code implementations • 9 May 2023 • Jetze T. Schuurmans, Kim Batselier, Julian F. P. Kooij
While scaling the approximation error is commonly used to account for the different sizes of layers, the average correlation across layers is smaller than across all choices (i.e., layers, decompositions, and levels of compression) before fine-tuning.