no code implementations • 26 Sep 2024 • Elliot L. Epstein, Kaisheng Yao, Jing Li, Xinyi Bai, Hamid Palangi
When all the instructions are also appended to the end of the model input context, the $\operatorname{PIF}$ metric improves by 22. 3 points on average, showing that the challenge with the task lies not only in following the instructions, but also in retrieving the instructions spread out in the model context.
1 code implementation • 13 Feb 2023 • Daniel Y. Fu, Elliot L. Epstein, Eric Nguyen, Armin W. Thomas, Michael Zhang, Tri Dao, Atri Rudra, Christopher Ré
We find that a key requirement to achieving high performance is keeping the convolution kernels smooth.