1 code implementation • 28 Feb 2024 • Kaiyue Wen, Xingyu Dang, Kaifeng Lyu
This paper investigates the gap in representation power between Recurrent Neural Networks (RNNs) and Transformers in the context of solving algorithmic problems.
5 code implementations • 1 Jun 2023 • Ji Lin, Jiaming Tang, Haotian Tang, Shang Yang, Wei-Ming Chen, Wei-Chen Wang, Guangxuan Xiao, Xingyu Dang, Chuang Gan, Song Han
We then propose to search for the optimal per-channel scaling that protects the salient weights by observing the activation, not weights.
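The idea in this summary can be illustrated with a minimal NumPy sketch; it is not the authors' implementation. It assumes a single linear layer with weights `w` of shape (out, in) and calibration activations `x` of shape (samples, in). It also assumes the per-channel scale takes the form of mean activation magnitude raised to a power `alpha`, chosen by grid search to minimize output error after round-to-nearest quantization. The function names `quantize_rows` and `search_channel_scale`, and the parameters `n_bits` and `n_grid`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def quantize_rows(w, n_bits=4):
    """Symmetric round-to-nearest quantization with one step size per output row."""
    qmax = 2 ** (n_bits - 1) - 1
    step = np.abs(w).max(axis=1, keepdims=True) / qmax
    step = np.where(step == 0, 1.0, step)  # avoid division by zero for all-zero rows
    return np.clip(np.round(w / step), -qmax - 1, qmax) * step


def search_channel_scale(w, x, n_bits=4, n_grid=20):
    """Grid-search a per-input-channel scale derived from activation statistics.

    Assumed form (illustrative): scale = mean(|x|) ** alpha, alpha in [0, 1].
    alpha = 0 gives scale = 1, i.e. plain quantization without scaling.
    """
    act_mag = np.maximum(np.abs(x).mean(axis=0), 1e-8)  # per input channel
    reference = x @ w.T  # full-precision layer output
    best_alpha, best_scale, best_err = None, None, np.inf
    for alpha in np.linspace(0.0, 1.0, n_grid + 1):
        scale = act_mag ** alpha
        scale = scale / np.sqrt(scale.max() * scale.min())  # keep scales centered
        # Scale weights up before quantization, then fold the inverse scale back.
        w_q = quantize_rows(w * scale, n_bits) / scale
        err = np.mean((reference - x @ w_q.T) ** 2)
        if err < best_err:
            best_alpha, best_scale, best_err = alpha, scale, err
    return best_alpha, best_scale, best_err
```

Because `alpha = 0` is part of the grid, the selected scale can never produce a larger output error on the calibration data than unscaled quantization in this sketch. Input channels with large activations receive larger scales, which shrinks the effective quantization error on the weights that matter most for the output.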