no code implementations • 19 Nov 2021 • Jaesin Ahn, Jiuk Hong, Jeongwoo Ju, Heechul Jung
The proposed method achieved $71.4\%$ accuracy on the ImageNet-1k dataset with only a few parameters ($3.1M$), compared with $69.9\%$ for the original transformer model, XCiT-N12.