1 code implementation • 28 Jun 2021 • Yuanfeng Ji, Ruimao Zhang, Huijie Wang, Zhen Li, Lingyun Wu, Shaoting Zhang, Ping Luo
The recent vision transformer(i. e. for image classification) learns non-local attentive interaction of different patch tokens.