2 code implementations • 7 Apr 2024 • Zihan Liu, Hanyi Wang, Yaoyu Kang, Shilin Wang
Remarkably, our best-performing ViT-L/14 variant requires training only 0.08% of its parameters to surpass the leading baseline by +3.64% mAP and +12.72% avg. Acc across unseen diffusion and autoregressive models.
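To put the 0.08% figure in perspective, a minimal sketch of the arithmetic, assuming a ViT-L/14 image encoder of roughly 304M parameters (the exact count is an assumption, not stated in the abstract):

```python
# Assumed total parameter count for a ViT-L/14 image encoder (~304M);
# the true figure depends on the exact checkpoint and is not given here.
TOTAL_PARAMS = 304_000_000
TRAINABLE_FRACTION = 0.0008  # 0.08%, as reported in the abstract

# Number of parameters actually updated during fine-tuning.
trainable = int(TOTAL_PARAMS * TRAINABLE_FRACTION)
print(f"Trainable: {trainable:,} of {TOTAL_PARAMS:,} parameters")
```

Under this assumption, only a few hundred thousand parameters are trained, which is why the method is so lightweight compared to full fine-tuning.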