7 code implementations • 24 Jan 2022 • Asher Trockman, J. Zico Kolter
Despite its simplicity, we show that the ConvMixer outperforms the ViT, MLP-Mixer, and some of their variants for similar parameter counts and data set sizes, in addition to outperforming classical vision models such as the ResNet.
Ranked #80 on Image Classification on CIFAR-10
1 code implementation • ICLR 2021 • Asher Trockman, J. Zico Kolter
Recent work has highlighted several advantages of enforcing orthogonality in the weight layers of deep networks, such as maintaining the stability of activations, preserving gradient norms, and enhancing adversarial robustness by enforcing low Lipschitz constants.
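The norm-preservation claim above can be checked numerically. The following is a minimal sketch, not the authors' method: it assumes a single dense layer whose weight matrix is a Householder reflection (one simple way to obtain an orthogonal matrix), and the helper names `householder`, `matvec`, and `norm` are hypothetical. It computes the quantities behind the three properties listed: activation norms, gradient norms under backpropagation, and the distance between two inputs (a Lipschitz constant of 1).

```python
import math
import random


def householder(v):
    """Return H = I - 2 v v^T / (v^T v), an orthogonal matrix, as nested lists."""
    n = len(v)
    vv = sum(x * x for x in v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / vv
             for j in range(n)] for i in range(n)]


def matvec(W, x):
    """Multiply matrix W (nested lists) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def norm(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(xi * xi for xi in x))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 8
W = householder([rng.gauss(0, 1) for _ in range(n)])  # illustrative orthogonal weights
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]

# Activations: an orthogonal W leaves the norm of its input unchanged.
in_norm = norm(x)
out_norm = norm(matvec(W, x))

# Gradients: backpropagating through z = W x multiplies by W^T,
# which is also orthogonal, so the gradient norm is unchanged too.
W_T = [list(col) for col in zip(*W)]
grad_out = [rng.gauss(0, 1) for _ in range(n)]
grad_in = matvec(W_T, grad_out)

# Lipschitz constant: distances between inputs are preserved exactly.
d_in = norm([a - b for a, b in zip(x, y)])
d_out = norm([a - b for a, b in zip(matvec(W, x), matvec(W, y))])

print(f"activation norms: {in_norm:.6f} -> {out_norm:.6f}")
print(f"gradient norms:   {norm(grad_out):.6f} -> {norm(grad_in):.6f}")
print(f"pair distances:   {d_in:.6f} -> {d_out:.6f}")
```

Each printed pair should agree up to floating-point rounding. A general (non-orthogonal) weight matrix would typically stretch or shrink these quantities, which is what makes activations and gradients drift across many layers.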