Search Results for author: Asher Trockman

Found 4 papers, 2 papers with code

Patches Are All You Need?

11 code implementations • 24 Jan 2022 • Asher Trockman, J. Zico Kolter

Despite its simplicity, we show that the ConvMixer outperforms the ViT, MLP-Mixer, and some of their variants for similar parameter counts and data set sizes, in addition to outperforming classical vision models such as the ResNet.

Image Classification
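
The architecture lends itself to a very compact implementation. Below is a minimal PyTorch sketch of a ConvMixer, following the patch-embedding stem plus residual depthwise/pointwise blocks described in the paper; the default kernel and patch sizes here are illustrative.

```python
import torch.nn as nn

class Residual(nn.Module):
    """Adds a skip connection around an inner module."""
    def __init__(self, fn):
        super().__init__()
        self.fn = fn

    def forward(self, x):
        return self.fn(x) + x

def conv_mixer(dim, depth, kernel_size=9, patch_size=7, n_classes=1000):
    # Patch-embedding stem, then `depth` blocks of a residual depthwise
    # convolution followed by a pointwise (1x1) convolution.
    return nn.Sequential(
        nn.Conv2d(3, dim, kernel_size=patch_size, stride=patch_size),
        nn.GELU(),
        nn.BatchNorm2d(dim),
        *[nn.Sequential(
            Residual(nn.Sequential(
                nn.Conv2d(dim, dim, kernel_size, groups=dim, padding="same"),
                nn.GELU(),
                nn.BatchNorm2d(dim),
            )),
            nn.Conv2d(dim, dim, kernel_size=1),
            nn.GELU(),
            nn.BatchNorm2d(dim),
        ) for _ in range(depth)],
        nn.AdaptiveAvgPool2d((1, 1)),
        nn.Flatten(),
        nn.Linear(dim, n_classes),
    )
```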

Orthogonalizing Convolutional Layers with the Cayley Transform

1 code implementation • ICLR 2021 • Asher Trockman, J. Zico Kolter

Recent work has highlighted several advantages of enforcing orthogonality in the weight layers of deep networks, such as maintaining the stability of activations, preserving gradient norms, and enhancing adversarial robustness by enforcing low Lipschitz constants.

Adversarial Robustness
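
As a rough illustration of the core idea, the sketch below applies the Cayley transform to a dense square weight matrix: the skew-symmetric part A = (W - W^T)/2 is mapped to the orthogonal matrix Q = (I - A)(I + A)^{-1}. The paper's actual method extends this to convolutional layers by working in the Fourier domain; this dense version is only a sketch of the underlying transform.

```python
import torch

def cayley(W):
    """Map an unconstrained square matrix to an orthogonal one.

    Uses the Cayley transform Q = (I - A)(I + A)^{-1}, where
    A = (W - W^T)/2 is the skew-symmetric part of W. Since (I - A)
    and (I + A)^{-1} commute, a single linear solve suffices.
    """
    A = 0.5 * (W - W.T)
    I = torch.eye(W.shape[0], dtype=W.dtype, device=W.device)
    return torch.linalg.solve(I + A, I - A)  # = (I - A)(I + A)^{-1}

# Sanity check: Q^T Q should be (numerically) the identity.
Q = cayley(torch.randn(8, 8))
assert torch.allclose(Q.T @ Q, torch.eye(8), atol=1e-5)
```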

Understanding the Covariance Structure of Convolutional Filters

no code implementations • 7 Oct 2022 • Asher Trockman, Devin Willmott, J. Zico Kolter

In this work, we first observe that such learned filters have highly structured covariance matrices. Moreover, we find that covariances calculated from small networks can be used to effectively initialize a variety of larger networks of different depths, widths, patch sizes, and kernel sizes, indicating a degree of model-independence in the covariance structure.
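
A minimal PyTorch sketch of this initialization idea, assuming filters of matching kernel size: fit a Gaussian to the flattened filters of a trained network, then sample new filters from it. The helper name and the jitter term (added so the covariance stays positive definite) are illustrative, not from the paper.

```python
import torch

def covariance_init(trained_filters, n_new, jitter=1e-4):
    """Sample new k x k filters from a Gaussian fit to trained filters.

    trained_filters: (n, k, k) filters taken from a trained network.
    n_new: number of filters to initialize in the new network.
    """
    n, k, _ = trained_filters.shape
    flat = trained_filters.reshape(n, k * k)
    mean = flat.mean(dim=0)
    # torch.cov expects variables in rows, observations in columns.
    cov = torch.cov(flat.T) + jitter * torch.eye(k * k)
    dist = torch.distributions.MultivariateNormal(mean, covariance_matrix=cov)
    return dist.sample((n_new,)).reshape(n_new, k, k)
```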

Mimetic Initialization of Self-Attention Layers

no code implementations • 16 May 2023 • Asher Trockman, J. Zico Kolter

It is notoriously difficult to train Transformers on small datasets; typically, large pre-trained models are instead used as the starting point.
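
The sketch below illustrates the general flavor of a mimetic query/key initialization, assuming the paper's observation that in pre-trained Transformers the product W_Q W_K^T is approximately a scaled identity plus noise: it constructs such a target product and factors it with an SVD. The constants and normalization here are illustrative, not the paper's exact recipe.

```python
import torch

def mimetic_qk_init(d_model, alpha=0.7, beta=0.7):
    """Illustrative query/key initialization (constants are assumptions).

    Builds a target product W_Q W_K^T = alpha * Z + beta * I (random
    noise plus a scaled identity) and factors it via an SVD so that
    the returned weights multiply back to exactly that target.
    """
    Z = torch.randn(d_model, d_model) / d_model ** 0.5
    target = alpha * Z + beta * torch.eye(d_model)
    U, S, Vh = torch.linalg.svd(target)
    sqrt_S = torch.diag(S.sqrt())
    W_Q = U @ sqrt_S
    W_K = (sqrt_S @ Vh).T  # so that W_Q @ W_K.T == target
    return W_Q, W_K
```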
