Constructing Orthogonal Convolutions in an Explicit Manner
The convolution with orthogonal input-output Jacobian matrix, \emph{i.e.}, orthogonal convolution, has attracted substantial attention due to their excellent properties. A convolution layer with an orthogonal Jacobian matrix is 1-Lipschitz in the 2-norm, making the output robust to the perturbation in input. Meanwhile, an orthogonal Jacobian matrix preserves the gradient norm in back-propagation, which is critical for stable training deep networks. Nevertheless, existing orthogonal convolutions are burdened by high computational costs for preserving orthogonality. In this work, we exploit the relation between the singular values of the convolution layer's Jacobian and the structure of the convolution kernel. To achieve the orthogonality, we explicitly construct the convolution kernel for enforcing all singular values of the convolution layer's Jacobian to be $1$s. After training, the explicitly constructed orthogonal (ECO) convolution can be constructed only once, and their weights are stored. Then, in evaluation, we only need to load the stored weights of the trained ECO convolution, and the computational cost of ECO convolution is the same as the standard dilated convolution. It is significantly faster than the recent state-of-the-art approach, skew orthogonal convolution (SOC) in evaluation. Our extensive experiments on CIFAR-10 and CIFAR-100 datasets demonstrate that the proposed ECO convolution is faster than SOC in evaluation while leading to competitive standard and certified robust accuracies.
PDF Abstract