
Correlated Initialization for Correlated Data

In spatial data, nearby points are correlated. The same holds for learnt representations across layers, but not for commonly used weight initialization methods. Our theoretical analysis quantifies the learning behavior of the weights of a single spatial filter, in contrast to a large body of work that discusses statistical properties of weights. It shows that uncorrelated initialization (i) might lead to poor convergence behavior and (ii) is likely to subject the training of some parameters to slow convergence. Empirical analysis shows that these findings for a single spatial filter extend to networks with many spatial filters. The impact of (correlated) initialization depends strongly on learning rates and L2 regularization.
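To make the distinction between uncorrelated and correlated initialization concrete, the following is a minimal sketch and not the scheme proposed in the paper. It assumes one simple way of inducing spatial correlation: drawing i.i.d. Gaussian noise on a padded grid and averaging each 3x3 neighborhood. The function names (`uncorrelated_filter`, `correlated_filter`, `neighbor_correlation`) and the smoothing construction are illustrative assumptions.

```python
import random


def uncorrelated_filter(k, std, rng):
    """Standard initialization: every weight is drawn independently."""
    return [[rng.gauss(0.0, std) for _ in range(k)] for _ in range(k)]


def correlated_filter(k, std, rng):
    """Hypothetical correlated initialization (an assumption, not the paper's method).

    Draw i.i.d. noise on a (k+2) x (k+2) grid and average each 3x3
    neighborhood, so that adjacent weights share most of their noise terms.
    """
    noise = [[rng.gauss(0.0, std) for _ in range(k + 2)] for _ in range(k + 2)]
    filt = []
    for i in range(k):
        row = []
        for j in range(k):
            mean = sum(noise[i + di][j + dj]
                       for di in range(3) for dj in range(3)) / 9.0
            # Averaging 9 i.i.d. values divides the std by 3; rescale to restore it.
            row.append(3.0 * mean)
        filt.append(row)
    return filt


def neighbor_correlation(make_filter, k=3, std=0.1, n=2000, seed=0):
    """Empirical Pearson correlation between two horizontally adjacent weights."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        w = make_filter(k, std, rng)
        xs.append(w[1][0])
        ys.append(w[1][1])
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


if __name__ == "__main__":
    print("uncorrelated:", neighbor_correlation(uncorrelated_filter))
    print("correlated:  ", neighbor_correlation(correlated_filter))
```

Under this construction, two horizontally adjacent weights share 6 of their 9 noise terms, so their theoretical correlation is 6/9, while independently drawn weights have correlation 0. Other covariance structures (for example, one that decays with distance) would serve the same illustrative purpose.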
