Our analysis makes it possible to understand how magnitude-based hyperparameters influence the training of binary networks, and it enables new optimizers designed specifically for binary neural networks that are independent of any real-valued interpretation.
We verify experimentally that equal bit ratios are indeed preferable and show that our method leads to optimization benefits.
We observe that pruning weights adds the value 0 as an additional symbol and thus increases the information capacity of the network.
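To make the capacity argument concrete (this back-of-the-envelope bound is our illustration, not a claim from the original analysis): adding 0 to the binary alphabet \(\{-1, +1\}\) raises the maximum per-weight entropy from
\[
\log_2 2 = 1 \ \text{bit} \quad\text{to}\quad \log_2 3 \approx 1.58 \ \text{bits},
\]
with the actual gain depending on how often each symbol is used.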
This layer is shown to minimize a penalized form of the Wasserstein distance between the learned continuous image features and the optimal half-half bit distribution.
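As a sketch of what such a penalized objective could look like (the notation here is illustrative and not taken from the source), one may write
\[
\mathcal{L} \;=\; W_1\!\bigl(P_h,\ \tfrac{1}{2}\delta_0 + \tfrac{1}{2}\delta_1\bigr) \;+\; \lambda\,\Omega(h),
\]
where \(P_h\) is the empirical distribution of the learned continuous features \(h\), the target \(\tfrac{1}{2}\delta_0 + \tfrac{1}{2}\delta_1\) is the balanced half-half bit distribution, \(W_1\) is the Wasserstein-1 distance, and \(\Omega\) is the penalty term weighted by \(\lambda\).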
Current weakly supervised object localization and segmentation methods rely on class-discriminative visualization techniques to generate pseudo-labels for pixel-level training.
Such methods are less stable than batch normalization (BN) because they depend critically on the statistics of a single input sample.