ICLR 2019 • Utku Evci, Nicolas Le Roux, Pablo Castro, Leon Bottou
Finally, we show that the units selected by the best-performing scoring functions are somewhat consistent over the course of training, implying that the dead parts of the network appear during training.