no code implementations • 5 Nov 2019 • Pavel Sulimov, Elena Sukmanova, Roman Chereshnev, Attila Kertesz-Farkas
Training of deep models for classification tasks is hindered by local minima problems and vanishing gradients, while unsupervised layer-wise pretraining does not exploit information from class labels.