Accelerating Deep Learning with Millions of Classes

Abstract. Deep learning has achieved remarkable success in many classification tasks because of its great power of representation learning for complex data. However, it remains challenging to extend it to classification tasks with millions of classes. Previous studies have focused on solving this problem in a distributed fashion or using a sampling-based approach to reduce the computational cost caused by the softmax layer. However, these approaches still require large amounts of GPU memory to work with large models, and it is non-trivial to extend them to parallel settings. To address these issues, we propose an efficient training framework for extreme classification tasks based on Random Projection. The key idea is to first train a slimmed model with a randomly projected softmax classifier and then recover it to the original version. We also provide a theoretical guarantee that the recovered classifier approximates the original classifier with a small error. We further extend our framework to parallel scenarios by adopting a communication reduction technique. In our experiments, we demonstrate that the proposed framework is able to train deep learning models with millions of classes and achieves more than a 10× speedup compared to existing approaches.
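To make the core idea concrete, the sketch below illustrates what a random-projection-based softmax classifier could look like in PyTorch. This is a minimal illustration under stated assumptions, not the authors' implementation: the dimensions, the choice to project the backbone features (the paper may instead project the classifier weights), and the transpose-based recovery step are all assumptions introduced for clarity.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions, chosen only for illustration.
feature_dim = 512       # output dimension of the backbone network
proj_dim = 64           # reduced dimension after random projection
num_classes = 100_000   # kept modest here; the paper targets millions

# Fixed Gaussian random projection matrix R (feature_dim x proj_dim),
# scaled so that R^T R is approximately the identity in expectation.
R = torch.randn(feature_dim, proj_dim) / proj_dim ** 0.5

# Slimmed softmax classifier: it operates on projected features, so its
# weight matrix is (num_classes x proj_dim) rather than
# (num_classes x feature_dim), which shrinks memory and compute.
slim_classifier = nn.Linear(proj_dim, num_classes, bias=False)

def slim_logits(features: torch.Tensor) -> torch.Tensor:
    """Logits computed from randomly projected features during training."""
    return slim_classifier(features @ R)

# After training, recover a full-size classifier acting on the original
# feature space by mapping the slim weights back through R (an assumed
# linear recovery step): W_full ≈ W_slim @ R^T.
with torch.no_grad():
    full_classifier = nn.Linear(feature_dim, num_classes, bias=False)
    full_classifier.weight.copy_(slim_classifier.weight @ R.T)
```

In this sketch, training only ever touches the slim (proj_dim-wide) classifier, and the full-size classifier is materialized once at the end; the Johnson-Lindenstrauss-style scaling of R is what makes the recovered weights a plausible approximation of a classifier trained in the original feature space.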
