CompOFA – Compound Once-For-All Networks for Faster Multi-Platform Deployment

The emergence of CNNs in mainstream deployment has necessitated methods to design and train efficient architectures tailored to maximize accuracy under diverse hardware and latency constraints. To scale these resource-intensive tasks with an increasing number of deployment targets, Once-For-All (OFA) proposed an approach to jointly train several models at once for a constant training cost. However, this cost remains as high as $O(10^3)$ GPU hours, and the approach also suffers from a combinatorial explosion of potentially sub-optimal model configurations. We find that the cost of this one-shot training depends on the size of the model design space, and hence seek to speed up training by constraining the design space to configurations with better accuracy-latency trade-offs. We incorporate the insight of compound relationships between model depth and width to build CompOFA, a design space that is smaller by several orders of magnitude. We demonstrate a 50% reduction in training time, dollar cost, and CO2 emissions over OFA, owing to the reduced interference when training this smaller design space. We also show that this smaller design space is dense enough to support equally accurate models for a similar diversity of hardware and latency targets, while also simplifying the training and search procedures.
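
To make the "orders of magnitude" claim concrete, the sketch below counts configurations in an OFA-like space, where each unit's depth and each layer's width and kernel size vary independently, versus a CompOFA-like space where depth and width are coupled by a compound rule. This is not the authors' code; the specific choices (5 units, depths {2, 3, 4}, width expansion ratios {3, 4, 6}, kernel sizes {3, 5, 7}) are illustrative assumptions.

```python
# Illustrative sketch (assumed values, not the authors' implementation):
# compare the size of an independent design space against a compound one.
from itertools import product

UNITS = 5
DEPTHS = [2, 3, 4]    # candidate depths per unit (assumed)
WIDTHS = [3, 4, 6]    # candidate width expansion ratios per layer (assumed)
KERNELS = [3, 5, 7]   # candidate kernel sizes per layer (assumed)

# OFA-like space: each unit picks a depth, and every active layer
# independently picks a width and a kernel size.
ofa_count = 0
for depths in product(DEPTHS, repeat=UNITS):
    per_config = 1
    for d in depths:
        per_config *= (len(WIDTHS) * len(KERNELS)) ** d
    ofa_count += per_config

# CompOFA-like space: depth and width are coupled (e.g. deeper units are
# also wider), so each unit only chooses among len(DEPTHS) compound settings.
compound_count = len(DEPTHS) ** UNITS

print(f"independent (OFA-like) space : {ofa_count:.3e} configurations")
print(f"compound (CompOFA-like) space: {compound_count} configurations")
```

Under these assumed choices the independent space is on the order of 10^19 configurations, while the coupled space collapses to a few hundred, which is the kind of reduction that shrinks training interference and simplifies search.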
