CompOFA – Compound Once-For-All Networks for Faster Multi-Platform Deployment

The emergence of CNNs in mainstream deployment has necessitated methods to design and train efficient architectures tailored to maximize accuracy under diverse hardware and latency constraints. To scale these resource-intensive tasks with an increasing number of deployment targets, Once-For-All (OFA) proposed an approach to jointly train several models at once for a constant training cost. However, this cost remains as high as $O(10^3)$ GPU hours, and the approach also suffers from a combinatorial explosion of potentially sub-optimal model configurations. We find that the cost of this one-shot training depends on the size of the model design space, and hence seek to speed up training by constraining the design space to configurations with better accuracy-latency trade-offs. We incorporate the insight of compound relationships between model depth and width to build CompOFA, a design space that is smaller by several orders of magnitude. We demonstrate a 50% reduction in training time, dollar cost, and CO2 emissions over OFA, owing to the reduced interference when training this smaller design space. We also show that this smaller design space is dense enough to support equally accurate models for a similar diversity of hardware and latency targets, while also simplifying the training and search procedures.
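
To make the "orders of magnitude" claim concrete, the sketch below counts configurations in an OFA-like space, where each unit's depth and each layer's width and kernel size vary independently, versus a CompOFA-like space where depth and width are coupled by a compound rule. This is not the authors' code; the specific choices (5 units, depths {2, 3, 4}, width expansion ratios {3, 4, 6}, kernel sizes {3, 5, 7}) are illustrative assumptions.

```python
# Illustrative sketch (assumed values, not the authors' implementation):
# compare the size of an independent design space against a compound one.
from itertools import product

UNITS = 5
DEPTHS = [2, 3, 4]    # candidate depths per unit (assumed)
WIDTHS = [3, 4, 6]    # candidate width expansion ratios per layer (assumed)
KERNELS = [3, 5, 7]   # candidate kernel sizes per layer (assumed)

# OFA-like space: each unit picks a depth, and every active layer
# independently picks a width and a kernel size.
ofa_count = 0
for depths in product(DEPTHS, repeat=UNITS):
    per_config = 1
    for d in depths:
        per_config *= (len(WIDTHS) * len(KERNELS)) ** d
    ofa_count += per_config

# CompOFA-like space: depth and width are coupled (e.g. deeper units are
# also wider), so each unit only chooses among len(DEPTHS) compound settings.
compound_count = len(DEPTHS) ** UNITS

print(f"independent (OFA-like) space : {ofa_count:.3e} configurations")
print(f"compound (CompOFA-like) space: {compound_count} configurations")
```

Under these assumed choices the independent space is on the order of 10^19 configurations, while the coupled space collapses to a few hundred, which is the kind of reduction that shrinks training interference and simplifies search.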
