Paper

Neuroevolution-Enhanced Multi-Objective Optimization for Mixed-Precision Quantization

Mixed-precision quantization is a powerful tool for reducing the memory and compute cost of neural network workloads by assigning different bit-width precisions to different compute operations. In this work, we present a flexible and scalable framework for automated mixed-precision quantization that concurrently optimizes task performance, memory compression, and compute savings through multi-objective evolutionary computing. Our framework centers on Neuroevolution-Enhanced Multi-Objective Optimization (NEMO), a novel search method that combines established search methods with the representational power of neural networks. Within NEMO, the population is divided into structurally distinct sub-populations, or species, which jointly create the Pareto frontier of solutions for the multi-objective problem. At each generation, species perform separate mutation and crossover operations and are resized in proportion to the quality of their contribution to the Pareto frontier. In our experiments, we define a graph-based representation of the underlying workload, which enables us to deploy graph neural networks, trained by NEMO via neuroevolution, to find Pareto-optimal configurations for MobileNet-V2, ResNet50 and ResNeXt-101-32x8d. Compared to the state of the art, we achieve competitive results on memory compression and superior results on compute compression. Further analysis reveals that the graph representation and the species-based approach employed by NEMO are critical to finding optimal solutions.
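
The species resizing step can be illustrated with a minimal sketch. The abstract does not specify how a species' contribution is measured, so this example assumes it is the number of that species' members on the joint Pareto frontier, and that all objectives are maximized. The function names, the `min_size` floor, the species labels, and the scores are all hypothetical and are not taken from NEMO itself.

```python
from collections import Counter


def dominates(a, b):
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores):
    """Indices of the non-dominated score vectors."""
    return [
        i for i, p in enumerate(scores)
        if not any(dominates(q, p) for j, q in enumerate(scores) if j != i)
    ]


def resize_species(species_of, scores, total_size, min_size=1):
    """Assumed rule: every species keeps min_size slots, and the remaining
    slots are shared in proportion to each species' count on the frontier."""
    front = pareto_front(scores)
    counts = Counter(species_of[i] for i in front)
    names = sorted(set(species_of))
    spare = total_size - min_size * len(names)
    sizes = {s: min_size + spare * counts[s] // len(front) for s in names}
    # Integer division may leave a few slots unassigned; give them to the
    # species with the largest frontier contribution first.
    leftover = total_size - sum(sizes.values())
    for s in sorted(names, key=lambda s: -counts[s])[:leftover]:
        sizes[s] += 1
    return sizes


# Hypothetical population: (task accuracy, memory compression, compute compression)
species_of = ["A", "A", "B", "B", "B", "C"]
scores = [
    (0.78, 4.0, 4.0),
    (0.60, 3.0, 3.0),
    (0.75, 3.5, 5.0),
    (0.72, 5.0, 4.5),
    (0.65, 3.0, 4.0),
    (0.60, 2.0, 2.0),
]
sizes = resize_species(species_of, scores, total_size=12)
print(sizes)
```

In this toy population, species B holds two of the three frontier points and so receives the largest share of the next generation, while species C, which contributes nothing, is kept alive only by the assumed `min_size` floor.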
