To overcome the bandwidth issues of shuttling parameters to and from the EPS, the model is executed one layer at a time across many micro-batches, instead of the conventional method of running each minibatch over the whole model.
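The difference between the two execution orders can be illustrated with a toy forward pass. This is a minimal sketch, not the paper's implementation: `ParamStore`, `layer_forward`, `run_batch_major`, and `run_layer_major` are hypothetical names, the "layers" are scalar scale-plus-ReLU operations, and the batch-major variant assumes the model does not fit on the device, so each layer's parameters are fetched again for every micro-batch.

```python
from typing import List


class ParamStore:
    """Hypothetical stand-in for the remote parameter store.

    It holds one scalar weight per layer and counts how often
    parameters are fetched, as a proxy for transfer traffic.
    """

    def __init__(self, weights: List[float]) -> None:
        self.weights = list(weights)
        self.fetches = 0

    def fetch(self, layer_idx: int) -> float:
        self.fetches += 1
        return self.weights[layer_idx]


def layer_forward(w: float, x: List[float]) -> List[float]:
    # Toy layer: scale every activation by w, then apply ReLU.
    return [max(0.0, w * v) for v in x]


def run_batch_major(store: ParamStore, micro_batches: List[List[float]]) -> List[List[float]]:
    # Conventional order: push each micro-batch through the whole model.
    # Assumption: parameters are not resident, so every layer is re-fetched.
    outputs = []
    for mb in micro_batches:
        act = list(mb)
        for i in range(len(store.weights)):
            act = layer_forward(store.fetch(i), act)
        outputs.append(act)
    return outputs


def run_layer_major(store: ParamStore, micro_batches: List[List[float]]) -> List[List[float]]:
    # Layer-at-a-time order: fetch a layer once, apply it to all micro-batches,
    # then move on to the next layer.
    acts = [list(mb) for mb in micro_batches]
    for i in range(len(store.weights)):
        w = store.fetch(i)
        acts = [layer_forward(w, a) for a in acts]
    return acts
```

Both orders compute the same activations. In this sketch the layer-major order fetches each layer once per pass, while the batch-major order fetches it once per micro-batch; the price is that the activations of all micro-batches must be kept between layers.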
We automate HW-CNN codesign using NAS by including parameters from both the CNN model and the HW accelerator, and we jointly search for the best model-accelerator pair that boosts accuracy and efficiency.
Neural Architecture Search (NAS) has demonstrated its power on various AI acceleration platforms such as Field Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs).
Due to increasing privacy concerns, neural network (NN) based secure inference (SI) schemes that simultaneously hide the client inputs and server models have attracted major research interest.
Despite recent progress, the problem of approximating the full Pareto front accurately and efficiently remains challenging.
In this paper, we propose a broad version of ENAS (BENAS) to solve the above issue. BENAS learns a broad architecture with fast propagation speed, using the reinforcement learning and parameter sharing of ENAS, and thereby achieves higher search efficiency.
We propose a specific search space based on an encoder-decoder framework and apply neural architecture search (NAS) to retinal vessel segmentation.
The core of latency prediction is to encode each network architecture and feed it into a multi-layer regressor, with the training data collected by randomly sampling a number of architectures and evaluating them on the hardware.
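The pipeline described above (sample, encode, measure, regress) can be sketched in plain Python. Everything here is an illustrative assumption rather than the paper's setup: the toy search space (`NUM_LAYERS`, `NUM_OPS`), the one-hot `encode`, the made-up `OP_COST_MS` table that stands in for real hardware measurements, and the small one-hidden-layer `MLPRegressor` trained with plain SGD.

```python
import math
import random

NUM_LAYERS, NUM_OPS = 4, 3       # hypothetical toy search space
OP_COST_MS = [0.2, 0.5, 0.9]     # made-up per-op cost, NOT measured data


def sample_arch(rng):
    # An architecture is one operator choice per layer.
    return [rng.randrange(NUM_OPS) for _ in range(NUM_LAYERS)]


def encode(arch):
    # One-hot encode each layer's operator and concatenate.
    vec = [0.0] * (NUM_LAYERS * NUM_OPS)
    for layer, op in enumerate(arch):
        vec[layer * NUM_OPS + op] = 1.0
    return vec


def measure_latency(arch):
    # Placeholder for running the network on the target hardware.
    return sum(OP_COST_MS[op] for op in arch)


class MLPRegressor:
    """One tanh hidden layer and a linear output, trained on squared error."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        h = self._hidden(x)
        return sum(w * v for w, v in zip(self.w2, h)) + self.b2

    def sgd_step(self, x, y, lr):
        h = self._hidden(x)
        err = sum(w * v for w, v in zip(self.w2, h)) + self.b2 - y
        for j, hj in enumerate(h):
            # Backpropagate through tanh using the pre-update output weight.
            grad_pre = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            self.b1[j] -= lr * grad_pre
            for i, xi in enumerate(x):
                if xi:
                    self.w1[j][i] -= lr * grad_pre * xi
        self.b2 -= lr * err


def mse(model, data):
    return sum((model.predict(x) - y) ** 2 for x, y in data) / len(data)


rng = random.Random(0)
# Training set: randomly sampled architectures with their "measured" latency.
data = [(encode(a), measure_latency(a))
        for a in (sample_arch(rng) for _ in range(64))]
model = MLPRegressor(NUM_LAYERS * NUM_OPS, 8, rng)
loss_before = mse(model, data)
for _ in range(200):
    for x, y in data:
        model.sgd_step(x, y, lr=0.02)
loss_after = mse(model, data)
```

In practice `measure_latency` would be replaced by on-device timing, and the encoding would have to cover whatever the real search space varies (operator type, channel width, resolution, and so on).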
Computer simulations are invaluable tools for scientific discovery.