By deploying these models to an Android application on a smartphone, we quantitatively observe that REST allows models to achieve up to 17x energy reduction and 9x faster inference.
Deep neural networks are typically too computationally expensive to run in real-time on consumer-grade hardware and low-powered devices.
Weight and activation binarization is an effective approach to deep neural network compression and can accelerate inference by leveraging bitwise operations.
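The bitwise trick behind binarized inference can be sketched as follows. This is an illustrative example, not the method of any specific paper: with values constrained to {-1, +1}, a dot product reduces to counting bit mismatches (an XNOR/popcount on real hardware), here emulated with NumPy's XOR and a nonzero count.

```python
import numpy as np

def binarize(x):
    # Map real values to {-1, +1} via sign (0 is treated as +1).
    return np.where(x >= 0, 1, -1).astype(np.int8)

def binary_dot(a_bits, w_bits):
    # For vectors in {-1, +1}^n the dot product equals
    # (matches - mismatches) = n - 2 * mismatches.
    # Pack to 0/1 and XOR to count mismatches; on hardware this
    # is a single XNOR followed by a popcount.
    a01 = (a_bits > 0).astype(np.uint8)
    w01 = (w_bits > 0).astype(np.uint8)
    mismatches = np.count_nonzero(a01 ^ w01)
    return a_bits.size - 2 * mismatches

a = np.array([0.5, -1.2, 0.3, -0.7])
w = np.array([1.1, -0.4, -0.9, 0.2])
# The bitwise result matches the ordinary integer dot product.
assert binary_dot(binarize(a), binarize(w)) == int(np.dot(binarize(a), binarize(w)))
```

The speedup comes from replacing floating-point multiply-accumulates with XNOR and popcount instructions that process many weights per machine word.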
Cross-layer filter comparison is unachievable because filter importance is defined locally within each layer.
Before computing the gradients for each weight update, targeted dropout stochastically selects a set of units or weights to be dropped using a simple self-reinforcing sparsity criterion and then computes the gradients for the remaining weights.