With the growing demand for deploying deep learning models to the "edge", it is paramount to develop techniques that allow state-of-the-art models to be executed under very tight resource constraints.
In this work we propose a method for reducing the computational cost of backpropagation, which we name dithered backprop.
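The core mechanism can be illustrated with a minimal sketch: adding uniform dither before rounding gradients to a quantization grid keeps the rounding unbiased in expectation while driving most small entries to exactly zero, so their downstream multiplications can be skipped. The quantizer below is a generic illustration with made-up names, not the paper's implementation:

```python
import numpy as np

def dithered_quantize(grad, step=1e-3, rng=np.random.default_rng(0)):
    """Add uniform dither before rounding to a quantization grid.

    Rounding with dither u ~ U[-0.5, 0.5) is unbiased in expectation,
    and small gradient entries round to exactly zero with high
    probability, so their weight-update multiplications can be skipped.
    """
    dither = rng.uniform(-0.5, 0.5, size=grad.shape)
    return step * np.round(grad / step + dither)

# Toy example: most entries of a small gradient vanish after quantization.
grad = np.random.default_rng(1).normal(scale=1e-4, size=(256, 256))
q = dithered_quantize(grad)
print(f"sparsity after dithered quantization: {(q == 0).mean():.2%}")
```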
To address this problem, we propose Entropy-Constrained Trained Ternarization (EC2T), a general framework for creating sparse and ternary neural networks that are efficient in terms of storage (e.g., at most two binary masks and two full-precision values are required to store a weight matrix) and computation (e.g., MAC operations are reduced to a few accumulations plus two multiplications).
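The storage and compute claims can be made concrete with a short sketch. Assuming disjoint positive/negative masks and two per-matrix scalars (the names `mask_pos`, `w_pos`, etc. are illustrative), a matrix-vector product then needs only accumulations plus two scalar multiplications:

```python
import numpy as np

def ternary_matvec(mask_pos, mask_neg, w_pos, w_neg, x):
    """Matrix-vector product with a ternary matrix W = w_pos*M+ + w_neg*M-.

    Only two multiplications by full-precision scalars are needed;
    everything else is accumulation of selected entries of x.
    """
    acc_pos = mask_pos @ x          # sums of x entries where W = w_pos
    acc_neg = mask_neg @ x          # sums of x entries where W = w_neg
    return w_pos * acc_pos + w_neg * acc_neg

# Toy check against the equivalent dense ternary matrix.
rng = np.random.default_rng(0)
mask_pos = rng.random((4, 8)) < 0.2
mask_neg = (rng.random((4, 8)) < 0.2) & ~mask_pos   # masks are disjoint
w_pos, w_neg = 0.37, -0.41
W = w_pos * mask_pos + w_neg * mask_neg
x = rng.normal(size=8)
assert np.allclose(W @ x, ternary_matvec(mask_pos, mask_neg, w_pos, w_neg, x))
```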
The success of convolutional neural networks (CNNs) in various applications is accompanied by a significant increase in computation and parameter storage costs.
1 code implementation • 27 Jul 2019 • Simon Wiedemann, Heiner Kirchhoffer, Stefan Matlage, Paul Haase, Arturo Marban, Talmaj Marinc, David Neumann, Tung Nguyen, Ahmed Osman, Detlev Marpe, Heiko Schwarz, Thomas Wiegand, Wojciech Samek
The field of video compression has developed some of the most sophisticated and efficient compression algorithms in the literature, enabling very high compression ratios with little loss of information.
no code implementations • 15 May 2019 • Simon Wiedemann, Heiner Kirchhoffer, Stefan Matlage, Paul Haase, Arturo Marban, Talmaj Marinc, David Neumann, Ahmed Osman, Detlev Marpe, Heiko Schwarz, Thomas Wiegand, Wojciech Samek
We present DeepCABAC, a novel context-adaptive binary arithmetic coder for compressing deep neural networks.
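DeepCABAC builds on the CABAC coder from video coding; the sketch below does not reproduce its context models, but it illustrates the underlying principle of adaptive binary modeling: maintain an online probability estimate per bin, and accumulate the negative log-likelihood, which is the ideal arithmetic-coding cost. The update rule and all names here are illustrative assumptions, not the paper's design:

```python
import numpy as np

def adaptive_binary_cost(bins, lr=0.05):
    """Ideal coding cost (in bits) of a binary sequence under an
    adaptive estimate of p(bin=1), updated after each symbol.

    A binary arithmetic coder achieves within ~2 bits of this total.
    """
    p, bits = 0.5, 0.0
    for b in bins:
        bits += -np.log2(p if b else 1.0 - p)
        p += lr * (b - p)                 # exponential-decay frequency estimate
        p = min(max(p, 1e-4), 1 - 1e-4)   # keep the estimate away from 0/1
    return bits

# Skewed bins (e.g., significance flags of near-sparse quantized weights)
# cost far less than 1 bit each once the model adapts.
rng = np.random.default_rng(0)
bins = (rng.random(10_000) < 0.05).astype(int)
print(f"{adaptive_binary_cost(bins) / len(bins):.3f} bits/bin")
```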
Federated Learning allows multiple parties to jointly train a deep learning model on their combined data, without any of the participants having to reveal their local data to a centralized server.
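As background, the basic federated-averaging loop can be sketched in a few lines. This toy least-squares version (names illustrative; it is the generic FedAvg pattern, not the protocol studied in the paper) shows that only model updates, never raw data, leave the parties:

```python
import numpy as np

def local_step(w, X, y, lr=0.1, epochs=5):
    """A few local gradient steps on one party's private data."""
    for _ in range(epochs):
        w = w - lr * X.T @ (X @ w - y) / len(y)
    return w

def federated_round(w, parties):
    """Each party trains locally; only model updates reach the server."""
    updates = [local_step(w, X, y) for X, y in parties]
    return np.mean(updates, axis=0)   # server aggregates (FedAvg)

rng = np.random.default_rng(0)
w_true = rng.normal(size=5)
parties = []
for _ in range(4):                    # four parties; data is never pooled
    X = rng.normal(size=(50, 5))
    parties.append((X, X @ w_true + 0.01 * rng.normal(size=50)))

w = np.zeros(5)
for _ in range(20):
    w = federated_round(w, parties)
print("error:", np.linalg.norm(w - w_true))
```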
We propose a general framework for neural network compression that is motivated by the Minimum Description Length (MDL) principle.
These new matrix formats have the novel property that their memory and algorithmic complexity are implicitly bounded by the entropy of the matrix, implying that they are guaranteed to become more efficient as the entropy of the matrix is reduced.
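A toy sketch of this property, under the simplifying assumption of ideal entropy coding of a codebook-indexed matrix (not the exact formats proposed in the paper): storage falls with the entropy of the weight distribution instead of staying at 32 bits per entry:

```python
import numpy as np

def entropy_bits(labels):
    """Empirical entropy of a label array, in bits per entry."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def codebook_format_size(W):
    """Size (bits) of storing W as a codebook plus per-entry code indices.

    With ideal entropy coding of the indices, storage scales with the
    entropy of W's value distribution rather than with 32 bits/entry.
    """
    values, labels = np.unique(W, return_inverse=True)
    return 32 * len(values) + W.size * entropy_bits(labels)

# A quantized, low-entropy matrix is far cheaper than dense float32.
rng = np.random.default_rng(0)
W = rng.choice([0.0, 0.0, 0.0, -0.5, 0.5], size=(256, 256))  # mostly zeros
print(f"dense: {32 * W.size} bits, "
      f"entropy-bounded: {codebook_format_size(W):.0f} bits")
```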
A major issue in distributed training is the limited communication bandwidth between contributing nodes and, more generally, the prohibitive cost of communication.
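A common remedy, sketched below purely for illustration (generic top-k gradient sparsification with local error accumulation, not necessarily the scheme proposed here), is to communicate only the largest-magnitude gradient entries and carry the remainder forward to the next round:

```python
import numpy as np

def sparsify_topk(grad, residual, k):
    """Send only the k largest-magnitude entries; keep the rest locally.

    The unsent remainder ('residual') is added back before the next
    selection, so no gradient information is permanently dropped.
    """
    g = grad + residual
    idx = np.argpartition(np.abs(g), -k)[-k:]     # indices of top-k entries
    sparse = np.zeros_like(g)
    sparse[idx] = g[idx]
    return sparse, g - sparse                     # payload, new residual

grad = np.random.default_rng(0).normal(size=10_000)
payload, residual = sparsify_topk(grad, np.zeros_like(grad), k=100)
print(f"communicated {np.count_nonzero(payload)} of {grad.size} entries")
```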