[Re] Training Binary Neural Networks using the Bayesian Learning Rule

RC 2020  ·  Prateek Garg, Lakshya Singhal, Ashish Sardana ·

Scope of Reproducibility
We verify the performance of our re-implementation of the BayesBiNN optimizer on several classification and regression benchmarks. We also implement the STE optimizer, which is the central baseline in the original paper. Finally, we evaluate BayesBiNN on a continual-learning benchmark to gain further insight.

Methodology
We developed our own code-base for the reproduction, consisting of an end-to-end trainer with a Keras-like interface, which includes implementations of both the BayesBiNN and STE optimizers. We referred to the authors' code, open-sourced on GitHub, to resolve doubts about hyperparameters and other details that emerged during development.
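To make the "Keras-like interface" concrete, the following is a minimal, hypothetical sketch of such a trainer wrapper in PyTorch. The class and method names (Trainer, compile, fit) are illustrative only and are not the actual API of our code-base or of the authors' release; the closure-based step is one common way to accommodate optimizers that re-sample weights during the update.

```python
# Hypothetical sketch of a Keras-like trainer wrapper (illustrative names only).
import torch
import torch.nn as nn

class Trainer:
    """Minimal Keras-style wrapper around a PyTorch model."""

    def __init__(self, model: nn.Module):
        self.model = model

    def compile(self, optimizer, criterion):
        # Store the optimizer (e.g. a BayesBiNN or STE implementation) and loss.
        self.optimizer = optimizer
        self.criterion = criterion

    def fit(self, loader, epochs: int = 1):
        self.model.train()
        for _ in range(epochs):
            for inputs, targets in loader:
                def closure():
                    # Re-evaluate the loss so the optimizer can recompute it
                    # on sampled (relaxed) binary weights if it needs to.
                    self.optimizer.zero_grad()
                    loss = self.criterion(self.model(inputs), targets)
                    loss.backward()
                    return loss

                self.optimizer.step(closure)
```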

Results
We reproduced the accuracy of the BayesBiNN optimizer to within 0.5% of the originally reported values, which upholds the conclusion that it performs nearly as well as its full-precision counterpart on classification tasks. When we applied it to a semantic segmentation task, however, the results were underwhelming, in contrast with the strong results obtained by the STE optimizer, even after extensive hyperparameter tuning. We conclude that, like other Bayesian methods, BayesBiNN is difficult to train on more complex tasks.

What was easy
After working out the mathematics behind the BayesBiNN approach, we developed pseudo-code for the optimization process, which, together with references to the authors' code, helped us greatly in the reproduction study.
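As a rough illustration of that pseudo-code, below is a minimal sketch of a single BayesBiNN-style update as we understand it: binary weights follow a mean-field Bernoulli distribution with natural parameter lamb, the weights are relaxed with the Gumbel-softmax trick, and lamb is updated with the Bayesian learning rule. Variable names and the caller-supplied helper grad_loss are our own assumptions, not the authors' API, and the exact scaling term should be checked against the original paper.

```python
# Sketch of one BayesBiNN-style update (our reading of the paper; not the
# authors' implementation).
import torch

def bayesbinn_step(lamb, grad_loss, n_data, lr=1e-4, temperature=1.0):
    """Update the natural parameter `lamb` of the Bernoulli posterior q(w)."""
    # Gumbel-softmax (concrete) relaxation of the binary weights.
    eps = torch.rand_like(lamb).clamp(1e-10, 1.0 - 1e-10)
    delta = 0.5 * torch.log(eps / (1.0 - eps))
    w_relaxed = torch.tanh((lamb + delta) / temperature)

    # Minibatch gradient of the loss w.r.t. the relaxed weights,
    # rescaled by the dataset size (grad_loss is supplied by the caller).
    g = n_data * grad_loss(w_relaxed)

    # Scale factor arising from the relaxation (assumed form).
    s = (1.0 - w_relaxed ** 2) / (
        temperature * (1.0 - torch.tanh(lamb) ** 2) + 1e-10
    )

    # Bayesian learning rule with a zero-mean prior: shrink lamb, then take a
    # natural-gradient step.
    return (1.0 - lr) * lamb - lr * s * g
```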

What was difficult
Some of the hyperparameters were not reported in the paper, so it was difficult to approximate their values. Limited computational resources were the other major difficulty we faced.

Communication with original authors
We had a very fruitful conversation with the authors, which helped us better understand the BayesBiNN approach and its extension to the segmentation domain. Detailed pointers are given at the end of this report.
