1 code implementation • 24 Oct 2020 • Jiayi Wang, Shiqiang Wang, Rong-Rong Chen, Mingyue Ji
Furthermore, we extend our analytical approach, based on "upward" and "downward" divergences, to study the convergence of H-SGD in the general case of more than two levels, where the "sandwich behavior" still holds.