A simple connection from loss flatness to compressed representations in neural networks

3 Oct 2023 · Shirui Chen, Stefano Recanatesi, Eric Shea-Brown

The generalization capacity of deep neural networks has been studied in a variety of ways, including at least two distinct categories of approach: one based on the shape of the loss landscape in parameter space, and the other based on the structure of the representation manifold in feature space (that is, in the space of unit activities). Although these two approaches are related, they are rarely studied together and explicitly connected. Here, we present a simple analysis that makes such a connection. We show that, in the final phase of learning of deep neural networks, compression of the manifold of neural representations correlates with the flatness of the loss around the minima explored by SGD. We show that this is predicted by a relatively simple mathematical relationship: a flatter loss yields a lower upper bound on the volume of the representation manifold, and hence implies greater compression of neural representations. Our results build closely on the prior work of Ma and Ying, who showed how flatness, characterized by small eigenvalues of the loss Hessian, develops in the late phases of learning and confers robustness to perturbations of network inputs. Moreover, we show that there is no similarly direct connection between local dimensionality and sharpness, suggesting that this property may be controlled by different mechanisms than volume and hence may play a complementary role in neural representations. Overall, we advance a dual perspective on generalization in neural networks in both parameter and feature space.
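To make the quantities in the abstract concrete, here is a minimal sketch in Python/PyTorch, assuming a toy MLP on random data rather than the authors' actual models or code. It estimates sharpness as the top eigenvalue of the loss Hessian via power iteration on Hessian-vector products, and summarizes the hidden-layer representation with two proxies: a log-volume (sum of log eigenvalues of the activation covariance, where lower means more compressed) and a participation ratio as a stand-in for local dimensionality. The setup, metric definitions, and helper names (top_hessian_eigenvalue, representation_metrics) are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (assumed setup, not the authors' code): estimating loss
# sharpness and representation compression for a toy network near a minimum.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a trained network and its data.
X = torch.randn(256, 20)
y = torch.randint(0, 2, (256,))
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
loss_fn = nn.CrossEntropyLoss()


def top_hessian_eigenvalue(model, loss, n_iter=50):
    """Sharpness proxy: top Hessian eigenvalue via power iteration on Hessian-vector products."""
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params, create_graph=True)
    v = [torch.randn_like(p) for p in params]
    eig = 0.0
    for _ in range(n_iter):
        norm = torch.sqrt(sum((vi ** 2).sum() for vi in v))
        v = [vi / norm for vi in v]
        gv = sum((g * vi).sum() for g, vi in zip(grads, v))
        hv = torch.autograd.grad(gv, params, retain_graph=True)  # Hessian-vector product
        eig = sum((hvi * vi).sum() for hvi, vi in zip(hv, v)).item()  # Rayleigh quotient
        v = [hvi.detach() for hvi in hv]
    return eig


def representation_metrics(h):
    """Compression proxies for a batch of hidden activations h (samples x units)."""
    h = h - h.mean(dim=0, keepdim=True)
    cov = h.T @ h / (h.shape[0] - 1)
    evals = torch.linalg.eigvalsh(cov).clamp(min=1e-12)
    log_volume = evals.log().sum().item()  # lower => more compressed manifold
    participation_ratio = (evals.sum() ** 2 / (evals ** 2).sum()).item()  # local dimensionality
    return log_volume, participation_ratio


loss = loss_fn(model(X), y)
sharpness = top_hessian_eigenvalue(model, loss)

with torch.no_grad():
    hidden = model[1](model[0](X))  # activations after the first (hidden) layer
log_vol, dim = representation_metrics(hidden)

print(f"sharpness (top Hessian eigenvalue): {sharpness:.4f}")
print(f"representation log-volume:          {log_vol:.4f}")
print(f"participation ratio:                {dim:.2f}")
```

In the paper's framing, tracking these three numbers over the late phase of training would let one check whether decreasing sharpness coincides with decreasing log-volume (compression), while the participation ratio (local dimensionality) varies more independently.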
