Understanding how the statistical and geometric properties of neural activations relate to network performance is a key problem in theoretical neuroscience and deep learning.
Understanding the asymptotic behavior of gradient-descent training of deep neural networks is essential for revealing inductive biases and improving network performance.
Adversarial examples are often cited by neuroscientists and machine learning researchers as an example of how computational models diverge from biological sensory systems.
In conclusion, divisive normalization enhances image recognition performance, most strongly when combined with canonical normalization. In doing so, it reduces manifold capacity and sparsity in early layers while increasing them in final layers, and it increases low- or mid-wavelength power in the first-layer receptive fields.
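As an illustration only, a divisive normalization stage of the kind referred to here can be sketched in PyTorch as a layer that divides each unit's activation by the pooled activity of its local neighborhood; the pooling window, exponent p, and constant sigma below are assumed values, not the configuration used in the experiments.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DivisiveNorm2d(nn.Module):
    """Illustrative divisive normalization: each unit is divided by pooled
    activity of nearby units across channels and space (sigma, p, and the
    pooling size are assumed values, not those used in the study)."""
    def __init__(self, kernel_size=5, sigma=1.0, p=2.0):
        super().__init__()
        self.kernel_size = kernel_size
        self.sigma = sigma
        self.p = p

    def forward(self, x):
        # Pool |x|^p over a local spatial neighborhood, then average over channels.
        pooled = F.avg_pool2d(x.abs().pow(self.p), self.kernel_size,
                              stride=1, padding=self.kernel_size // 2)
        pooled = pooled.mean(dim=1, keepdim=True)
        return x / (self.sigma + pooled).pow(1.0 / self.p)

# Example: normalize a batch of feature maps from an early convolutional layer.
feats = torch.randn(8, 64, 32, 32)
normed = DivisiveNorm2d()(feats)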
Motivated by a recent study on learning robustness without input perturbations by distilling an adversarially trained (AT) model, we explore what is learned during adversarial training by analyzing the distribution of logits in AT models.
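A minimal sketch of such a logit analysis, assuming PyTorch and placeholder names for the models and data loader, collects summary statistics of the logit distribution (mean top-1 logit, top-1/top-2 gap, overall spread) so that a standard model and an AT model can be compared.

import torch

@torch.no_grad()
def logit_stats(model, loader, device="cpu"):
    """Summary statistics of a model's logits over a dataset:
    mean top-1 logit, mean top1-top2 gap, and overall standard deviation."""
    tops, gaps, all_logits = [], [], []
    model.eval().to(device)
    for x, _ in loader:
        logits = model(x.to(device))
        sorted_logits, _ = logits.sort(dim=1, descending=True)
        tops.append(sorted_logits[:, 0])
        gaps.append(sorted_logits[:, 0] - sorted_logits[:, 1])
        all_logits.append(logits.flatten())
    return {
        "mean_top_logit": torch.cat(tops).mean().item(),
        "mean_top1_top2_gap": torch.cat(gaps).mean().item(),
        "logit_std": torch.cat(all_logits).std().item(),
    }

# Usage (standard_model, at_model, and test_loader are placeholder names):
# print(logit_stats(standard_model, test_loader))
# print(logit_stats(at_model, test_loader))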
In this thesis, we generalize Gardner's analysis and establish a theory of linear classification of manifolds that synthesizes statistical and geometric properties of high-dimensional signals.
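As the point-like baseline from which this generalization starts, Gardner's replica calculation gives the critical capacity of a perceptron storing random points at margin \(\kappa\):
\[
\alpha_c(\kappa) \;=\; \left[\int_{-\kappa}^{\infty} Dt\,(t+\kappa)^{2}\right]^{-1},
\qquad Dt = \frac{dt}{\sqrt{2\pi}}\,e^{-t^{2}/2},
\]
so that \(\alpha_c(0) = 2\), recovering Cover's classical counting result for points in general position.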
Understanding how large neural networks avoid memorizing training data is key to explaining their high generalization performance.
While vector-based language representations from pretrained language models have set a new standard for many NLP tasks, there is not yet a complete accounting of their inner workings.
One approach to addressing this challenge is to use mathematical and computational tools to analyze the geometry of these high-dimensional representations, i.e., neural population geometry.
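One widely used example of such a tool, shown here only as an illustration (the specific measures used in this line of work may differ), is the participation ratio of the covariance eigenspectrum, which summarizes the effective dimensionality of a population response matrix.

import numpy as np

def participation_ratio(responses):
    """Effective dimensionality of a response matrix (n_samples x n_neurons):
    PR = (sum_i lambda_i)^2 / sum_i lambda_i^2, where lambda_i are the
    eigenvalues of the neuron-neuron covariance."""
    centered = responses - responses.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (responses.shape[0] - 1)
    eig = np.linalg.eigvalsh(cov)
    return eig.sum() ** 2 / (eig ** 2).sum()

# Example: 1000 stimuli x 200 neurons of synthetic, correlated activity.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 200)) @ rng.standard_normal((200, 200))
print(participation_ratio(X))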
Importing the notion of representational invariance from computational and cognitive neuroscience, we perform a series of probes designed to test the sensitivity of Transformer representations to several kinds of structure in sentences.
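An illustrative version of such a probe, not the exact procedure used here and assuming the Hugging Face transformers library with bert-base-uncased as a stand-in model, compares a sentence's mean-pooled representation with that of a word-order-shuffled version; a representation invariant to word order would yield a cosine similarity near 1.

import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

@torch.no_grad()
def sentence_rep(text):
    """Mean-pooled final-layer representation of a sentence."""
    inputs = tok(text, return_tensors="pt")
    return model(**inputs).last_hidden_state.mean(dim=1).squeeze(0)

original = "the cat sat on the mat"
shuffled = "mat the on sat cat the"
sim = torch.cosine_similarity(sentence_rep(original), sentence_rep(shuffled), dim=0)
print(f"cosine similarity under word-order perturbation: {sim.item():.3f}")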
In this work, we investigate the latter by juxtaposing experimental results on the covariance spectrum of neural representations in mouse V1 (Stringer et al.) with those of artificial neural networks.
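A minimal sketch of the corresponding measurement on an artificial network, assuming NumPy and an activation matrix of shape (stimuli x units), estimates the covariance eigenspectrum and its log-log slope over an assumed rank range (Stringer et al. report a slope close to -1 in mouse V1).

import numpy as np

def spectrum_powerlaw_slope(acts, fit_range=(10, 500)):
    """Eigenspectrum of the activation covariance plus a log-log linear fit
    of eigenvalue versus rank over an assumed range."""
    acts = acts - acts.mean(axis=0, keepdims=True)
    cov = acts.T @ acts / (acts.shape[0] - 1)
    eig = np.maximum(np.sort(np.linalg.eigvalsh(cov))[::-1], 1e-12)
    lo, hi = fit_range
    ranks = np.arange(lo, min(hi, eig.size)) + 1
    slope, _ = np.polyfit(np.log(ranks), np.log(eig[lo:lo + ranks.size]), 1)
    return eig, slope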
In addition, we find that the emergence of linear separability in these manifolds is driven by a combined reduction of the manifolds' radius, dimensionality, and inter-manifold correlations.
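Simplified, PCA-based proxies for these three quantities (not the mean-field manifold-capacity definitions used in the analysis itself) can be computed as follows, assuming NumPy and a list of per-object activation arrays.

import numpy as np

def manifold_descriptors(manifolds):
    """Rough proxies for manifold radius, dimension, and center correlations.
    `manifolds` is a list of (n_points x n_features) arrays, one per object."""
    centers = np.stack([m.mean(axis=0) for m in manifolds])
    radii, dims = [], []
    for m, c in zip(manifolds, centers):
        delta = m - c
        # Radius: RMS spread around the centroid, relative to the centroid norm.
        radii.append(np.sqrt((delta ** 2).sum(axis=1).mean()) / np.linalg.norm(c))
        # Dimension: participation ratio of the within-manifold covariance.
        eig = np.linalg.eigvalsh(np.cov(delta.T))
        dims.append(eig.sum() ** 2 / (eig ** 2).sum())
    # Inter-manifold correlation: mean cosine similarity between centers.
    normed = centers / np.linalg.norm(centers, axis=1, keepdims=True)
    gram = normed @ normed.T
    corr = gram[~np.eye(len(manifolds), dtype=bool)].mean()
    return np.mean(radii), np.mean(dims), corr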
Higher-level concepts such as parts of speech and context dependence also emerge in the later layers of the network.
The success of deep neural networks in visual tasks has motivated recent theoretical and empirical work to understand how these networks operate.
The effects of label sparsity on the classification capacity of manifolds are elucidated, revealing a scaling relation between label sparsity and manifold radius.
We consider the problem of classifying data manifolds where each manifold represents invariances that are parameterized by continuous degrees of freedom.
Objects are represented in sensory systems by continuous manifolds due to the sensitivity of neuronal responses to changes in physical features such as location, orientation, and intensity.
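As a toy illustration of such a manifold, assuming NumPy and scikit-learn, each object below is a one-dimensional orientation manifold (a circle traced in a random two-dimensional plane around a random center in a high-dimensional ambient space), and a linear SVM is then asked to separate two such manifolds point by point.

import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
N, n_points = 100, 50          # ambient dimension, samples per manifold

def orientation_manifold(rng, N, n_points, radius=0.5):
    """Points tracing a circle (one continuous degree of freedom) embedded
    in a random 2-D plane around a random center in N dimensions."""
    center = rng.standard_normal(N)
    plane, _ = np.linalg.qr(rng.standard_normal((N, 2)))
    theta = np.linspace(0, 2 * np.pi, n_points, endpoint=False)
    return center + radius * np.column_stack([np.cos(theta), np.sin(theta)]) @ plane.T

X = np.vstack([orientation_manifold(rng, N, n_points) for _ in range(2)])
y = np.repeat([0, 1], n_points)
clf = LinearSVC().fit(X, y)
print("training accuracy:", clf.score(X, y))   # 1.0 if the two manifolds are linearly separable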