Search Results for author: Surya Ganguli

Found 56 papers, 27 papers with code

Explaining heterogeneity in medial entorhinal cortex with task-driven neural networks

no code implementations NeurIPS 2021 Aran Nayebi, Alexander Attinger, Malcolm Campbell, Kiah Hardcastle, Isabel Low, Caitlin Mallory, Gabriel Mel, Ben Sorscher, Alex Williams, Surya Ganguli, Lisa Giocomo, Dan Yamins

Medial entorhinal cortex (MEC) supports a wide range of navigational and memory-related behaviors. Well-known experimental results have revealed specialized cell types in MEC --- e.g., grid, border, and head-direction cells --- whose highly stereotypical response profiles are suggestive of the role they might play in supporting MEC functionality.

Synaptic balancing: a biologically plausible local learning rule that provably increases neural network noise robustness without sacrificing task performance

no code implementations 18 Jul 2021 Christopher H. Stock, Sarah E. Harvey, Samuel A. Ocko, Surya Ganguli

We introduce a novel, biologically plausible local learning rule that provably increases the robustness of neural dynamics to noise in nonlinear recurrent neural networks with homogeneous nonlinearities.

Deep Learning on a Data Diet: Finding Important Examples Early in Training

1 code implementation NeurIPS 2021 Mansheej Paul, Surya Ganguli, Gintare Karolina Dziugaite

In this work, we make the striking observation that, on standard vision benchmarks, the initial loss gradient norm of individual training examples, averaged over several weight initializations, can be used to identify a smaller set of training data that is important for generalization.
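The quantity this snippet describes (the paper's "GraNd" score) can be sketched in a few lines: rank each training example by the norm of its loss gradient early in training, averaged over several random initializations, and keep only the top-scoring subset. The tiny logistic-regression model, synthetic data, and keep fraction below are illustrative stand-ins, not the paper's benchmarks or architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))            # toy inputs
y = (X[:, 0] > 0).astype(float)           # toy binary labels

def per_example_grad_norms(w):
    """Per-example gradient norms of the logistic loss w.r.t. weights w."""
    p = 1.0 / (1.0 + np.exp(-X @ w))      # sigmoid predictions
    residual = (p - y)[:, None]           # dL/dlogit for each example
    grads = residual * X                  # one gradient row per example
    return np.linalg.norm(grads, axis=1)

# Average the score over several random initializations.
n_inits = 5
scores = np.mean(
    [per_example_grad_norms(rng.normal(size=10)) for _ in range(n_inits)],
    axis=0,
)

# Keep only the highest-scoring fraction of the data for training.
keep_frac = 0.5
keep_idx = np.argsort(-scores)[: int(keep_frac * len(X))]
print(len(keep_idx))  # 100 examples retained
```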

2D Human Pose Estimation

How many degrees of freedom do we need to train deep networks: a loss landscape perspective

1 code implementation 13 Jul 2021 Brett W. Larsen, Stanislav Fort, Nic Becker, Surya Ganguli

A variety of recent works, spanning pruning, lottery tickets, and training within random subspaces, have shown that deep neural networks can be trained using far fewer degrees of freedom than the total number of parameters.

Understanding self-supervised Learning Dynamics without Contrastive Pairs

3 code implementations 12 Feb 2021 Yuandong Tian, Xinlei Chen, Surya Ganguli

While contrastive approaches to self-supervised learning (SSL) learn representations by minimizing the distance between two augmented views of the same data point (positive pairs) and maximizing the distance between views from different data points (negative pairs), recent non-contrastive SSL methods (e.g., BYOL and SimSiam) show remarkable performance without negative pairs, using an extra learnable predictor and a stop-gradient operation.

Self-Supervised Learning

Embodied Intelligence via Learning and Evolution

no code implementations 3 Feb 2021 Agrim Gupta, Silvio Savarese, Surya Ganguli, Li Fei-Fei

However, the principles governing the relations between environmental complexity, evolved morphology, and the learnability of intelligent control remain elusive, partially due to the substantial challenge of performing large-scale in silico experiments on evolution and learning.

Slice, Dice, and Optimize: Measuring the Dimension of Neural Network Class Manifolds

no code implementations 1 Jan 2021 Stanislav Fort, Ekin Dogus Cubuk, Surya Ganguli, Samuel Stern Schoenholz

Deep neural network classifiers naturally partition input space into regions belonging to different classes.

Symmetry, Conservation Laws, and Learning Dynamics in Neural Networks

no code implementations ICLR 2021 Daniel Kunin, Javier Sagastuy-Brena, Surya Ganguli, Daniel LK Yamins, Hidenori Tanaka

Overall, by exploiting symmetry, our work demonstrates that we can analytically describe the learning dynamics of various parameter combinations at finite learning rates and batch sizes for state-of-the-art architectures trained on any dataset.

Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics

1 code implementation 8 Dec 2020 Daniel Kunin, Javier Sagastuy-Brena, Surya Ganguli, Daniel L. K. Yamins, Hidenori Tanaka

Overall, by exploiting symmetry, our work demonstrates that we can analytically describe the learning dynamics of various parameter combinations at finite learning rates and batch sizes for state-of-the-art architectures trained on any dataset.

Identifying Learning Rules From Neural Network Observables

2 code implementations NeurIPS 2020 Aran Nayebi, Sanjana Srivastava, Surya Ganguli, Daniel L. K. Yamins

We show that different classes of learning rules can be separated solely on the basis of aggregate statistics of the weights, activations, or instantaneous layer-wise activity changes, and that these results generalize to limited access to the trajectory and held-out architectures and learning curricula.

Understanding Self-supervised Learning with Dual Deep Networks

2 code implementations 1 Oct 2020 Yuandong Tian, Lantao Yu, Xinlei Chen, Surya Ganguli

We propose a novel theoretical framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks (e.g., SimCLR).

Self-Supervised Learning

Predictive coding in balanced neural networks with noise, chaos and delays

no code implementations NeurIPS 2020 Jonathan Kadmon, Jonathan Timcheck, Surya Ganguli

However, the theoretical principles governing the efficacy of balanced predictive coding and its robustness to noise, synaptic weight heterogeneity and communication delays remain poorly understood.

Pruning neural networks without any data by iteratively conserving synaptic flow

2 code implementations NeurIPS 2020 Hidenori Tanaka, Daniel Kunin, Daniel L. K. Yamins, Surya Ganguli

Pruning the parameters of deep neural networks has generated intense interest due to potential savings in time, memory and energy both during training and at test time.

Two Routes to Scalable Credit Assignment without Weight Symmetry

1 code implementation ICML 2020 Daniel Kunin, Aran Nayebi, Javier Sagastuy-Brena, Surya Ganguli, Jonathan M. Bloom, Daniel L. K. Yamins

The neural plausibility of backpropagation has long been disputed, primarily for its use of non-local weight transport: the biologically dubious requirement that one neuron instantaneously measure the synaptic weights of another.
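One widely studied route around weight transport, examined in this line of work, is feedback alignment: the backward pass uses a fixed random matrix in place of the transpose of the forward weights. The toy two-layer regression below is an illustrative sketch of that idea, with made-up data and hyperparameters, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data
X = rng.normal(size=(256, 8))
Y = np.tanh(X @ rng.normal(size=(8, 2)))

# Two-layer network; B replaces W2.T in the backward pass
W1 = rng.normal(scale=0.3, size=(8, 16))
W2 = rng.normal(scale=0.3, size=(16, 2))
B = rng.normal(scale=0.3, size=(2, 16))   # fixed random feedback matrix

lr = 0.05
losses = []
for step in range(200):
    h = np.tanh(X @ W1)                   # forward pass
    out = h @ W2
    err = out - Y                         # dL/dout for 0.5 * MSE
    losses.append(0.5 * np.mean(err ** 2))
    # Backward pass: use fixed B instead of W2.T (no weight transport)
    dh = (err @ B) * (1 - h ** 2)
    W2 -= lr * h.T @ err / len(X)
    W1 -= lr * X.T @ dh / len(X)

print(losses[0] > losses[-1])  # True: the loss decreases without transport
```

Note that the readout update for W2 needs no transport at all; only the hidden-layer update relies on the random feedback matrix aligning with the forward weights over training.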

From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction

1 code implementation NeurIPS 2019 Hidenori Tanaka, Aran Nayebi, Niru Maheswaranathan, Lane McIntosh, Stephen A. Baccus, Surya Ganguli

Thus, overall, this work not only yields insights into the computational mechanisms underlying the striking predictive capabilities of the retina, but also places the framework of deep networks as neuroscientific models on firmer theoretical foundations, by providing a new roadmap to go beyond comparing neural representations to extracting and understanding computational mechanisms.

Dimensionality Reduction

A unified theory for the origin of grid cells through the lens of pattern formation

1 code implementation NeurIPS 2019 Ben Sorscher, Gabriel Mel, Surya Ganguli, Samuel Ocko

This theory provides insight into the optimal solutions of diverse formulations of the normative task, and shows that symmetries in the representation of space correctly predict the structure of learned firing fields in trained neural networks.

Emergent properties of the local geometry of neural loss landscapes

no code implementations 14 Oct 2019 Stanislav Fort, Surya Ganguli

The local geometry of high dimensional neural network loss landscapes can both challenge our cherished theoretical intuitions as well as dramatically impact the practical success of neural network training.

Revealing computational mechanisms of retinal prediction via model reduction

no code implementations NeurIPS Workshop Neuro_AI 2019 Hidenori Tanaka, Aran Nayebi, Niru Maheswaranathan, Lane McIntosh, Stephen A. Baccus, Surya Ganguli

Thus, overall, this work not only yields insights into the computational mechanisms underlying the striking predictive capabilities of the retina, but also places the framework of deep networks as neuroscientific models on firmer theoretical foundations, by providing a new roadmap to go beyond comparing neural representations to extracting and understanding computational mechanisms.

Universality and individuality in neural dynamics across large populations of recurrent networks

1 code implementation NeurIPS 2019 Niru Maheswaranathan, Alex H. Williams, Matthew D. Golub, Surya Ganguli, David Sussillo

To address these foundational questions, we study populations of thousands of networks, with commonly used RNN architectures, trained to solve neuroscientifically motivated tasks and characterize their nonlinear dynamics.

Fast Convolutive Nonnegative Matrix Factorization Through Coordinate and Block Coordinate Updates

no code implementations 29 Jun 2019 Anthony Degleris, Ben Antin, Surya Ganguli, Alex H. Williams

Identifying recurring patterns in high-dimensional time series data is an important problem in many scientific domains.

Time Series

Reverse engineering recurrent networks for sentiment classification reveals line attractor dynamics

no code implementations NeurIPS 2019 Niru Maheswaranathan, Alex Williams, Matthew D. Golub, Surya Ganguli, David Sussillo

In this work, we use tools from dynamical systems analysis to reverse engineer recurrent networks trained to perform sentiment classification, a foundational natural language processing task.

Classification · General Classification +1

The emergence of multiple retinal cell types through efficient coding of natural movies

no code implementations NeurIPS 2018 Samuel Ocko, Jack Lindsey, Surya Ganguli, Stephane Deny

Also, we train a nonlinear encoding model with a rectifying nonlinearity to efficiently encode naturalistic movies, and again find emergent receptive fields resembling those of midget and parasol cells that are now further subdivided into ON and OFF types.

Statistical mechanics of low-rank tensor decomposition

1 code implementation NeurIPS 2018 Jonathan Kadmon, Surya Ganguli

Often, large, high-dimensional datasets collected across multiple modalities can be organized as a higher-order tensor.

Tensor Decomposition

A mathematical theory of semantic development in deep neural networks

1 code implementation 23 Oct 2018 Andrew M. Saxe, James L. McClelland, Surya Ganguli

An extensive body of empirical research has revealed remarkable regularities in the acquisition, organization, deployment, and neural representation of human semantic knowledge, thereby raising a fundamental conceptual question: what are the theoretical principles governing the ability of neural networks to acquire, organize, and deploy abstract knowledge by integrating across many individual experiences?

Semantic Similarity · Semantic Textual Similarity

An analytic theory of generalization dynamics and transfer learning in deep linear networks

no code implementations ICLR 2019 Andrew K. Lampinen, Surya Ganguli

However we lack analytic theories that can quantitatively predict how the degree of knowledge transfer depends on the relationship between the tasks.

Multi-Task Learning

The Emergence of Spectral Universality in Deep Networks

1 code implementation 27 Feb 2018 Jeffrey Pennington, Samuel S. Schoenholz, Surya Ganguli

Recent work has shown that tight concentration of the entire spectrum of singular values of a deep network's input-output Jacobian around one at initialization can speed up learning by orders of magnitude.

Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice

no code implementations NeurIPS 2017 Jeffrey Pennington, Samuel S. Schoenholz, Surya Ganguli

It is well known that the initialization of weights in deep neural networks can have a dramatic impact on learning speed.

Variational Walkback: Learning a Transition Operator as a Stochastic Recurrent Net

1 code implementation NeurIPS 2017 Anirudh Goyal, Nan Rosemary Ke, Surya Ganguli, Yoshua Bengio

The energy function is then modified so the model and data distributions match, with no guarantee on the number of steps required for the Markov chain to converge.

SuperSpike: Supervised learning in multi-layer spiking neural networks

no code implementations 31 May 2017 Friedemann Zenke, Surya Ganguli

In summary, our results open the door to obtaining a better scientific understanding of learning and computation in spiking neural networks by advancing our ability to train them to solve nonlinear problems involving transformations between different spatiotemporal spike-time patterns.

Biologically inspired protection of deep networks from adversarial attacks

no code implementations 27 Mar 2017 Aran Nayebi, Surya Ganguli

Inspired by biophysical principles underlying nonlinear dendritic computation in neural circuits, we develop a scheme to train deep neural networks to make them robust to adversarial attacks.

Adversarial Attack

Continual Learning Through Synaptic Intelligence

3 code implementations ICML 2017 Friedemann Zenke, Ben Poole, Surya Ganguli

While deep learning has led to remarkable advances across diverse applications, it struggles in domains where the data distribution changes over the course of learning.

Continual Learning · General Classification

Deep Learning Models of the Retinal Response to Natural Scenes

no code implementations NeurIPS 2016 Lane T. McIntosh, Niru Maheswaranathan, Aran Nayebi, Surya Ganguli, Stephen A. Baccus

Here we demonstrate that deep convolutional neural networks (CNNs) capture retinal responses to natural scenes nearly to within the variability of a cell's response, and are markedly more accurate than linear-nonlinear (LN) models and Generalized Linear Models (GLMs).
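The linear-nonlinear (LN) baseline mentioned in this snippet is simple enough to sketch: a single linear filter applied to the stimulus followed by a static nonlinearity that maps filter output to a firing rate. The exponential temporal filter and softplus nonlinearity below are illustrative choices, not the fitted models from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
stimulus = rng.normal(size=(1000, 50))      # time points x stimulus dims

# Linear stage: a toy temporal filter (unit norm)
filt = np.exp(-np.arange(50) / 10.0)
filt /= np.linalg.norm(filt)
drive = stimulus @ filt                     # filter output per time point

# Nonlinear stage: softplus keeps firing rates nonnegative
rate = np.log1p(np.exp(drive))

print(rate.shape, (rate >= 0).all())        # (1000,) True
```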

Survey of Expressivity in Deep Neural Networks

no code implementations 24 Nov 2016 Maithra Raghu, Ben Poole, Jon Kleinberg, Surya Ganguli, Jascha Sohl-Dickstein

This quantity grows exponentially in the depth of the network, and is responsible for the depth sensitivity observed.

Deep Information Propagation

1 code implementation 4 Nov 2016 Samuel S. Schoenholz, Justin Gilmer, Surya Ganguli, Jascha Sohl-Dickstein

We show the existence of depth scales that naturally limit the maximum depth of signal propagation through these random networks.

An equivalence between high dimensional Bayes optimal inference and M-estimation

no code implementations NeurIPS 2016 Madhu Advani, Surya Ganguli

In this work we demonstrate, when the signal distribution and the likelihood function associated with the noise are both log-concave, that optimal MMSE performance is asymptotically achievable via another M-estimation procedure.

Random projections of random manifolds

no code implementations 14 Jul 2016 Subhaneil Lahiri, Peiran Gao, Surya Ganguli

Moreover, unlike previous work, we test our theoretical bounds against numerical experiments on the actual geometric distortions that typically occur for random projections of random smooth manifolds.

Dimensionality Reduction

On the Expressive Power of Deep Neural Networks

no code implementations ICML 2017 Maithra Raghu, Ben Poole, Jon Kleinberg, Surya Ganguli, Jascha Sohl-Dickstein

We propose a new approach to the problem of neural network expressivity, which seeks to characterize how structural properties of a neural network family affect the functions it is able to compute.

Exponential expressivity in deep neural networks through transient chaos

1 code implementation NeurIPS 2016 Ben Poole, Subhaneil Lahiri, Maithra Raghu, Jascha Sohl-Dickstein, Surya Ganguli

We combine Riemannian geometry with the mean field theory of high dimensional chaos to study the nature of signal propagation in generic, deep neural networks with random weights.
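The mean-field analysis referred to here tracks, among other things, how the squared length of activations propagates through a deep random network. A minimal Monte Carlo sketch of that length map for tanh networks follows; the sigma_w and sigma_b values are arbitrary illustrative choices.

```python
import numpy as np

z = np.random.default_rng(0).normal(size=200_000)  # standard normal samples

def length_map(q, sigma_w=1.5, sigma_b=0.1):
    # q_{l+1} = sigma_w^2 * E[tanh(sqrt(q_l) z)^2] + sigma_b^2
    return sigma_w**2 * np.mean(np.tanh(np.sqrt(q) * z) ** 2) + sigma_b**2

# Iterating the map from very different starting lengths converges to the
# same fixed point q*, illustrating how depth forgets the input scale.
q_big, q_small = 5.0, 0.1
for _ in range(50):
    q_big, q_small = length_map(q_big), length_map(q_small)

print(abs(q_big - q_small) < 1e-3)  # True: both reach the same q*
```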

A universal tradeoff between power, precision and speed in physical communication

no code implementations 24 Mar 2016 Subhaneil Lahiri, Jascha Sohl-Dickstein, Surya Ganguli

Maximizing the speed and precision of communication while minimizing power dissipation is a fundamental engineering design goal.

Statistical Mechanics of High-Dimensional Inference

no code implementations 18 Jan 2016 Madhu Advani, Surya Ganguli

Our analysis uncovers fundamental limits on the accuracy of inference in high dimensions, and reveals that widely cherished inference algorithms like maximum likelihood (ML) and maximum-a posteriori (MAP) inference cannot achieve these limits.

Bayesian Inference

Deep Knowledge Tracing

6 code implementations NeurIPS 2015 Chris Piech, Jonathan Spencer, Jonathan Huang, Surya Ganguli, Mehran Sahami, Leonidas Guibas, Jascha Sohl-Dickstein

Knowledge tracing---where a machine models the knowledge of a student as they interact with coursework---is a well-established problem in computer-supported education.

Knowledge Tracing

Deep Unsupervised Learning using Nonequilibrium Thermodynamics

3 code implementations 12 Mar 2015 Jascha Sohl-Dickstein, Eric A. Weiss, Niru Maheswaranathan, Surya Ganguli

A central problem in machine learning involves modeling complex data-sets using highly flexible families of probability distributions in which learning, sampling, inference, and evaluation are still analytically or computationally tractable.
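The framework this paper introduces builds a flexible yet tractable model by slowly destroying structure in the data with a forward diffusion process and learning to reverse it. The forward (noising) half can be sketched in a few lines; the linear beta schedule and one-dimensional toy data below are illustrative choices, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
x0 = rng.normal(loc=3.0, scale=0.5, size=1000)   # toy 1-D "data"

T = 1000
betas = np.linspace(1e-4, 0.02, T)               # noise schedule

# Forward diffusion: repeatedly blend the samples with Gaussian noise,
# x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps
x = x0.copy()
for beta in betas:
    eps = rng.normal(size=x.shape)
    x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * eps

# After many steps the samples forget the data distribution and look
# approximately standard normal (mean near 0, std near 1).
print(abs(x.mean()) < 0.2, abs(x.std() - 1.0) < 0.1)  # True True
```

Learning then amounts to training a model of the reverse process, which the snippet's "tractable" requirement refers to.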

Identifying and attacking the saddle point problem in high-dimensional non-convex optimization

3 code implementations NeurIPS 2014 Yann Dauphin, Razvan Pascanu, Caglar Gulcehre, Kyunghyun Cho, Surya Ganguli, Yoshua Bengio

Gradient descent or quasi-Newton methods are almost ubiquitously used to perform such minimizations, and it is often thought that a main source of difficulty for these local methods in finding the global minimum is the proliferation of local minima with much higher error than the global minimum.
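The attack on saddle points proposed in this line of work (saddle-free Newton) preconditions the gradient by the inverse absolute Hessian eigenvalues, so negative-curvature directions are descended rather than attracted to. The 2-D quadratic below is a textbook illustration of the idea, not the paper's experimental setup.

```python
import numpy as np

def f(p):
    x, y = p
    return x**2 - y**2            # classic saddle at the origin

def grad(p):
    x, y = p
    return np.array([2 * x, -2 * y])

def hess(p):
    return np.array([[2.0, 0.0], [0.0, -2.0]])

def saddle_free_step(p):
    g = grad(p)
    lam, V = np.linalg.eigh(hess(p))
    # Precondition by |H|^{-1}, computed in the eigenbasis
    return p - V @ ((V.T @ g) / np.abs(lam))

p = np.array([1.0, 0.1])
p_newton = p - np.linalg.solve(hess(p), grad(p))   # plain Newton step
p_sfn = saddle_free_step(p)

print(p_newton)   # plain Newton jumps straight to the saddle at the origin
print(p_sfn)      # saddle-free Newton moves away from it, decreasing f
```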

Analyzing noise in autoencoders and deep networks

no code implementations 6 Jun 2014 Ben Poole, Jascha Sohl-Dickstein, Surya Ganguli

Autoencoders have emerged as a useful framework for unsupervised learning of internal representations, and a wide variety of apparently conceptually disparate regularization techniques have been proposed to generate useful features.

Denoising

On the saddle point problem for non-convex optimization

no code implementations 19 May 2014 Razvan Pascanu, Yann N. Dauphin, Surya Ganguli, Yoshua Bengio

Gradient descent or quasi-Newton methods are almost ubiquitously used to perform such minimizations, and it is often thought that a main source of difficulty for these local methods in finding the global minimum is the proliferation of local minima with much higher error than the global minimum.

Exact solutions to the nonlinear dynamics of learning in deep linear neural networks

3 code implementations 20 Dec 2013 Andrew M. Saxe, James L. McClelland, Surya Ganguli

We further exhibit a new class of random orthogonal initial conditions on weights that, like unsupervised pre-training, enjoys depth independent learning times.

Unsupervised Pre-training

A memory frontier for complex synapses

no code implementations NeurIPS 2013 Subhaneil Lahiri, Surya Ganguli

An incredible gulf separates theoretical models of synapses, often described solely by a single scalar value denoting the size of a postsynaptic potential, from the immense complexity of molecular signaling pathways underlying real synapses.

Fast large-scale optimization by unifying stochastic gradient and quasi-Newton methods

1 code implementation 9 Nov 2013 Jascha Sohl-Dickstein, Ben Poole, Surya Ganguli

This algorithm contrasts with earlier stochastic second order techniques that treat the Hessian of each contributing function as a noisy approximation to the full Hessian, rather than as a target for direct estimation.

Short-term memory in neuronal networks through dynamical compressed sensing

no code implementations NeurIPS 2010 Surya Ganguli, Haim Sompolinsky

Prior work, in the case of Gaussian input sequences and linear neuronal networks, shows that the duration of memory traces in a network cannot exceed the number of neurons (in units of the neuronal time constant), and that no network can outperform an equivalent feedforward network.
