6 code implementations • 12 Mar 2015 • Jascha Sohl-Dickstein, Eric A. Weiss, Niru Maheswaranathan, Surya Ganguli
A central problem in machine learning involves modeling complex datasets using highly flexible families of probability distributions in which learning, sampling, inference, and evaluation remain analytically or computationally tractable.
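This is the paper that introduced diffusion probabilistic models: structure in the data is destroyed gradually through a forward Gaussian diffusion, and a generative model is trained to reverse that process. Below is a minimal NumPy sketch of the forward process only, assuming a linear noise schedule (the schedule and step count are illustrative, not the paper's settings):

import numpy as np

def forward_diffusion(x0, n_steps=1000, beta_min=1e-4, beta_max=0.02, seed=0):
    """Gradually destroy structure in x0 with a Gaussian Markov chain."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(beta_min, beta_max, n_steps)  # assumed linear schedule
    x, trajectory = x0.copy(), [x0.copy()]
    for beta in betas:
        # q(x_t | x_{t-1}) = N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I)
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * rng.standard_normal(x.shape)
        trajectory.append(x)
    return trajectory  # the endpoint is (near) isotropic Gaussian noise

# The generative model is then trained to invert each small Gaussian step.
noised = forward_diffusion(np.random.default_rng(1).standard_normal((16, 2)))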
no code implementations • NeurIPS 2016 • Lane T. McIntosh, Niru Maheswaranathan, Aran Nayebi, Surya Ganguli, Stephen A. Baccus
Here we demonstrate that deep convolutional neural networks (CNNs) capture retinal responses to natural scenes nearly to within the variability of a cell's response, and are markedly more accurate than linear-nonlinear (LN) models and Generalized Linear Models (GLMs).
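For context on the baseline being outperformed: an LN model passes the stimulus through a single linear filter and a pointwise nonlinearity, whereas the CNNs here stack several such stages. A minimal sketch of an LN model, assuming a purely spatial filter and a softplus nonlinearity (both simplifications):

import numpy as np

def ln_model(stimulus, spatial_filter, bias=0.0):
    """Linear-nonlinear (LN) model of a retinal ganglion cell.

    stimulus: (time, height, width) movie; spatial_filter: (height, width).
    Returns a nonnegative predicted firing rate at each time step.
    """
    # Linear stage: project each frame onto the receptive field.
    drive = np.tensordot(stimulus, spatial_filter, axes=([1, 2], [0, 1])) + bias
    # Nonlinear stage: pointwise softplus keeps rates nonnegative.
    return np.log1p(np.exp(drive))

rng = np.random.default_rng(0)
movie = rng.standard_normal((100, 8, 8))  # hypothetical white-noise stimulus
rf = rng.standard_normal((8, 8)) / 8.0    # hypothetical receptive field
rates = ln_model(movie, rf)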
1 code implementation • ICML 2017 • Olga Wichrowska, Niru Maheswaranathan, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Nando de Freitas, Jascha Sohl-Dickstein
Two of the primary barriers to the adoption of learning to learn are an inability to scale to larger problems and a limited ability to generalize to new tasks.
no code implementations • 28 Nov 2017 • Lane McIntosh, Niru Maheswaranathan, David Sussillo, Jonathon Shlens
Importantly, the RNN may be deployed across a range of computational budgets by merely running the model for a variable number of iterations.
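The anytime behavior described above comes from iterating a recurrent cell and reading out a prediction after however many steps the budget allows. A minimal sketch with a vanilla tanh cell and a linear readout standing in for the paper's architecture:

import numpy as np

def rnn_refine(features, W_h, W_x, W_out, n_iters):
    """Refine a prediction by iterating a recurrent cell n_iters times."""
    h = np.zeros(W_h.shape[0])
    for _ in range(n_iters):
        h = np.tanh(W_h @ h + W_x @ features)  # one refinement step
    return W_out @ h  # the readout is valid after any number of iterations

rng = np.random.default_rng(0)
d, hidden, n_out = 16, 32, 10
W_h = rng.standard_normal((hidden, hidden)) * 0.1
W_x = rng.standard_normal((hidden, d)) * 0.1
W_out = rng.standard_normal((n_out, hidden)) * 0.1
x = rng.standard_normal(d)
cheap = rnn_refine(x, W_h, W_x, W_out, n_iters=2)    # small compute budget
better = rnn_refine(x, W_h, W_x, W_out, n_iters=16)  # larger compute budget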
2 code implementations • ICLR 2019 • Luke Metz, Niru Maheswaranathan, Brian Cheung, Jascha Sohl-Dickstein
Specifically, we target semi-supervised classification performance, and we meta-learn an algorithm -- an unsupervised weight update rule -- that produces representations useful for this task.
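The structure is a two-level loop: an inner loop applies a parameterized, label-free weight update rule to a base network, and an outer loop adjusts the rule so the resulting representation serves the downstream task. In the sketch below, the Hebbian-style rule, the least-squares meta-objective, and the finite-difference meta-gradient are all hypothetical simplifications:

import numpy as np

def inner_update(W, x, theta):
    # Hypothetical unsupervised rule: a Hebbian term plus weight decay,
    # with theta = (learning rate, decay) as the meta-learned parameters.
    h = np.tanh(W @ x)
    return W + theta[0] * np.outer(h, x) - theta[1] * W

def meta_objective(theta, data, labels, W0):
    W = W0.copy()
    for x in data:                    # inner loop: labels never used
        W = inner_update(W, x, theta)
    feats = np.tanh(data @ W.T)
    # Outer objective: how well a linear readout of the features fits labels.
    coef = np.linalg.lstsq(feats, labels, rcond=None)[0]
    return np.mean((feats @ coef - labels) ** 2)

rng = np.random.default_rng(0)
data = rng.standard_normal((64, 8))
labels = np.sign(data[:, 0])
W0 = rng.standard_normal((4, 8)) * 0.1
theta = np.array([0.01, 0.001])
eps = 1e-4
for _ in range(50):                   # outer loop: meta-train the rule itself
    g = np.array([(meta_objective(theta + eps * e, data, labels, W0)
                   - meta_objective(theta - eps * e, data, labels, W0)) / (2 * eps)
                  for e in np.eye(2)])
    theta -= 0.01 * g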
1 code implementation • ICLR 2019 • Niru Maheswaranathan, Luke Metz, George Tucker, Dami Choi, Jascha Sohl-Dickstein
We propose Guided Evolutionary Strategies, a method for optimally using surrogate gradient directions along with random search.
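The method draws ES perturbations from a Gaussian whose covariance is elongated along the subspace spanned by the surrogate gradients, then forms an antithetic finite-difference gradient estimate. A compact sketch (the quadratic test function, alpha, and sigma are arbitrary choices):

import numpy as np

def guided_es_grad(f, x, U, alpha=0.5, sigma=0.1, n_pairs=8, seed=0):
    """Antithetic ES gradient estimate guided by the subspace U (n x k).

    Perturbations mix the full space (weight alpha) with the subspace
    spanned by surrogate gradients (weight 1 - alpha).
    """
    rng = np.random.default_rng(seed)
    n, k = U.shape
    g = np.zeros(n)
    for _ in range(n_pairs):
        eps = sigma * (np.sqrt(alpha / n) * rng.standard_normal(n)
                       + np.sqrt((1 - alpha) / k) * U @ rng.standard_normal(k))
        g += (f(x + eps) - f(x - eps)) * eps
    return g / (2 * sigma**2 * n_pairs)

# Toy usage: guide the search with a biased surrogate gradient of a quadratic.
f = lambda v: 0.5 * np.sum(v**2)
x = np.ones(20)
surrogate = x + 0.3 * np.random.default_rng(1).standard_normal(20)  # biased grad
U, _ = np.linalg.qr(surrogate[:, None])  # orthonormal basis, here k = 1
x = x - 0.5 * guided_es_grad(f, x, U)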
1 code implementation • 24 Oct 2018 • Luke Metz, Niru Maheswaranathan, Jeremy Nixon, C. Daniel Freeman, Jascha Sohl-Dickstein
Deep learning has shown that learned functions can dramatically outperform hand-designed functions on perceptual tasks.
no code implementations • ICLR 2019 • Luke Metz, Niru Maheswaranathan, Brian Cheung, Jascha Sohl-Dickstein
Here, our desired task (meta-objective) is the performance of the representation on semi-supervised classification, and we meta-learn an algorithm -- an unsupervised weight update rule -- that produces representations that perform well under this meta-objective.
no code implementations • ICLR 2019 • Niru Maheswaranathan, Luke Metz, George Tucker, Dami Choi, Jascha Sohl-Dickstein
This arises when an approximate gradient is easier to compute than the full gradient (e.g. in meta-learning or unrolled optimization), or when a true gradient is intractable and is replaced with a surrogate (e.g. in certain reinforcement learning applications or training networks with discrete variables).
no code implementations • ICLR 2019 • Luke Metz, Niru Maheswaranathan, Jeremy Nixon, Daniel Freeman, Jascha Sohl-Dickstein
We demonstrate these results on problems where our learned optimizer trains convolutional networks in a fifth of the wall-clock time compared to tuned first-order methods, and with an improvement in test loss.
no code implementations • ICML Workshop Deep Phenomena 2019 • Niru Maheswaranathan, Alex H. Williams, Matthew D. Golub, Surya Ganguli, David Sussillo
Recurrent neural networks (RNNs) are a powerful tool for modeling sequential data.
no code implementations • 8 Jun 2019 • Luke Metz, Niru Maheswaranathan, Jonathon Shlens, Jascha Sohl-Dickstein, Ekin D. Cubuk
State-of-the-art vision models can achieve superhuman performance on image classification tasks when testing and training data come from the same distribution.
no code implementations • NeurIPS 2019 • Niru Maheswaranathan, Alex Williams, Matthew D. Golub, Surya Ganguli, David Sussillo
In this work, we use tools from dynamical systems analysis to reverse engineer recurrent networks trained to perform sentiment classification, a foundational natural language processing task.
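Reverse engineering here means locating approximate fixed points of the RNN's state-transition map and linearizing the dynamics around them; for sentiment RNNs, the paper finds the fixed points organize into an approximate line attractor. A minimal sketch of fixed-point finding by gradient descent on the state speed, using a vanilla tanh RNN as a stand-in:

import numpy as np

def rnn_step(h, W_h, W_x, x):
    return np.tanh(W_h @ h + W_x @ x)

def find_fixed_point(h0, W_h, W_x, x, lr=0.1, n_steps=2000):
    """Minimize the speed q(h) = 0.5 * ||F(h, x) - h||^2 over the state h."""
    h = h0.copy()
    for _ in range(n_steps):
        F = rnn_step(h, W_h, W_x, x)
        J = (1.0 - F**2)[:, None] * W_h             # Jacobian of tanh RNN at h
        h -= lr * (J - np.eye(len(h))).T @ (F - h)  # gradient of q
    return h

rng = np.random.default_rng(0)
n = 16
W_h = rng.standard_normal((n, n)) * 0.5 / np.sqrt(n)
W_x = rng.standard_normal((n, 4)) * 0.1
x = np.zeros(4)                                     # probe autonomous dynamics
h_star = find_fixed_point(rng.standard_normal(n), W_h, W_x, x)
# Linearize at the fixed point: eigenvalues characterize the local dynamics.
F = rnn_step(h_star, W_h, W_x, x)
eigvals = np.linalg.eigvals((1.0 - F**2)[:, None] * W_h)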
no code implementations • NeurIPS 2019 • Niru Maheswaranathan, Alex H. Williams, Matthew D. Golub, Surya Ganguli, David Sussillo
To address these foundational questions, we study populations of thousands of networks, with commonly used RNN architectures, trained to solve neuroscientifically motivated tasks and characterize their nonlinear dynamics.
no code implementations • NeurIPS Workshop Neuro_AI 2019 • Hidenori Tanaka, Aran Nayebi, Niru Maheswaranathan, Lane McIntosh, Stephen A. Baccus, Surya Ganguli
Thus overall, this work not only yields insights into the computational mechanisms underlying the striking predictive capabilities of the retina, but also places the framework of deep networks as neuroscientific models on firmer theoretical foundations by providing a new roadmap to go beyond comparing neural representations to extracting and understanding computational mechanisms.
1 code implementation • NeurIPS 2019 • Hidenori Tanaka, Aran Nayebi, Niru Maheswaranathan, Lane McIntosh, Stephen A. Baccus, Surya Ganguli
Thus overall, this work not only yields insights into the computational mechanisms underlying the striking predictive capabilities of the retina, but also places the framework of deep networks as neuroscientific models on firmer theoretical foundations by providing a new roadmap to go beyond comparing neural representations to extracting and understanding computational mechanisms.
no code implementations • 27 Feb 2020 • Luke Metz, Niru Maheswaranathan, Ruoxi Sun, C. Daniel Freeman, Ben Poole, Jascha Sohl-Dickstein
We present TaskSet, a dataset of tasks for use in training and evaluating optimizers.
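Evaluating an optimizer against such a suite reduces to a loop: train every task with the candidate optimizer and aggregate final (or normalized) performance. The task and optimizer interfaces below are invented for illustration and are not TaskSet's actual API:

import numpy as np

def evaluate_optimizer(optimizer_step, tasks, n_train_steps=100):
    """Score an optimizer by mean final loss across a suite of tasks.

    optimizer_step(params, grads, state) -> (params, state) is an assumed
    interface; each task is an (init_params, loss_and_grad) pair.
    """
    final_losses = []
    for init_params, loss_and_grad in tasks:
        params, state = init_params.copy(), None
        for _ in range(n_train_steps):
            _, grads = loss_and_grad(params)
            params, state = optimizer_step(params, grads, state)
        final_losses.append(loss_and_grad(params)[0])
    return float(np.mean(final_losses))

# Toy suite of random quadratics, scored with plain SGD as the candidate.
def make_quadratic(seed):
    a = np.abs(np.random.default_rng(seed).standard_normal(10)) + 0.1
    return np.ones(10), lambda p: (0.5 * np.sum(a * p**2), a * p)

sgd = lambda p, g, s: (p - 0.1 * g, s)
score = evaluate_optimizer(sgd, [make_quadratic(s) for s in range(8)])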
1 code implementation • ICML 2020 • Niru Maheswaranathan, David Sussillo
Here, we propose general methods for reverse engineering recurrent neural networks (RNNs) to identify and elucidate contextual processing.
no code implementations • 23 Sep 2020 • Luke Metz, Niru Maheswaranathan, C. Daniel Freeman, Ben Poole, Jascha Sohl-Dickstein
In this work we focus on general-purpose learned optimizers capable of training a wide variety of problems with no user-specified hyperparameters.
1 code implementation • ICLR 2021 • Kyle Aitken, Vinay V. Ramasesh, Ankush Garg, Yuan Cao, David Sussillo, Niru Maheswaranathan
Using tools from dynamical systems analysis, we study recurrent networks trained on a battery of both natural and synthetic text classification tasks.
no code implementations • NeurIPS 2021 • Niru Maheswaranathan, David Sussillo, Luke Metz, Ruoxi Sun, Jascha Sohl-Dickstein
Learned optimizers are algorithms that can themselves be trained to solve optimization problems.
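A common instantiation of this idea applies a small neural network independently to each parameter, mapping per-parameter gradient features to an update; the network's own weights are then meta-trained across many inner problems. The two-layer MLP and feature set below are an illustrative guess at this family, not the architecture from this paper:

import numpy as np

class TinyLearnedOptimizer:
    """Per-parameter MLP: gradient features in, parameter update out."""

    def __init__(self, rng, hidden=8):
        self.W1 = rng.standard_normal((hidden, 3)) * 0.1
        self.W2 = rng.standard_normal((1, hidden)) * 0.1
        self.m = None                           # running gradient momentum

    def step(self, params, grads, beta=0.9):
        self.m = grads if self.m is None else beta * self.m + (1 - beta) * grads
        # Features per parameter: gradient, momentum, log gradient magnitude.
        feats = np.stack([grads, self.m, np.log1p(np.abs(grads))])
        update = (self.W2 @ np.tanh(self.W1 @ feats)).ravel()
        return params - 0.01 * update

rng = np.random.default_rng(0)
opt = TinyLearnedOptimizer(rng)
params = np.ones(50)
for _ in range(200):          # inner problem: minimize 0.5 * ||params||^2
    params = opt.step(params, grads=params)
# In practice W1 and W2 are meta-trained across many such inner problems.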
1 code implementation • 1 Jan 2021 • Luke Metz, Niru Maheswaranathan, Ruoxi Sun, C. Daniel Freeman, Ben Poole, Jascha Sohl-Dickstein
We present TaskSet, a dataset of tasks for use in training and evaluating optimizers.
no code implementations • 1 Jan 2021 • Luke Metz, Niru Maheswaranathan, C. Daniel Freeman, Ben Poole, Jascha Sohl-Dickstein
In this work we focus on general-purpose learned optimizers capable of training a wide variety of problems with no user-specified hyperparameters.
no code implementations • 14 Jan 2021 • Luke Metz, C. Daniel Freeman, Niru Maheswaranathan, Jascha Sohl-Dickstein
We show that a population of randomly initialized learned optimizers can be used to train themselves from scratch in an online fashion, without resorting to a hand-designed optimizer in any part of the process.
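One way to read "train themselves online" as pseudocode: perturb a population of optimizer weights, score each copy by how well it trains a small task, form an ES-style meta-gradient, and then apply that meta-gradient using the learned optimizer's own update rule rather than a hand-designed one. The sketch below is speculative, with a deliberately tiny two-parameter "optimizer"; it illustrates the self-application idea, not the paper's procedure:

import numpy as np

def apply_optimizer(opt_theta, params, grads):
    # Deliberately tiny "learned optimizer": two meta-parameters.
    return params - opt_theta[0] * grads - opt_theta[1] * np.sign(grads)

def train_task(opt_theta, n_steps=50):
    """Score optimizer weights by the final loss they reach on a toy task."""
    p = np.ones(10)
    for _ in range(n_steps):
        p = apply_optimizer(opt_theta, p, p)  # grad of 0.5 * ||p||^2 is p
    return 0.5 * np.sum(p**2)

rng = np.random.default_rng(0)
theta = rng.standard_normal(2) * 0.01         # randomly initialized optimizer
sigma, n_pop = 0.01, 16
for _ in range(100):
    # ES-style meta-gradient from a population of perturbed optimizers.
    eps = rng.standard_normal((n_pop, 2))
    scores = np.array([train_task(theta + sigma * e) - train_task(theta - sigma * e)
                       for e in eps])
    meta_grad = (scores[:, None] * eps).mean(axis=0) / (2 * sigma)
    meta_grad /= np.linalg.norm(meta_grad) + 1e-8  # keep the sketch stable
    # The optimizer applies its own update rule to its own meta-gradient.
    theta = np.clip(apply_optimizer(theta, theta, meta_grad), -1.0, 1.0)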
no code implementations • NeurIPS 2021 • Kyle Aitken, Vinay V Ramasesh, Yuan Cao, Niru Maheswaranathan
Moreover, how these mechanisms vary depending on the particular architecture used for the encoder and decoder (recurrent, feed-forward, etc.) is also not well understood.
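For reference, attention aligns each decoder state with all encoder states through similarity scores and a softmax; the resulting attention matrix is the object whose structure varies across architectures. A generic scaled dot-product formulation, not tied to any specific encoder or decoder from the paper:

import numpy as np

def dot_product_attention(decoder_states, encoder_states):
    """Align decoder states (T_dec, d) with encoder states (T_enc, d).

    Returns context vectors (T_dec, d) and the attention matrix
    (T_dec, T_enc), the object whose structure the paper studies.
    """
    scores = decoder_states @ encoder_states.T / np.sqrt(decoder_states.shape[1])
    scores -= scores.max(axis=1, keepdims=True)     # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)         # softmax over encoder steps
    return attn @ encoder_states, attn

rng = np.random.default_rng(0)
enc = rng.standard_normal((12, 16))  # hypothetical encoder states
dec = rng.standard_normal((5, 16))   # hypothetical decoder states
context, attention_matrix = dot_product_attention(dec, enc)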
1 code implementation • 22 Mar 2022 • Luke Metz, C. Daniel Freeman, James Harrison, Niru Maheswaranathan, Jascha Sohl-Dickstein
We further leverage our analysis to construct a learned optimizer that is both faster and more memory efficient than previous work.