6 code implementations • 12 Mar 2015 • Jascha Sohl-Dickstein, Eric A. Weiss, Niru Maheswaranathan, Surya Ganguli
A central problem in machine learning involves modeling complex datasets using highly flexible families of probability distributions in which learning, sampling, inference, and evaluation remain analytically or computationally tractable.
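This is the paper that introduced diffusion probabilistic models: structure in the data is destroyed gradually through a forward Gaussian diffusion, and a generative model is trained to reverse that process. Below is a minimal NumPy sketch of the forward process only, assuming a linear noise schedule (the schedule and step count are illustrative, not the paper's settings):

import numpy as np

def forward_diffusion(x0, n_steps=1000, beta_min=1e-4, beta_max=0.02, seed=0):
    """Gradually destroy structure in x0 with a Gaussian Markov chain."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(beta_min, beta_max, n_steps)  # assumed linear schedule
    x, trajectory = x0.copy(), [x0.copy()]
    for beta in betas:
        # q(x_t | x_{t-1}) = N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I)
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * rng.standard_normal(x.shape)
        trajectory.append(x)
    return trajectory  # the endpoint is (near) isotropic Gaussian noise

# The generative model is then trained to invert each small Gaussian step.
noised = forward_diffusion(np.random.default_rng(1).standard_normal((16, 2)))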
no code implementations • NeurIPS 2016 • Lane T. McIntosh, Niru Maheswaranathan, Aran Nayebi, Surya Ganguli, Stephen A. Baccus
Here we demonstrate that deep convolutional neural networks (CNNs) capture retinal responses to natural scenes nearly to within the variability of a cell's response, and are markedly more accurate than linear-nonlinear (LN) models and Generalized Linear Models (GLMs).
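For context on the baseline being outperformed: an LN model passes the stimulus through a single linear filter and a pointwise nonlinearity, whereas the CNNs here stack several such stages. A minimal sketch of an LN model, assuming a purely spatial filter and a softplus nonlinearity (both simplifications):

import numpy as np

def ln_model(stimulus, spatial_filter, bias=0.0):
    """Linear-nonlinear (LN) model of a retinal ganglion cell.

    stimulus: (time, height, width) movie; spatial_filter: (height, width).
    Returns a nonnegative predicted firing rate at each time step.
    """
    # Linear stage: project each frame onto the receptive field.
    drive = np.tensordot(stimulus, spatial_filter, axes=([1, 2], [0, 1])) + bias
    # Nonlinear stage: pointwise softplus keeps rates nonnegative.
    return np.log1p(np.exp(drive))

rng = np.random.default_rng(0)
movie = rng.standard_normal((100, 8, 8))  # hypothetical white-noise stimulus
rf = rng.standard_normal((8, 8)) / 8.0    # hypothetical receptive field
rates = ln_model(movie, rf)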
1 code implementation • ICML 2017 • Olga Wichrowska, Niru Maheswaranathan, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Nando de Freitas, Jascha Sohl-Dickstein
Two of the primary barriers to the adoption of learning to learn are an inability to scale to larger problems and a limited ability to generalize to new tasks.
no code implementations • 28 Nov 2017 • Lane McIntosh, Niru Maheswaranathan, David Sussillo, Jonathon Shlens
Importantly, the RNN may be deployed across a range of computational budgets by merely running the model for a variable number of iterations.
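The anytime behavior described above comes from iterating a recurrent cell and reading out a prediction after however many steps the budget allows. A minimal sketch with a vanilla tanh cell and a linear readout standing in for the paper's architecture:

import numpy as np

def rnn_refine(features, W_h, W_x, W_out, n_iters):
    """Refine a prediction by iterating a recurrent cell n_iters times."""
    h = np.zeros(W_h.shape[0])
    for _ in range(n_iters):
        h = np.tanh(W_h @ h + W_x @ features)  # one refinement step
    return W_out @ h  # the readout is valid after any number of iterations

rng = np.random.default_rng(0)
d, hidden, n_out = 16, 32, 10
W_h = rng.standard_normal((hidden, hidden)) * 0.1
W_x = rng.standard_normal((hidden, d)) * 0.1
W_out = rng.standard_normal((n_out, hidden)) * 0.1
x = rng.standard_normal(d)
cheap = rnn_refine(x, W_h, W_x, W_out, n_iters=2)    # small compute budget
better = rnn_refine(x, W_h, W_x, W_out, n_iters=16)  # larger compute budget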
2 code implementations • ICLR 2019 • Luke Metz, Niru Maheswaranathan, Brian Cheung, Jascha Sohl-Dickstein
Specifically, we target semi-supervised classification performance, and we meta-learn an algorithm -- an unsupervised weight update rule -- that produces representations useful for this task.
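The structure is a two-level loop: an inner loop applies a parameterized, label-free weight update rule to a base network, and an outer loop adjusts the rule so the resulting representation serves the downstream task. In the sketch below, the Hebbian-style rule, the least-squares meta-objective, and the finite-difference meta-gradient are all hypothetical simplifications:

import numpy as np

def inner_update(W, x, theta):
    # Hypothetical unsupervised rule: a Hebbian term plus weight decay,
    # with theta = (learning rate, decay) as the meta-learned parameters.
    h = np.tanh(W @ x)
    return W + theta[0] * np.outer(h, x) - theta[1] * W

def meta_objective(theta, data, labels, W0):
    W = W0.copy()
    for x in data:                    # inner loop: labels never used
        W = inner_update(W, x, theta)
    feats = np.tanh(data @ W.T)
    # Outer objective: how well a linear readout of the features fits labels.
    coef = np.linalg.lstsq(feats, labels, rcond=None)[0]
    return np.mean((feats @ coef - labels) ** 2)

rng = np.random.default_rng(0)
data = rng.standard_normal((64, 8))
labels = np.sign(data[:, 0])
W0 = rng.standard_normal((4, 8)) * 0.1
theta = np.array([0.01, 0.001])
eps = 1e-4
for _ in range(50):                   # outer loop: meta-train the rule itself
    g = np.array([(meta_objective(theta + eps * e, data, labels, W0)
                   - meta_objective(theta - eps * e, data, labels, W0)) / (2 * eps)
                  for e in np.eye(2)])
    theta -= 0.01 * g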
1 code implementation • ICLR 2019 • Niru Maheswaranathan, Luke Metz, George Tucker, Dami Choi, Jascha Sohl-Dickstein
We propose Guided Evolutionary Strategies, a method for optimally using surrogate gradient directions along with random search.
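The method draws ES perturbations from a Gaussian whose covariance is elongated along the subspace spanned by the surrogate gradients, then forms an antithetic finite-difference gradient estimate. A compact sketch (the quadratic test function, alpha, and sigma are arbitrary choices):

import numpy as np

def guided_es_grad(f, x, U, alpha=0.5, sigma=0.1, n_pairs=8, seed=0):
    """Antithetic ES gradient estimate guided by the subspace U (n x k).

    Perturbations mix the full space (weight alpha) with the subspace
    spanned by surrogate gradients (weight 1 - alpha).
    """
    rng = np.random.default_rng(seed)
    n, k = U.shape
    g = np.zeros(n)
    for _ in range(n_pairs):
        eps = sigma * (np.sqrt(alpha / n) * rng.standard_normal(n)
                       + np.sqrt((1 - alpha) / k) * U @ rng.standard_normal(k))
        g += (f(x + eps) - f(x - eps)) * eps
    return g / (2 * sigma**2 * n_pairs)

# Toy usage: guide the search with a biased surrogate gradient of a quadratic.
f = lambda v: 0.5 * np.sum(v**2)
x = np.ones(20)
surrogate = x + 0.3 * np.random.default_rng(1).standard_normal(20)  # biased grad
U, _ = np.linalg.qr(surrogate[:, None])  # orthonormal basis, here k = 1
x = x - 0.5 * guided_es_grad(f, x, U)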
1 code implementation • 24 Oct 2018 • Luke Metz, Niru Maheswaranathan, Jeremy Nixon, C. Daniel Freeman, Jascha Sohl-Dickstein
Deep learning has shown that learned functions can dramatically outperform hand-designed functions on perceptual tasks.
no code implementations • ICLR 2019 • Luke Metz, Niru Maheswaranathan, Brian Cheung, Jascha Sohl-Dickstein
Here, our desired task (meta-objective) is the performance of the representation on semi-supervised classification, and we meta-learn an algorithm -- an unsupervised weight update rule -- that produces representations that perform well under this meta-objective.
no code implementations • ICLR 2019 • Niru Maheswaranathan, Luke Metz, George Tucker, Dami Choi, Jascha Sohl-Dickstein
This arises when an approximate gradient is easier to compute than the full gradient (e.g. in meta-learning or unrolled optimization), or when a true gradient is intractable and is replaced with a surrogate (e.g. in certain reinforcement learning applications or training networks with discrete variables).
no code implementations • ICLR 2019 • Luke Metz, Niru Maheswaranathan, Jeremy Nixon, Daniel Freeman, Jascha Sohl-Dickstein
We demonstrate these results on problems where our learned optimizer trains convolutional networks in a fifth of the wall-clock time compared to tuned first-order methods, and with an improvement in test loss.
no code implementations • ICML Workshop Deep Phenomena 2019 • Niru Maheswaranathan, Alex H. Williams, Matthew D. Golub, Surya Ganguli, David Sussillo
Recurrent neural networks (RNNs) are a powerful tool for modeling sequential data.
no code implementations • 8 Jun 2019 • Luke Metz, Niru Maheswaranathan, Jonathon Shlens, Jascha Sohl-Dickstein, Ekin D. Cubuk
State-of-the-art vision models can achieve superhuman performance on image classification tasks when testing and training data come from the same distribution.
no code implementations • NeurIPS 2019 • Niru Maheswaranathan, Alex Williams, Matthew D. Golub, Surya Ganguli, David Sussillo
In this work, we use tools from dynamical systems analysis to reverse engineer recurrent networks trained to perform sentiment classification, a foundational natural language processing task.
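Reverse engineering here means locating approximate fixed points of the RNN's state-transition map and linearizing the dynamics around them; for sentiment RNNs, the paper finds the fixed points organize into an approximate line attractor. A minimal sketch of fixed-point finding by gradient descent on the state speed, using a vanilla tanh RNN as a stand-in:

import numpy as np

def rnn_step(h, W_h, W_x, x):
    return np.tanh(W_h @ h + W_x @ x)

def find_fixed_point(h0, W_h, W_x, x, lr=0.1, n_steps=2000):
    """Minimize the speed q(h) = 0.5 * ||F(h, x) - h||^2 over the state h."""
    h = h0.copy()
    for _ in range(n_steps):
        F = rnn_step(h, W_h, W_x, x)
        J = (1.0 - F**2)[:, None] * W_h             # Jacobian of tanh RNN at h
        h -= lr * (J - np.eye(len(h))).T @ (F - h)  # gradient of q
    return h

rng = np.random.default_rng(0)
n = 16
W_h = rng.standard_normal((n, n)) * 0.5 / np.sqrt(n)
W_x = rng.standard_normal((n, 4)) * 0.1
x = np.zeros(4)                                     # probe autonomous dynamics
h_star = find_fixed_point(rng.standard_normal(n), W_h, W_x, x)
# Linearize at the fixed point: eigenvalues characterize the local dynamics.
F = rnn_step(h_star, W_h, W_x, x)
eigvals = np.linalg.eigvals((1.0 - F**2)[:, None] * W_h)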
no code implementations • NeurIPS 2019 • Niru Maheswaranathan, Alex H. Williams, Matthew D. Golub, Surya Ganguli, David Sussillo
To address these foundational questions, we study populations of thousands of networks, with commonly used RNN architectures, trained to solve neuroscientifically motivated tasks and characterize their nonlinear dynamics.
no code implementations • NeurIPS Workshop Neuro_AI 2019 • Hidenori Tanaka, Aran Nayebi, Niru Maheswaranathan, Lane McIntosh, Stephen A. Baccus, Surya Ganguli
Thus overall, this work not only yields insights into the computational mechanisms underlying the striking predictive capabilities of the retina, but also places the framework of deep networks as neuroscientific models on firmer theoretical foundations by providing a new roadmap to go beyond comparing neural representations to extracting and understanding computational mechanisms.
1 code implementation • NeurIPS 2019 • Hidenori Tanaka, Aran Nayebi, Niru Maheswaranathan, Lane McIntosh, Stephen A. Baccus, Surya Ganguli
Thus overall, this work not only yields insights into the computational mechanisms underlying the striking predictive capabilities of the retina, but also places the framework of deep networks as neuroscientific models on firmer theoretical foundations by providing a new roadmap to go beyond comparing neural representations to extracting and understanding computational mechanisms.
no code implementations • 27 Feb 2020 • Luke Metz, Niru Maheswaranathan, Ruoxi Sun, C. Daniel Freeman, Ben Poole, Jascha Sohl-Dickstein
We present TaskSet, a dataset of tasks for use in training and evaluating optimizers.
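Evaluating an optimizer against such a suite reduces to a loop: train every task with the candidate optimizer and aggregate final (or normalized) performance. The task and optimizer interfaces below are invented for illustration and are not TaskSet's actual API:

import numpy as np

def evaluate_optimizer(optimizer_step, tasks, n_train_steps=100):
    """Score an optimizer by mean final loss across a suite of tasks.

    optimizer_step(params, grads, state) -> (params, state) is an assumed
    interface; each task is an (init_params, loss_and_grad) pair.
    """
    final_losses = []
    for init_params, loss_and_grad in tasks:
        params, state = init_params.copy(), None
        for _ in range(n_train_steps):
            _, grads = loss_and_grad(params)
            params, state = optimizer_step(params, grads, state)
        final_losses.append(loss_and_grad(params)[0])
    return float(np.mean(final_losses))

# Toy suite of random quadratics, scored with plain SGD as the candidate.
def make_quadratic(seed):
    a = np.abs(np.random.default_rng(seed).standard_normal(10)) + 0.1
    return np.ones(10), lambda p: (0.5 * np.sum(a * p**2), a * p)

sgd = lambda p, g, s: (p - 0.1 * g, s)
score = evaluate_optimizer(sgd, [make_quadratic(s) for s in range(8)])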
1 code implementation • ICML 2020 • Niru Maheswaranathan, David Sussillo
Here, we propose general methods for reverse engineering recurrent neural networks (RNNs) to identify and elucidate contextual processing.
no code implementations • 23 Sep 2020 • Luke Metz, Niru Maheswaranathan, C. Daniel Freeman, Ben Poole, Jascha Sohl-Dickstein
In this work we focus on general-purpose learned optimizers capable of training a wide variety of problems with no user-specified hyperparameters.
1 code implementation • ICLR 2021 • Kyle Aitken, Vinay V. Ramasesh, Ankush Garg, Yuan Cao, David Sussillo, Niru Maheswaranathan
Using tools from dynamical systems analysis, we study recurrent networks trained on a battery of both natural and synthetic text classification tasks.
no code implementations • NeurIPS 2021 • Niru Maheswaranathan, David Sussillo, Luke Metz, Ruoxi Sun, Jascha Sohl-Dickstein
Learned optimizers are algorithms that can themselves be trained to solve optimization problems.
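A common instantiation of this idea applies a small neural network independently to each parameter, mapping per-parameter gradient features to an update; the network's own weights are then meta-trained across many inner problems. The two-layer MLP and feature set below are an illustrative guess at this family, not the architecture from this paper:

import numpy as np

class TinyLearnedOptimizer:
    """Per-parameter MLP: gradient features in, parameter update out."""

    def __init__(self, rng, hidden=8):
        self.W1 = rng.standard_normal((hidden, 3)) * 0.1
        self.W2 = rng.standard_normal((1, hidden)) * 0.1
        self.m = None                           # running gradient momentum

    def step(self, params, grads, beta=0.9):
        self.m = grads if self.m is None else beta * self.m + (1 - beta) * grads
        # Features per parameter: gradient, momentum, log gradient magnitude.
        feats = np.stack([grads, self.m, np.log1p(np.abs(grads))])
        update = (self.W2 @ np.tanh(self.W1 @ feats)).ravel()
        return params - 0.01 * update

rng = np.random.default_rng(0)
opt = TinyLearnedOptimizer(rng)
params = np.ones(50)
for _ in range(200):          # inner problem: minimize 0.5 * ||params||^2
    params = opt.step(params, grads=params)
# In practice W1 and W2 are meta-trained across many such inner problems.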
1 code implementation • 1 Jan 2021 • Luke Metz, Niru Maheswaranathan, Ruoxi Sun, C. Daniel Freeman, Ben Poole, Jascha Sohl-Dickstein
We present TaskSet, a dataset of tasks for use in training and evaluating optimizers.
no code implementations • 1 Jan 2021 • Luke Metz, Niru Maheswaranathan, C. Daniel Freeman, Ben Poole, Jascha Sohl-Dickstein
In this work we focus on general-purpose learned optimizers capable of training a wide variety of problems with no user-specified hyperparameters.
no code implementations • 14 Jan 2021 • Luke Metz, C. Daniel Freeman, Niru Maheswaranathan, Jascha Sohl-Dickstein
We show that a population of randomly initialized learned optimizers can be used to train themselves from scratch in an online fashion, without resorting to a hand-designed optimizer in any part of the process.
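One way to read "train themselves online" as pseudocode: perturb a population of optimizer weights, score each copy by how well it trains a small task, form an ES-style meta-gradient, and then apply that meta-gradient using the learned optimizer's own update rule rather than a hand-designed one. The sketch below is speculative, with a deliberately tiny two-parameter "optimizer"; it illustrates the self-application idea, not the paper's procedure:

import numpy as np

def apply_optimizer(opt_theta, params, grads):
    # Deliberately tiny "learned optimizer": two meta-parameters.
    return params - opt_theta[0] * grads - opt_theta[1] * np.sign(grads)

def train_task(opt_theta, n_steps=50):
    """Score optimizer weights by the final loss they reach on a toy task."""
    p = np.ones(10)
    for _ in range(n_steps):
        p = apply_optimizer(opt_theta, p, p)  # grad of 0.5 * ||p||^2 is p
    return 0.5 * np.sum(p**2)

rng = np.random.default_rng(0)
theta = rng.standard_normal(2) * 0.01         # randomly initialized optimizer
sigma, n_pop = 0.01, 16
for _ in range(100):
    # ES-style meta-gradient from a population of perturbed optimizers.
    eps = rng.standard_normal((n_pop, 2))
    scores = np.array([train_task(theta + sigma * e) - train_task(theta - sigma * e)
                       for e in eps])
    meta_grad = (scores[:, None] * eps).mean(axis=0) / (2 * sigma)
    meta_grad /= np.linalg.norm(meta_grad) + 1e-8  # keep the sketch stable
    # The optimizer applies its own update rule to its own meta-gradient.
    theta = np.clip(apply_optimizer(theta, theta, meta_grad), -1.0, 1.0)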
no code implementations • NeurIPS 2021 • Kyle Aitken, Vinay V Ramasesh, Yuan Cao, Niru Maheswaranathan
Moreover, how these mechanisms vary depending on the particular architecture used for the encoder and decoder (recurrent, feed-forward, etc.) is also not well understood.
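For reference, attention aligns each decoder state with all encoder states through similarity scores and a softmax; the resulting attention matrix is the object whose structure varies across architectures. A generic scaled dot-product formulation, not tied to any specific encoder or decoder from the paper:

import numpy as np

def dot_product_attention(decoder_states, encoder_states):
    """Align decoder states (T_dec, d) with encoder states (T_enc, d).

    Returns context vectors (T_dec, d) and the attention matrix
    (T_dec, T_enc), the object whose structure the paper studies.
    """
    scores = decoder_states @ encoder_states.T / np.sqrt(decoder_states.shape[1])
    scores -= scores.max(axis=1, keepdims=True)     # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)         # softmax over encoder steps
    return attn @ encoder_states, attn

rng = np.random.default_rng(0)
enc = rng.standard_normal((12, 16))  # hypothetical encoder states
dec = rng.standard_normal((5, 16))   # hypothetical decoder states
context, attention_matrix = dot_product_attention(dec, enc)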
1 code implementation • 22 Mar 2022 • Luke Metz, C. Daniel Freeman, James Harrison, Niru Maheswaranathan, Jascha Sohl-Dickstein
We further leverage our analysis to construct a learned optimizer that is both faster and more memory efficient than previous work.