Search Results for author: Maithra Raghu

Found 21 papers, 16 papers with code

On the Origins of the Block Structure Phenomenon in Neural Network Representations

1 code implementation • 15 Feb 2022 • Thao Nguyen, Maithra Raghu, Simon Kornblith

Recent work has uncovered a striking phenomenon in large-capacity neural networks: they contain blocks of contiguous hidden layers with highly similar representations.
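
The block structure is detected by comparing hidden representations across every pair of layers. Below is a minimal sketch using linear CKA, the similarity measure used in this line of work; the random activations are stand-ins, not real model features:

```python
# Linear CKA between two activation matrices; a bright contiguous block
# on the layer-by-layer heatmap is the "block structure" described above.
import numpy as np

def linear_cka(X, Y):
    # X: (n_examples, d1), Y: (n_examples, d2); center features first.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

rng = np.random.default_rng(0)
layers = [rng.normal(size=(256, 64)) for _ in range(8)]  # stand-in activations
heatmap = np.array([[linear_cka(a, b) for b in layers] for a in layers])
print(heatmap.round(2))
```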

Dominant Datapoints and the Block Structure Phenomenon in Neural Network Hidden Representations

no code implementations • 29 Sep 2021 • Thao Nguyen, Maithra Raghu, Simon Kornblith

Recent work has uncovered a striking phenomenon in large-capacity neural networks: they contain blocks of contiguous hidden layers with highly similar representations.

Do Vision Transformers See Like Convolutional Neural Networks?

4 code implementations • NeurIPS 2021 • Maithra Raghu, Thomas Unterthiner, Simon Kornblith, Chiyuan Zhang, Alexey Dosovitskiy

Finally, we study the effect of (pretraining) dataset scale on intermediate features and transfer learning, and conclude with a discussion on connections to new architectures such as the MLP-Mixer.

Tasks: Classification, Image Classification, +1

Pointer Value Retrieval: A new benchmark for understanding the limits of neural network generalization

2 code implementations • 27 Jul 2021 • Chiyuan Zhang, Maithra Raghu, Jon Kleinberg, Samy Bengio

In PVR, one part of the task input acts as a pointer, indicating a different input location whose value forms the output.

Tasks: Memorization, Retrieval
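
A minimal sketch of the simplest digit-string form of PVR; the exact indexing rule below is an assumption for illustration, not necessarily the paper's:

```python
# Pointer Value Retrieval on a string of digits: the first digit is the
# pointer, and the digit at the pointed-to position is the label.
import numpy as np

def make_pvr_example(rng, length=10):
    digits = rng.integers(0, 10, size=length)    # the task input
    pointer = digits[0]                          # one part of the input acts as a pointer
    target_index = 1 + (pointer % (length - 1))  # it selects another input location
    return digits, digits[target_index]          # whose value forms the output

rng = np.random.default_rng(0)
x, y = make_pvr_example(rng)
print(x, "->", y)
```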

Teaching with Commentaries

1 code implementation • ICLR 2021 • Aniruddh Raghu, Maithra Raghu, Simon Kornblith, David Duvenaud, Geoffrey Hinton

We find that commentaries can improve training speed and/or performance, and provide insights about the dataset and training process.

Tasks: Data Augmentation

Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth

4 code implementations • ICLR 2021 • Thao Nguyen, Maithra Raghu, Simon Kornblith

We begin by investigating how varying depth and width affects model hidden representations, finding a characteristic block structure in the hidden representations of larger capacity (wider or deeper) models.

Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics

no code implementations • ICLR 2021 • Vinay V. Ramasesh, Ethan Dyer, Maithra Raghu

A central challenge in developing versatile machine learning systems is catastrophic forgetting: a model trained on tasks in sequence will suffer significant performance drops on earlier tasks.

Tasks: Anatomy, Split-CIFAR-10

A Survey of Deep Learning for Scientific Discovery

1 code implementation • 26 Mar 2020 • Maithra Raghu, Eric Schmidt

Over the past few years, we have seen fundamental breakthroughs in core problems in machine learning, largely driven by advances in deep neural networks.

Tasks: Deep Learning, scientific discovery, +1

The Algorithmic Automation Problem: Prediction, Triage, and Human Effort

1 code implementation • 28 Mar 2019 • Maithra Raghu, Katy Blumer, Greg Corrado, Jon Kleinberg, Ziad Obermeyer, Sendhil Mullainathan

In a wide array of areas, algorithms are matching and surpassing the performance of human experts, leading to consideration of the roles of human judgment and algorithmic prediction in these domains.

Transfusion: Understanding Transfer Learning for Medical Imaging

2 code implementations • NeurIPS 2019 • Maithra Raghu, Chiyuan Zhang, Jon Kleinberg, Samy Bengio

Investigating the learned representations and features, we find that some of the differences from transfer learning are due to the over-parametrization of standard models rather than sophisticated feature reuse.

Tasks: Image Classification, Transfer Learning

Direct Uncertainty Prediction for Medical Second Opinions

no code implementations • 4 Jul 2018 • Maithra Raghu, Katy Blumer, Rory Sayres, Ziad Obermeyer, Robert Kleinberg, Sendhil Mullainathan, Jon Kleinberg

Our central methodological finding is that Direct Uncertainty Prediction (DUP), which trains a model to predict an uncertainty score directly from the raw patient features, works better than Uncertainty Via Classification, the two-step process of training a classifier and post-processing its output distribution into an uncertainty score.

Tasks: BIG-bench Machine Learning, General Classification
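
A minimal sketch contrasting the two approaches on synthetic data; the model choices and the disagreement-style uncertainty target below are assumptions, not the paper's setup:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))               # stand-in for raw patient features
labels = X[:, 0] + rng.normal(size=1000) > 0  # noisy diagnosis labels
# Stand-in uncertainty target, e.g. disagreement among several graders.
disagreement = rng.random(1000) * (np.abs(X[:, 0]) < 0.5)

# Uncertainty Via Classification: train a classifier, then post-process its
# output distribution into a score, here the predictive entropy.
p = RandomForestClassifier(random_state=0).fit(X, labels).predict_proba(X)[:, 1]
p = p.clip(1e-6, 1 - 1e-6)
uvc_score = -(p * np.log(p) + (1 - p) * np.log(1 - p))

# Direct Uncertainty Prediction: regress the uncertainty target directly
# from the raw features, skipping the intermediate classifier.
dup_score = RandomForestRegressor(random_state=0).fit(X, disagreement).predict(X)
```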

Insights on representational similarity in neural networks with canonical correlation

2 code implementations • NeurIPS 2018 • Ari S. Morcos, Maithra Raghu, Samy Bengio

Comparing representations in neural networks is fundamentally difficult as the structure of representations varies greatly, even across groups of networks trained on identical tasks, and over the course of training.

Adversarial Spheres

2 code implementations • ICLR 2018 • Justin Gilmer, Luke Metz, Fartash Faghri, Samuel S. Schoenholz, Maithra Raghu, Martin Wattenberg, Ian Goodfellow

We hypothesize that this counterintuitive behavior is a naturally occurring result of the high-dimensional geometry of the data manifold.

Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games?

1 code implementation • ICML 2018 • Maithra Raghu, Alex Irpan, Jacob Andreas, Robert Kleinberg, Quoc V. Le, Jon Kleinberg

Deep reinforcement learning has achieved many recent successes, but our understanding of its strengths and limitations is hampered by the lack of rich environments in which we can fully characterize optimal behavior, and correspondingly diagnose individual actions against such a characterization.

Tasks: Deep Reinforcement Learning, reinforcement-learning, +1

SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability

3 code implementations • NeurIPS 2017 • Maithra Raghu, Justin Gilmer, Jason Yosinski, Jascha Sohl-Dickstein

We propose a new technique, Singular Vector Canonical Correlation Analysis (SVCCA), a tool for quickly comparing two representations in a way that is both invariant to affine transform (allowing comparison between different layers and networks) and fast to compute (allowing more comparisons to be calculated than with previous methods).
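
A minimal sketch of the two steps the name describes, SVD followed by CCA; the truncation rule and preprocessing are simplified relative to the paper:

```python
import numpy as np

def svcca(X, Y, keep=20):
    # X, Y: (n_neurons, n_datapoints) activation matrices.
    X = X - X.mean(axis=1, keepdims=True)
    Y = Y - Y.mean(axis=1, keepdims=True)
    # Step 1: SVD, keeping the top singular directions of each representation.
    _, _, Vx = np.linalg.svd(X, full_matrices=False)
    _, _, Vy = np.linalg.svd(Y, full_matrices=False)
    Xr, Yr = Vx[:keep], Vy[:keep]
    # Step 2: CCA via the singular values of the product of orthonormal bases.
    Qx, _ = np.linalg.qr(Xr.T)
    Qy, _ = np.linalg.qr(Yr.T)
    rho = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return rho.mean()  # mean canonical correlation as the similarity score

rng = np.random.default_rng(0)
A = rng.normal(size=(64, 500))            # one layer's activations over 500 inputs
B = 0.5 * A + rng.normal(size=(64, 500))  # a correlated second layer
print(svcca(A, B))
```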

Linear Additive Markov Processes

1 code implementation • 5 Apr 2017 • Ravi Kumar, Maithra Raghu, Tamas Sarlos, Andrew Tomkins

We introduce LAMP: the Linear Additive Markov Process.
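
The excerpt stops at the name, so the definition below is my reading of the paper rather than a quotation: a LAMP draws the next state from a lag-weighted mixture of a single first-order transition matrix applied to past states.

```python
import numpy as np

def lamp_step(history, P, w, rng):
    # history: past states, most recent last; P: (k, k) stochastic matrix;
    # w: mixture weights over lags 1..len(w), summing to 1.
    dist = np.zeros(P.shape[0])
    for lag, weight in enumerate(w, start=1):
        dist += weight * P[history[-lag]]
    return rng.choice(len(dist), p=dist)

rng = np.random.default_rng(0)
P = np.array([[0.9, 0.1], [0.2, 0.8]])  # base Markov transitions
w = [0.7, 0.3]                          # weights over the last two states
states = [0, 1]
for _ in range(10):
    states.append(lamp_step(states, P, w, rng))
print(states)
```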

Survey of Expressivity in Deep Neural Networks

no code implementations • 24 Nov 2016 • Maithra Raghu, Ben Poole, Jon Kleinberg, Surya Ganguli, Jascha Sohl-Dickstein

This quantity grows exponentially in the depth of the network, and is responsible for the depth sensitivity observed.

Tasks: Survey
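
In the paper, the exponentially growing quantity is the length of an input trajectory as it propagates through the layers. A minimal sketch under that reading; widths, depth, and weight scales below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
width, depth, n_points = 100, 10, 1000
theta = np.linspace(0, 2 * np.pi, n_points)
h = np.stack([np.cos(theta), np.sin(theta)], axis=1)  # a 1-D input circle

def arc_length(points):
    # Sum of distances between consecutive points along the trajectory.
    return np.linalg.norm(np.diff(points, axis=0), axis=1).sum()

h = np.tanh(h @ rng.normal(scale=2.0 / np.sqrt(2), size=(2, width)))
for layer in range(1, depth + 1):
    print(f"layer {layer}: length = {arc_length(h):.1f}")  # grows with depth
    h = np.tanh(h @ rng.normal(scale=2.0 / np.sqrt(width), size=(width, width)))
```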

On the Expressive Power of Deep Neural Networks

no code implementations • ICML 2017 • Maithra Raghu, Ben Poole, Jon Kleinberg, Surya Ganguli, Jascha Sohl-Dickstein

We propose a new approach to the problem of neural network expressivity, which seeks to characterize how structural properties of a neural network family affect the functions it is able to compute.

Exponential expressivity in deep neural networks through transient chaos

1 code implementation • NeurIPS 2016 • Ben Poole, Subhaneil Lahiri, Maithra Raghu, Jascha Sohl-Dickstein, Surya Ganguli

We combine Riemannian geometry with the mean field theory of high dimensional chaos to study the nature of signal propagation in generic, deep neural networks with random weights.
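
A minimal sketch of the mean-field length recursion studied in this line of work, paraphrased rather than taken from the paper's code: the squared pre-activation length q evolves as q_{l+1} = sigma_w^2 * E_z[phi(sqrt(q_l) * z)^2] + sigma_b^2 for standard Gaussian z, estimated here by Monte Carlo with phi = tanh:

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=100_000)  # samples for the Gaussian expectation

def length_map(q, sigma_w=2.0, sigma_b=0.5):
    # One layer of the variance recursion for a random tanh network.
    return sigma_w**2 * np.mean(np.tanh(np.sqrt(q) * z) ** 2) + sigma_b**2

q = 1.0
for layer in range(1, 11):
    q = length_map(q)
    print(f"layer {layer}: q = {q:.4f}")  # converges to a fixed point q*
```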
