Search Results for author: Maithra Raghu

Found 18 papers, 12 with code

Do Vision Transformers See Like Convolutional Neural Networks?

1 code implementation · 19 Aug 2021 · Maithra Raghu, Thomas Unterthiner, Simon Kornblith, Chiyuan Zhang, Alexey Dosovitskiy

Finally, we study the effect of (pretraining) dataset scale on intermediate features and transfer learning, and conclude with a discussion on connections to new architectures such as the MLP-Mixer.

Classification · Image Classification

Pointer Value Retrieval: A new benchmark for understanding the limits of neural network generalization

1 code implementation · 27 Jul 2021 · Chiyuan Zhang, Maithra Raghu, Jon Kleinberg, Samy Bengio

In this paper we introduce a novel benchmark, Pointer Value Retrieval (PVR) tasks, that explore the limits of neural network generalization.
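The simplest PVR instances can be generated in a few lines. The sketch below assumes the digit-string variant (the first digit is a pointer, and the label is the value digit it selects); the helper name `make_pvr_batch` is ours, not the paper's, and the paper also studies image-based and functional variants.

```python
import numpy as np

def make_pvr_batch(n, seq_len=10, seed=0):
    """Toy digit-string PVR instance: the first digit is a pointer into the
    remaining positions, and the label is the value digit it points to."""
    rng = np.random.default_rng(seed)
    values = rng.integers(0, 10, size=(n, seq_len))    # value digits
    pointers = rng.integers(0, seq_len, size=n)        # pointer digit
    inputs = np.concatenate([pointers[:, None], values], axis=1)
    labels = values[np.arange(n), pointers]            # retrieved value
    return inputs, labels

X, y = make_pvr_batch(4)
```

Because only the pointed-at position matters, the task isolates a sharp form of rule-based generalization that memorization-heavy models fail at.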

Teaching with Commentaries

1 code implementation ICLR 2021 Aniruddh Raghu, Maithra Raghu, Simon Kornblith, David Duvenaud, Geoffrey Hinton

We find that commentaries can improve training speed and/or performance, and provide insights about the dataset and training process.

Data Augmentation

Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth

1 code implementation ICLR 2021 Thao Nguyen, Maithra Raghu, Simon Kornblith

We begin by investigating how varying depth and width affects model hidden representations, finding a characteristic block structure in the hidden representations of larger capacity (wider or deeper) models.
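The block structure is read off from layer-by-layer similarity heatmaps of hidden representations. Such comparisons are commonly made with centered kernel alignment; a minimal linear-CKA sketch (CKA as the specific index, and these variable names, are our assumptions here):

```python
import numpy as np

def linear_cka(X, Y):
    """Linear centered kernel alignment between two activation matrices of
    shape (examples, features). 1.0 means identical up to rotation/scaling."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(Y.T @ X, 'fro') ** 2
    den = np.linalg.norm(X.T @ X, 'fro') * np.linalg.norm(Y.T @ Y, 'fro')
    return num / den

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 16))   # activations of one layer on 100 examples
B = rng.normal(size=(100, 16))   # an unrelated layer's activations
```

Computing `linear_cka` for every pair of layers and plotting the resulting matrix is the kind of analysis in which a contiguous "block" of mutually similar layers becomes visible.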

Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics

no code implementations ICLR 2021 Vinay V. Ramasesh, Ethan Dyer, Maithra Raghu

A central challenge in developing versatile machine learning systems is catastrophic forgetting: a model trained on tasks in sequence will suffer significant performance drops on earlier tasks.

A Survey of Deep Learning for Scientific Discovery

1 code implementation · 26 Mar 2020 · Maithra Raghu, Eric Schmidt

Over the past few years, we have seen fundamental breakthroughs in core problems in machine learning, largely driven by advances in deep neural networks.

Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML

2 code implementations ICLR 2020 Aniruddh Raghu, Maithra Raghu, Samy Bengio, Oriol Vinyals

We conclude with a discussion of the rapid learning vs feature reuse question for meta-learning algorithms more broadly.

Few-Shot Image Classification

The Algorithmic Automation Problem: Prediction, Triage, and Human Effort

no code implementations · 28 Mar 2019 · Maithra Raghu, Katy Blumer, Greg Corrado, Jon Kleinberg, Ziad Obermeyer, Sendhil Mullainathan

In a wide array of areas, algorithms are matching and surpassing the performance of human experts, leading to consideration of the roles of human judgment and algorithmic prediction in these domains.
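A minimal confidence-based triage rule, assuming a fixed human-effort budget; the function and thresholding scheme are illustrative, not the paper's estimator.

```python
import numpy as np

def triage(confidences, human_budget):
    """Route the least-confident `human_budget` fraction of cases to human
    experts; the algorithm handles the rest. Returns a boolean mask."""
    k = int(round(human_budget * len(confidences)))
    order = np.argsort(confidences)              # least confident first
    mask = np.zeros(len(confidences), dtype=bool)
    mask[order[:k]] = True
    return mask

conf = np.array([0.99, 0.55, 0.80, 0.62, 0.97])
to_human = triage(conf, human_budget=0.4)        # route 2 of the 5 cases
```

The interesting question the paper raises is when such triage beats both full automation and full human review, given that human and algorithmic errors are made on different cases.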

Transfusion: Understanding Transfer Learning for Medical Imaging

2 code implementations NeurIPS 2019 Maithra Raghu, Chiyuan Zhang, Jon Kleinberg, Samy Bengio

Investigating the learned representations and features, we find that some of the differences from transfer learning are due to the over-parametrization of standard models rather than sophisticated feature reuse.

Image Classification · Transfer Learning

Direct Uncertainty Prediction for Medical Second Opinions

no code implementations · 4 Jul 2018 · Maithra Raghu, Katy Blumer, Rory Sayres, Ziad Obermeyer, Robert Kleinberg, Sendhil Mullainathan, Jon Kleinberg

Our central methodological finding is that Direct Uncertainty Prediction (DUP), training a model to predict an uncertainty score directly from the raw patient features, works better than Uncertainty Via Classification, the two-step process of training a classifier and postprocessing the output distribution to give an uncertainty score.

General Classification

Insights on representational similarity in neural networks with canonical correlation

2 code implementations NeurIPS 2018 Ari S. Morcos, Maithra Raghu, Samy Bengio

Comparing representations in neural networks is fundamentally difficult as the structure of representations varies greatly, even across groups of networks trained on identical tasks, and over the course of training.

Adversarial Spheres

2 code implementations ICLR 2018 Justin Gilmer, Luke Metz, Fartash Faghri, Samuel S. Schoenholz, Maithra Raghu, Martin Wattenberg, Ian Goodfellow

We hypothesize that this counterintuitive behavior is a naturally occurring result of the high dimensional geometry of the data manifold.

Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games?

1 code implementation ICML 2018 Maithra Raghu, Alex Irpan, Jacob Andreas, Robert Kleinberg, Quoc V. Le, Jon Kleinberg

Deep reinforcement learning has achieved many recent successes, but our understanding of its strengths and limitations is hampered by the lack of rich environments in which we can fully characterize optimal behavior, and correspondingly diagnose individual actions against such a characterization.
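These games make attractive RL environments precisely because optimal play is governed by a potential function. The classical Erdős–Selfridge potential for Maker–Breaker games sketches the idea (the paper's attacker–defender variants use an analogous level-based potential; the code below is the classical version, not the paper's environment).

```python
from fractions import Fraction

def potential(sets, maker, breaker):
    """Erdos-Selfridge potential: sum of 2^-(elements Maker still needs) over
    winning sets Breaker has not yet blocked. If it is below 1/2 with Maker
    to move, Breaker has a winning strategy."""
    phi = Fraction(0)
    for A in sets:
        if A & breaker:
            continue                             # Breaker already blocked A
        phi += Fraction(1, 2 ** len(A - maker))
    return phi

def breaker_move(sets, maker, breaker):
    """Greedy optimal move: claim the unclaimed element whose removal most
    reduces the potential."""
    unclaimed = set().union(*sets) - maker - breaker
    return min(unclaimed, key=lambda e: potential(sets, maker, breaker | {e}))

winning_sets = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})]
phi0 = potential(winning_sets, maker=set(), breaker=set())   # 3 * (1/4)
```

Because the potential certifies optimal actions at every state, an RL agent's individual moves can be graded against ground truth, which is exactly the diagnostic ability the abstract says most environments lack.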

SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability

3 code implementations NeurIPS 2017 Maithra Raghu, Justin Gilmer, Jason Yosinski, Jascha Sohl-Dickstein

We propose a new technique, Singular Vector Canonical Correlation Analysis (SVCCA), a tool for quickly comparing two representations in a way that is both invariant to affine transform (allowing comparison between different layers and networks) and fast to compute (allowing more comparisons to be calculated than with previous methods).
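The two steps are literal: an SVD-based reduction of each layer's activations, followed by CCA between the reduced subspaces. A minimal numpy sketch — the 99% variance threshold follows common SVCCA practice, while the QR-based CCA and function names are our implementation choices:

```python
import numpy as np

def svcca(X, Y, var_kept=0.99):
    """SVCCA sketch: (1) keep the top singular directions of each centered
    (examples x neurons) matrix explaining `var_kept` of the variance,
    (2) run CCA on the two reduced matrices, (3) average the canonical
    correlations."""
    def reduce(A):
        A = A - A.mean(axis=0)
        U, s, _ = np.linalg.svd(A, full_matrices=False)
        k = int(np.searchsorted(np.cumsum(s**2) / np.sum(s**2), var_kept)) + 1
        return U[:, :k] * s[:k]
    Qx, _ = np.linalg.qr(reduce(X))
    Qy, _ = np.linalg.qr(reduce(Y))
    # Canonical correlations = singular values of the cross-product of the
    # orthonormal bases of the two reduced subspaces.
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False).mean()

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))      # one layer's activations on 200 examples
M = rng.normal(size=(10, 10))       # an invertible linear map (w.h.p.)
```

The affine invariance claimed in the abstract is visible here: `svcca(X, X @ M)` stays near 1 for any invertible `M`, so layers can be compared regardless of how their neurons are linearly mixed.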

Linear Additive Markov Processes

no code implementations · 5 Apr 2017 · Ravi Kumar, Maithra Raghu, Tamas Sarlos, Andrew Tomkins

We introduce LAMP: the Linear Additive Markov Process.
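In a LAMP, the next state is generated by first sampling how far back in the history to look, then taking one step of a single shared Markov chain from that past state. A sketch from this definition (variable names ours; lag 0 denotes the most recent state):

```python
import numpy as np

def lamp_step(history, P, weights, rng):
    """One LAMP transition: sample a lag i with probability weights[i], then
    take a single Markov step from the state observed i steps back, using one
    shared transition matrix P."""
    i = rng.choice(len(weights), p=weights)
    prev = history[-(i + 1)]
    return rng.choice(P.shape[0], p=P[prev])

rng = np.random.default_rng(0)
P = np.array([[0.9, 0.1], [0.1, 0.9]])       # shared transition matrix
weights = np.array([0.7, 0.3])               # mostly depend on the last state
history = [0, 0]
for _ in range(5):
    history.append(lamp_step(history, P, weights, rng))
```

The appeal is parsimony: long-range dependence comes from the lag weights alone, while the transition structure stays a single first-order matrix.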

Survey of Expressivity in Deep Neural Networks

no code implementations · 24 Nov 2016 · Maithra Raghu, Ben Poole, Jon Kleinberg, Surya Ganguli, Jascha Sohl-Dickstein

This quantity grows exponentially in the depth of the network, and is responsible for the depth sensitivity observed.
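The quantity in question is the length of an input trajectory as it propagates through the layers. A sketch measuring this growth for a random tanh network (width, depth, and weight scale are arbitrary illustrative choices):

```python
import numpy as np

def trajectory_length(points):
    """Arc length of a piecewise-linear trajectory of activations."""
    return np.linalg.norm(np.diff(points, axis=0), axis=1).sum()

rng = np.random.default_rng(0)
width, depth, sigma_w = 100, 6, 4.0
# Input trajectory: a circle embedded in the first two input coordinates.
t = np.linspace(0.0, 2.0 * np.pi, 500)
h = np.zeros((500, width))
h[:, 0], h[:, 1] = np.cos(t), np.sin(t)

lengths = [trajectory_length(h)]
for _ in range(depth):
    W = rng.normal(0.0, sigma_w / np.sqrt(width), size=(width, width))
    h = np.tanh(h @ W)                         # one random tanh layer
    lengths.append(trajectory_length(h))
```

Plotting `lengths` against layer index shows the roughly geometric growth the abstract describes: each layer stretches and folds the trajectory by a multiplicative factor.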

Exponential expressivity in deep neural networks through transient chaos

1 code implementation NeurIPS 2016 Ben Poole, Subhaneil Lahiri, Maithra Raghu, Jascha Sohl-Dickstein, Surya Ganguli

We combine Riemannian geometry with the mean field theory of high dimensional chaos to study the nature of signal propagation in generic, deep neural networks with random weights.
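At the heart of this mean-field analysis is a one-dimensional recursion for the per-neuron activation variance, q_{l+1} = sigma_w^2 E_z[phi(sqrt(q_l) z)^2] + sigma_b^2 with z ~ N(0, 1). A sketch of the recursion (the quadrature grid and iteration count are our numerical choices):

```python
import numpy as np

def variance_map(q, sigma_w, sigma_b, phi=np.tanh):
    """One step of the mean-field length map:
    q_{l+1} = sigma_w^2 * E_z[phi(sqrt(q_l) z)^2] + sigma_b^2,  z ~ N(0, 1),
    with the Gaussian expectation taken by quadrature on a fixed grid."""
    z = np.linspace(-8.0, 8.0, 4001)
    gauss = np.exp(-z**2 / 2) / np.sqrt(2 * np.pi)
    expectation = (phi(np.sqrt(q) * z) ** 2 * gauss).sum() * (z[1] - z[0])
    return sigma_w**2 * expectation + sigma_b**2

q = 0.5
for _ in range(50):
    q = variance_map(q, sigma_w=2.0, sigma_b=0.3)   # iterate to the fixed point
```

The map converges quickly to a fixed point q*; the ordered/chaotic transition the paper studies is then determined by how correlations between two inputs evolve around that fixed point.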

On the Expressive Power of Deep Neural Networks

no code implementations ICML 2017 Maithra Raghu, Ben Poole, Jon Kleinberg, Surya Ganguli, Jascha Sohl-Dickstein

We propose a new approach to the problem of neural network expressivity, which seeks to characterize how structural properties of a neural network family affect the functions it is able to compute.
