Search Results for author: Adityanarayanan Radhakrishnan

Found 21 papers, 6 papers with code

Linear Recursive Feature Machines provably recover low-rank matrices

1 code implementation • 9 Jan 2024 • Adityanarayanan Radhakrishnan, Mikhail Belkin, Dmitriy Drusvyatskiy

A possible explanation is that common training algorithms for neural networks implicitly perform dimensionality reduction - a process called feature learning.

Tasks: Dimensionality Reduction, Low-Rank Matrix Completion (+1)

Mechanism of feature learning in convolutional neural networks

1 code implementation • 1 Sep 2023 • Daniel Beaglehole, Adityanarayanan Radhakrishnan, Parthe Pandit, Mikhail Belkin

We then demonstrate the generality of our result by using the patch-based AGOP to enable deep feature learning in convolutional kernel machines.
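
The AGOP referred to here is the average gradient outer product of a trained predictor: average the outer products of the predictor's input gradients over the training data, and read the top eigendirections of the resulting matrix as the features the model has learned. Below is a minimal sketch of the plain (non-patch) AGOP for a toy two-layer network; the network, data, and names are illustrative and not taken from the authors' code.

```python
import numpy as np

def two_layer_net(x, W, v):
    """Toy predictor f(x) = v^T relu(W x), used only to have gradients to average."""
    h = np.maximum(W @ x, 0.0)
    return v @ h

def input_gradient(x, W, v):
    """Gradient of f with respect to the input x: W^T (v * 1[Wx > 0])."""
    mask = (W @ x > 0.0).astype(float)
    return W.T @ (v * mask)

def agop(X, W, v):
    """Average gradient outer product over a dataset X (rows are inputs)."""
    d = X.shape[1]
    M = np.zeros((d, d))
    for x in X:
        g = input_gradient(x, W, v)
        M += np.outer(g, g)
    return M / len(X)

# Example: the AGOP of a random two-layer net on random data;
# its top eigenvectors are the candidate "learned features".
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
W = rng.normal(size=(32, 10)) / np.sqrt(10)
v = rng.normal(size=32) / np.sqrt(32)
M = agop(X, W, v)
eigvals, eigvecs = np.linalg.eigh(M)
print("top AGOP eigenvalues:", eigvals[-3:])
```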

Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning

no code implementations • 7 Jun 2023 • Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin

In this paper, we first present an explanation regarding the common occurrence of spikes in the training loss when neural networks are trained with stochastic gradient descent (SGD).
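
As background (not the paper's own analysis, which runs through feature learning), loss spikes are usually discussed relative to the stability threshold of gradient descent on a locally quadratic loss: along a Hessian eigendirection with curvature lambda, a step is contractive only when the step size stays below 2/lambda, so a transient rise of the sharpness above 2/eta shows up as a spike in the training loss.

```latex
% Gradient descent with step size \eta on a locally quadratic loss,
% written along a Hessian eigendirection with eigenvalue \lambda
% (standard stability fact, stated only as background).
\[
  w_{t+1} - w_\ast = (1 - \eta\lambda)\,(w_t - w_\ast)
  \quad\Longrightarrow\quad
  |1 - \eta\lambda| < 1 \iff \eta < \frac{2}{\lambda}.
\]
```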

Transfer Learning with Kernel Methods

no code implementations • 1 Nov 2022 • Adityanarayanan Radhakrishnan, Max Ruiz Luyten, Neha Prasad, Caroline Uhler

In this work, we propose a transfer learning framework for kernel methods by projecting and translating the source model to the target task.

Tasks: Image Classification, Transfer Learning
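
One heavily hedged reading of the "translating" step: fit a kernel ridge regression on the source task, then correct it on the target task by fitting a second kernel model to the source model's residuals on the target data. The sketch below is a generic residual-correction illustration under that reading, not necessarily the paper's exact procedure; all names and hyperparameters are illustrative.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.1):
    """Gaussian kernel matrix between rows of A and rows of B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

def krr_fit(X, y, reg=1e-3, gamma=0.1):
    """Kernel ridge regression: returns dual coefficients alpha."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + reg * np.eye(len(X)), y)

def krr_predict(alpha, X_train, X_test, gamma=0.1):
    return rbf_kernel(X_test, X_train, gamma) @ alpha

# Source task: plenty of data.
rng = np.random.default_rng(0)
Xs = rng.normal(size=(500, 5)); ys = np.sin(Xs[:, 0]) + 0.1 * rng.normal(size=500)
alpha_s = krr_fit(Xs, ys)

# Target task: few samples from a shifted function.
Xt = rng.normal(size=(30, 5)); yt = np.sin(Xt[:, 0]) + 0.5 * Xt[:, 1]

# "Translate": fit a second kernel model to the source model's residuals on the target data.
resid = yt - krr_predict(alpha_s, Xs, Xt)
alpha_corr = krr_fit(Xt, resid)

def transfer_predict(X_new):
    return krr_predict(alpha_s, Xs, X_new) + krr_predict(alpha_corr, Xt, X_new)

print(transfer_predict(Xt[:3]))
```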

Quadratic models for understanding catapult dynamics of neural networks

1 code implementation • 24 May 2022 • Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin

While neural networks can be approximated by linear models as their width increases, certain properties of wide neural networks cannot be captured by linear models.
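
As I read it, the "quadratic model" keeps one more term of the Taylor expansion of the network in its weights than the linear (NTK) approximation does; the extra term lets the effective curvature change during training, which a linear model cannot capture. Stated as the standard expansions around initialization (background only, not the paper's exact notation):

```latex
% Linear (NTK) and quadratic approximations of a network f(w; x)
% around initialization w_0.
\[
  f_{\mathrm{lin}}(w; x) = f(w_0; x) + \nabla_w f(w_0; x)^{\top} (w - w_0),
\]
\[
  f_{\mathrm{quad}}(w; x) = f_{\mathrm{lin}}(w; x)
    + \tfrac{1}{2}\, (w - w_0)^{\top}\, \nabla_w^2 f(w_0; x)\, (w - w_0).
\]
```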

Wide and Deep Neural Networks Achieve Optimality for Classification

no code implementations • 29 Apr 2022 • Adityanarayanan Radhakrishnan, Mikhail Belkin, Caroline Uhler

In this work, we identify and construct an explicit set of neural network classifiers that achieve optimality.

Tasks: Classification

Local Quadratic Convergence of Stochastic Gradient Descent with Adaptive Step Size

no code implementations • 30 Dec 2021 • Adityanarayanan Radhakrishnan, Mikhail Belkin, Caroline Uhler

Establishing a fast rate of convergence for optimization methods is crucial to their applicability in practice.
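
One widely used adaptive rule of this flavor is the stochastic Polyak step size, where each step is scaled by the sampled loss divided by the squared norm of the sampled gradient; whether this matches the exact rule analyzed in the paper is an assumption on my part. The sketch below only illustrates that mechanism on noiseless least squares (an interpolation setting).

```python
import numpy as np

def sgd_polyak(X, y, n_steps=2000, seed=0):
    """SGD on 0.5*(x_i^T w - y_i)^2 with the stochastic Polyak step size
    eta_t = f_i(w) / ||grad f_i(w)||^2 (assumes each per-sample optimum is 0,
    i.e. an interpolation setting)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_steps):
        i = rng.integers(n)
        r = X[i] @ w - y[i]            # residual on the sampled example
        loss_i = 0.5 * r**2
        grad_i = r * X[i]
        g2 = grad_i @ grad_i
        if g2 > 1e-12:                 # skip already-fit samples
            w -= (loss_i / g2) * grad_i
    return w

# Noiseless linear data, so every per-sample loss can be driven to zero.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
w_star = rng.normal(size=10)
y = X @ w_star
w_hat = sgd_polyak(X, y)
print("parameter error:", np.linalg.norm(w_hat - w_star))
```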

A Mechanism for Producing Aligned Latent Spaces with Autoencoders

no code implementations • 29 Jun 2021 • Saachi Jain, Adityanarayanan Radhakrishnan, Caroline Uhler

Aligned latent spaces, where meaningful semantic shifts in the input space correspond to a translation in the embedding space, play an important role in the success of downstream tasks such as unsupervised clustering and data imputation.

Tasks: Clustering, Imputation (+1)
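
In symbols, the property being described is roughly that a fixed semantic transformation of the input corresponds to a fixed translation in embedding space, independent of which input is transformed. The following is an informal paraphrase in my own notation, not the paper's formal definition.

```latex
% Informal statement of an "aligned" latent space for an encoder E:
% a semantic transformation T of the input corresponds to a fixed
% translation v_T in embedding space.
\[
  E\bigl(T(x)\bigr) \approx E(x) + v_T \qquad \text{for all inputs } x .
\]
```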

LLBoost: Last Layer Perturbation to Boost Pre-trained Neural Networks

no code implementations • 1 Jan 2021 • Adityanarayanan Radhakrishnan, Neha Prasad, Caroline Uhler

While deep networks have produced state-of-the-art results in several domains from image classification to machine translation, hyper-parameter selection remains a significant computational bottleneck.

Tasks: Image Classification, Machine Translation
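
My hedged reading of the idea: perturb the last linear layer only in directions lying in the null space of the training feature matrix, so training outputs are unchanged while validation performance can move. The sketch below implements that null-space construction under this reading; it is not the paper's exact algorithm, and all names (including the commented-out selection step) are illustrative.

```python
import numpy as np

def nullspace_perturbations(features_train, w, n_trials=50, scale=0.1, seed=0):
    """Generate perturbed last-layer weight vectors w + delta with delta
    constrained to the null space of the training feature matrix, so that
    features_train @ (w + delta) equals features_train @ w up to numerical
    error (training predictions unchanged). Generic sketch only."""
    rng = np.random.default_rng(seed)
    # Orthonormal basis of the null space via the SVD of the feature matrix.
    _, s, Vt = np.linalg.svd(features_train, full_matrices=True)
    rank = int(np.sum(s > 1e-10))
    null_basis = Vt[rank:].T              # shape (d, d - rank); empty if full rank
    candidates = [w]
    for _ in range(n_trials):
        coeffs = rng.normal(size=null_basis.shape[1])
        delta = scale * null_basis @ coeffs
        candidates.append(w + delta)
    return candidates

# One would then keep the candidate with the best validation score, e.g.
#   best = max(candidates, key=lambda wc: accuracy(features_val @ wc, y_val))
# where accuracy, features_val, and y_val are placeholders for your own pipeline.
```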

Increasing Depth Leads to U-Shaped Test Risk in Over-parameterized Convolutional Networks

no code implementations • 19 Oct 2020 • Eshaan Nichani, Adityanarayanan Radhakrishnan, Caroline Uhler

We then present a novel linear regression framework for characterizing the impact of depth on test risk, and show that increasing depth leads to a U-shaped test risk for the linear CNTK.

Tasks: Image Classification, Open-Ended Question Answering (+1)

Do Deeper Convolutional Networks Perform Better?

no code implementations • 28 Sep 2020 • Eshaan Nichani, Adityanarayanan Radhakrishnan, Caroline Uhler

Recent work provided an explanation for this phenomenon by introducing the double descent curve, showing that increasing model capacity past the interpolation threshold leads to a decrease in test error.

Tasks: Learning Theory

Linear Convergence and Implicit Regularization of Generalized Mirror Descent with Time-Dependent Mirrors

no code implementations • 28 Sep 2020 • Adityanarayanan Radhakrishnan, Mikhail Belkin, Caroline Uhler

The following questions are fundamental to understanding the properties of over-parameterization in modern machine learning: (1) Under what conditions and at what rate does training converge to a global minimum?

Linear Convergence of Generalized Mirror Descent with Time-Dependent Mirrors

no code implementations • 18 Sep 2020 • Adityanarayanan Radhakrishnan, Mikhail Belkin, Caroline Uhler

Generalized mirror descent (GMD) subsumes popular first-order optimization methods, including gradient descent, mirror descent, and preconditioned gradient descent methods such as Adagrad.
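
To see how one update rule can cover those cases, consider the special case of a quadratic, time-dependent mirror potential, which reduces GMD to gradient descent preconditioned by a time-varying matrix; the general formulation in these papers allows arbitrary time-dependent mirrors and is broader than this sketch. The preconditioner names below are illustrative.

```python
import numpy as np

def gmd_quadratic(grad_fn, w0, precond_fn, lr=0.1, n_steps=100):
    """GMD-style update for the special case of a quadratic time-dependent
    mirror potential Phi_t(w) = 0.5 * w^T G_t w, which gives
        w_{t+1} = w_t - lr * G_t^{-1} grad L(w_t).
    precond_fn(t, grad_history) returns the diagonal of G_t."""
    w = w0.copy()
    grads = []
    for t in range(n_steps):
        g = grad_fn(w)
        grads.append(g)
        G_diag = precond_fn(t, grads)
        w = w - lr * g / G_diag
    return w

# Gradient descent: G_t = I.
plain_gd = lambda t, grads: np.ones_like(grads[-1])

# Adagrad-style preconditioning: G_t = diag(sqrt(sum of squared past gradients)).
adagrad = lambda t, grads: np.sqrt(np.sum(np.square(grads), axis=0)) + 1e-8

# Toy least-squares problem to exercise both instances.
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 5)); b = rng.normal(size=50)
grad_fn = lambda w: A.T @ (A @ w - b) / len(b)
w0 = np.zeros(5)
print("GD solution:      ", gmd_quadratic(grad_fn, w0, plain_gd))
print("Adagrad solution: ", gmd_quadratic(grad_fn, w0, adagrad, lr=0.5, n_steps=500))
```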

On Alignment in Deep Linear Neural Networks

no code implementations • 13 Mar 2020 • Adityanarayanan Radhakrishnan, Eshaan Nichani, Daniel Bernstein, Caroline Uhler

We define alignment for fully connected networks with multidimensional outputs and show that it is a natural extension of alignment in networks with 1-dimensional outputs as defined by Ji and Telgarsky, 2018.
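
For reference, the 1-dimensional-output notion from Ji and Telgarsky (2018), as I understand it and stated informally: during training each weight matrix of a deep linear network becomes effectively rank one, and the singular vectors of adjacent layers match up, so the end-to-end product behaves like a single rank-one map.

```latex
% Informal paraphrase of layer alignment for a deep linear network
% W_L \cdots W_1 with 1-dimensional output (background only; the paper
% extends this to multidimensional outputs).
\[
  W_l \;\to\; s_l\, u_l v_l^{\top}, \qquad v_{l+1} \;\to\; u_l
  \quad (l = 1, \dots, L-1).
\]
```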

Overparameterized Neural Networks Implement Associative Memory

1 code implementation • 26 Sep 2019 • Adityanarayanan Radhakrishnan, Mikhail Belkin, Caroline Uhler

Identifying computational mechanisms for memorization and retrieval of data is a long-standing problem at the intersection of machine learning and neuroscience.

Tasks: Memorization, Retrieval
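
The retrieval mechanism studied here is iteration of the trained map: train an overparameterized autoencoder on a few points, then repeatedly apply it to a corrupted input and check whether the iterates settle near a training example. The following is a minimal end-to-end sketch; the architecture, sizes, and training schedule are illustrative, and whether a given run actually produces attractors is an empirical matter.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])   # 3 training points in R^2
d, m = 2, 256                                           # input dim, hidden width
W1 = rng.normal(size=(m, d)) * 0.5; b1 = np.zeros(m)
W2 = rng.normal(size=(d, m)) * 0.05; b2 = np.zeros(d)

def forward(x):
    """One-hidden-layer autoencoder f(x) = W2 tanh(W1 x + b1) + b2."""
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2, h

lr = 0.05
for step in range(20000):                               # full-batch gradient descent on MSE
    gW1 = np.zeros_like(W1); gb1 = np.zeros_like(b1)
    gW2 = np.zeros_like(W2); gb2 = np.zeros_like(b2)
    for x in X:
        out, h = forward(x)
        e = out - x                                     # reconstruction error
        gW2 += np.outer(e, h); gb2 += e
        dpre = (W2.T @ e) * (1 - h**2)                  # backprop through tanh
        gW1 += np.outer(dpre, x); gb1 += dpre
    W1 -= lr * gW1 / len(X); b1 -= lr * gb1 / len(X)
    W2 -= lr * gW2 / len(X); b2 -= lr * gb2 / len(X)

# "Retrieval": start from a corrupted input and repeatedly apply the trained map.
z = X[0] + 0.3 * rng.normal(size=d)
for _ in range(200):
    z, _ = forward(z)
nearest = X[np.argmin(np.linalg.norm(X - z, axis=1))]
print("iterates ended at:", z, " nearest training point:", nearest)
```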

Overparameterized Neural Networks Can Implement Associative Memory

no code implementations • 25 Sep 2019 • Adityanarayanan Radhakrishnan, Mikhail Belkin, Caroline Uhler

Identifying computational mechanisms for memorization and retrieval is a long-standing problem at the intersection of machine learning and neuroscience.

Tasks: Memorization, Retrieval

Downsampling leads to Image Memorization in Convolutional Autoencoders

no code implementations • ICLR 2019 • Adityanarayanan Radhakrishnan, Caroline Uhler, Mikhail Belkin

In this paper, we link memorization of images in deep convolutional autoencoders to downsampling through strided convolution.

Tasks: Memorization
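
The downsampling in question is just the dimension reduction a strided convolution performs: with stride greater than one, the layer maps many input positions to fewer outputs, so it cannot be inverted in general. A one-dimensional sketch of the output-size arithmetic, purely illustrative:

```python
import numpy as np

def strided_conv1d(x, kernel, stride):
    """Valid 1-D convolution with the given stride; the output length is
    floor((len(x) - len(kernel)) / stride) + 1, so any stride > 1 shrinks
    the representation."""
    k = len(kernel)
    out_len = (len(x) - k) // stride + 1
    return np.array([np.dot(x[i * stride:i * stride + k], kernel) for i in range(out_len)])

x = np.arange(16, dtype=float)
print(len(strided_conv1d(x, np.ones(3), stride=1)))   # 14 outputs
print(len(strided_conv1d(x, np.ones(3), stride=2)))   # 7 outputs: downsampled
```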

Memorization in Overparameterized Autoencoders

no code implementations • ICML Workshop Deep Phenomena 2019 • Adityanarayanan Radhakrishnan, Karren Yang, Mikhail Belkin, Caroline Uhler

The ability of deep neural networks to generalize well in the overparameterized regime has become a subject of significant research interest.

Tasks: Inductive Bias, Memorization

Patchnet: Interpretable Neural Networks for Image Classification

no code implementations • 23 May 2017 • Adityanarayanan Radhakrishnan, Charles Durham, Ali Soylemezoglu, Caroline Uhler

Understanding how a complex machine learning model makes a classification decision is essential for its acceptance in sensitive areas such as health care.

Tasks: BIG-bench Machine Learning, Classification (+2)
