1 code implementation • 9 Jan 2024 • Adityanarayanan Radhakrishnan, Mikhail Belkin, Dmitriy Drusvyatskiy
A possible explanation is that common training algorithms for neural networks implicitly perform dimensionality reduction, a process called feature learning.
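A concrete object used in this line of work to quantify such feature learning is the average gradient outer product (AGOP) of a trained model. Below is a minimal PyTorch sketch, assuming a scalar-output model; the architecture and data are illustrative, not the paper's.

```python
import torch

def agop(model, X):
    """Average gradient outer product (AGOP) of a scalar-output model
    over a batch X of shape (n, d). A rapidly decaying eigenspectrum
    means the model is sensitive to only a few input directions,
    i.e. it has implicitly performed dimensionality reduction."""
    X = X.clone().requires_grad_(True)
    out = model(X).sum()                      # sum => per-example gradients
    (grads,) = torch.autograd.grad(out, X)    # (n, d)
    return grads.T @ grads / X.shape[0]       # (d, d)

# illustrative usage with a small MLP
model = torch.nn.Sequential(torch.nn.Linear(10, 64), torch.nn.ReLU(),
                            torch.nn.Linear(64, 1))
M = agop(model, torch.randn(256, 10))
print(torch.linalg.eigvalsh(M))  # a few large eigenvalues => low-rank structure
```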
1 code implementation • 1 Sep 2023 • Daniel Beaglehole, Adityanarayanan Radhakrishnan, Parthe Pandit, Mikhail Belkin
We then demonstrate the generality of our result by using the patch-based AGOP to enable deep feature learning in convolutional kernel machines.
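As a rough illustration of the patch-based variant, one can unfold the model's input gradient into patches and average the outer products over all patch positions. This is a simplified sketch, not the paper's exact construction; in particular, gradients here are taken with respect to pixels and then regrouped into patches.

```python
import torch
import torch.nn.functional as F

def patch_agop(model, X, k=3):
    """Patch-based AGOP sketch for image inputs X of shape (n, c, h, w):
    average outer products of the model's input gradient over all
    k x k patches, giving a (c*k*k, c*k*k) matrix."""
    X = X.clone().requires_grad_(True)
    out = model(X).sum()
    (g,) = torch.autograd.grad(out, X)        # (n, c, h, w)
    patches = F.unfold(g, kernel_size=k)      # (n, c*k*k, num_patches)
    d = patches.shape[1]
    patches = patches.permute(0, 2, 1).reshape(-1, d)  # one row per patch
    return patches.T @ patches / patches.shape[0]

# illustrative usage with a tiny convolutional network
net = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3, padding=1), torch.nn.ReLU(),
                          torch.nn.Flatten(), torch.nn.Linear(8 * 16 * 16, 1))
M = patch_agop(net, torch.randn(4, 3, 16, 16))
```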
no code implementations • 7 Jun 2023 • Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin
In this paper, we first present an explanation for the common occurrence of spikes in the training loss when neural networks are trained with stochastic gradient descent (SGD).
3 code implementations • 28 Dec 2022 • Adityanarayanan Radhakrishnan, Daniel Beaglehole, Parthe Pandit, Mikhail Belkin
In recent years neural networks have achieved impressive results on many technological and scientific tasks.
no code implementations • 1 Nov 2022 • Adityanarayanan Radhakrishnan, Max Ruiz Luyten, Neha Prasad, Caroline Uhler
In this work, we propose a transfer learning framework for kernel methods by projecting and translating the source model to the target task.
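The sketch below shows one way the projection and translation steps could look for kernel ridge regression; the synthetic data and the exact form of both steps are illustrative assumptions, not the paper's precise procedure.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

# hypothetical source/target data; in practice these come from two tasks
rng = np.random.default_rng(0)
Xs, ys = rng.normal(size=(500, 20)), rng.normal(size=500)  # large source task
Xt, yt = rng.normal(size=(50, 20)), rng.normal(size=50)    # small target task

# 1. Train the source kernel model.
src = KernelRidge(kernel="laplacian", alpha=1e-3).fit(Xs, ys)

# 2. "Project": evaluate the source model on target inputs, then fit a
#    simple map from source predictions to target labels.
proj = KernelRidge(kernel="laplacian", alpha=1e-3).fit(
    src.predict(Xt).reshape(-1, 1), yt)

# 3. "Translate": correct the remaining error with a kernel model
#    trained on the target residuals.
resid = yt - proj.predict(src.predict(Xt).reshape(-1, 1))
trans = KernelRidge(kernel="laplacian", alpha=1e-3).fit(Xt, resid)

def predict(X):
    base = proj.predict(src.predict(X).reshape(-1, 1))
    return base + trans.predict(X)
```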
1 code implementation • 24 May 2022 • Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin
While neural networks can be approximated by linear models as their width increases, certain properties of wide neural networks cannot be captured by linear models.
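A natural candidate beyond the linear (NTK) approximation is the second-order Taylor expansion of the network around its initialization; a standard way to write such a quadratic model, assuming the usual second-order setup:

```latex
% Quadratic model of a network f around initialization w_0; the linear
% (NTK) model keeps only the first two terms.
f_{\mathrm{quad}}(w; x) = f(w_0; x) + \nabla_w f(w_0; x)^\top (w - w_0)
  + \tfrac{1}{2}\, (w - w_0)^\top \nabla_w^2 f(w_0; x) \, (w - w_0)
```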
no code implementations • 29 Apr 2022 • Adityanarayanan Radhakrishnan, Mikhail Belkin, Caroline Uhler
In this work, we identify and construct an explicit set of neural network classifiers that achieve optimality.
no code implementations • 30 Dec 2021 • Adityanarayanan Radhakrishnan, Mikhail Belkin, Caroline Uhler
Establishing a fast rate of convergence for optimization methods is crucial to their applicability in practice.
1 code implementation • 31 Jul 2021 • Adityanarayanan Radhakrishnan, George Stefanakis, Mikhail Belkin, Caroline Uhler
Remarkably, taking the width of a neural network to infinity allows for improved computational performance.
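The computational gain comes from the fact that an infinitely wide network reduces to kernel regression with its neural tangent kernel (NTK), which for a one-hidden-layer ReLU network has a well-known closed form (up to normalization). A sketch on synthetic data; the data and ridge parameter are illustrative:

```python
import numpy as np

def relu_ntk(X, Z):
    """Closed-form NTK of a one-hidden-layer ReLU network (up to an
    overall scaling): Theta(x, z) = ||x|| ||z|| k1(u) + <x, z> k0(u),
    where u is the cosine of the angle between x and z."""
    nx = np.linalg.norm(X, axis=1, keepdims=True)   # (n, 1)
    nz = np.linalg.norm(Z, axis=1, keepdims=True)   # (m, 1)
    u = np.clip(X @ Z.T / (nx * nz.T), -1.0, 1.0)
    k0 = (np.pi - np.arccos(u)) / np.pi
    k1 = (u * (np.pi - np.arccos(u)) + np.sqrt(1 - u ** 2)) / np.pi
    return (nx * nz.T) * k1 + (X @ Z.T) * k0

# kernel ridge regression with the exact infinite-width kernel
rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 5)), rng.normal(size=200)
Xtest = rng.normal(size=(50, 5))
alpha = np.linalg.solve(relu_ntk(X, X) + 1e-6 * np.eye(len(X)), y)
preds = relu_ntk(Xtest, X) @ alpha
```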
no code implementations • 29 Jun 2021 • Saachi Jain, Adityanarayanan Radhakrishnan, Caroline Uhler
Aligned latent spaces, where meaningful semantic shifts in the input space correspond to a translation in the embedding space, play an important role in the success of downstream tasks such as unsupervised clustering and data imputation.
no code implementations • 1 Jan 2021 • Adityanarayanan Radhakrishnan, Neha Prasad, Caroline Uhler
While deep networks have produced state-of-the-art results in several domains from image classification to machine translation, hyper-parameter selection remains a significant computational bottleneck.
no code implementations • 19 Oct 2020 • Eshaan Nichani, Adityanarayanan Radhakrishnan, Caroline Uhler
We then present a novel linear regression framework for characterizing the impact of depth on test risk, and show that increasing depth leads to a U-shaped test risk for the linear CNTK.
no code implementations • 28 Sep 2020 • Eshaan Nichani, Adityanarayanan Radhakrishnan, Caroline Uhler
Recent work provided an explanation for this phenomenon by introducing the double descent curve, showing that increasing model capacity past the interpolation threshold leads to a decrease in test error.
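A minimal random-features demonstration of the curve (an illustrative setup, not the paper's experiment): min-norm least squares on random ReLU features, with test error peaking near the interpolation threshold p ≈ n and descending again beyond it.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, n_test = 100, 20, 1000
w_star = rng.normal(size=d)
X, Xt = rng.normal(size=(n, d)), rng.normal(size=(n_test, d))
y = X @ w_star + 0.5 * rng.normal(size=n)   # noisy training labels
yt = Xt @ w_star

for p in [10, 50, 90, 100, 110, 200, 1000]:   # number of random ReLU features
    W = rng.normal(size=(d, p))
    Phi, Phit = np.maximum(X @ W, 0), np.maximum(Xt @ W, 0)
    beta = np.linalg.pinv(Phi) @ y            # min-norm least-squares solution
    err = np.mean((Phit @ beta - yt) ** 2)
    print(f"p={p:5d}  test MSE={err:.3f}")    # peaks near p ~ n, then descends
```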
no code implementations • 28 Sep 2020 • Adityanarayanan Radhakrishnan, Mikhail Belkin, Caroline Uhler
The following questions are fundamental to understanding the properties of over-parameterization in modern machine learning: (1) Under what conditions and at what rate does training converge to a global minimum?
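For question (1), a standard sufficient condition in this literature is the Polyak-Lojasiewicz (PL) inequality, stated here as context together with the classical convergence rate it implies under smoothness:

```latex
% PL condition: for some \mu > 0 and all w,
\tfrac{1}{2}\, \lVert \nabla L(w) \rVert^2 \;\ge\; \mu \left( L(w) - L^* \right).
% Under the PL condition and \beta-smoothness of L, gradient descent with
% step size 1/\beta converges linearly to a global minimum:
L(w_t) - L^* \;\le\; \left( 1 - \tfrac{\mu}{\beta} \right)^{t} \left( L(w_0) - L^* \right).
```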
no code implementations • 18 Sep 2020 • Adityanarayanan Radhakrishnan, Mikhail Belkin, Caroline Uhler
Generalized mirror descent (GMD) subsumes popular first-order optimization methods including gradient descent, mirror descent, and preconditioned gradient descent methods such as Adagrad.
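A sketch of the mirror descent update and how the listed methods arise as special cases; the mirror maps below are standard textbook examples, and the function names are illustrative rather than the paper's notation.

```python
import numpy as np

def gmd_step(w, grad_L, eta, grad_psi, grad_psi_inv):
    """One mirror descent step in the generalized setting:
        grad_psi(w_next) = grad_psi(w) - eta * grad_L(w)
    grad_psi is the mirror map and grad_psi_inv its inverse."""
    return grad_psi_inv(grad_psi(w) - eta * grad_L)

# Special cases (sketch):
#   gradient descent:        grad_psi = identity
#   preconditioned GD:       grad_psi(w) = P @ w  for positive definite P
#   exponentiated gradient:  grad_psi(w) = log(w)  (entropy-type mirror map,
#                            giving an unnormalized multiplicative update)

w = np.ones(5) / 5
grad_L = np.array([0.1, -0.2, 0.0, 0.3, -0.1])

# plain gradient descent as a special case
w_gd = gmd_step(w, grad_L, 0.1, lambda v: v, lambda v: v)

# mirror descent with the entropy-type potential (multiplicative update)
w_md = gmd_step(w, grad_L, 0.1, lambda v: np.log(v), lambda v: np.exp(v))
```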
no code implementations • 13 Mar 2020 • Adityanarayanan Radhakrishnan, Eshaan Nichani, Daniel Bernstein, Caroline Uhler
We define alignment for fully connected networks with multidimensional outputs and show that it is a natural extension of alignment in networks with 1-dimensional outputs as defined by Ji and Telgarsky (2018).
1 code implementation • 26 Sep 2019 • Adityanarayanan Radhakrishnan, Mikhail Belkin, Caroline Uhler
Identifying computational mechanisms for memorization and retrieval of data is a long-standing problem at the intersection of machine learning and neuroscience.
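In this line of work, the proposed mechanism is that overparameterized autoencoders store training examples as attractors of the learned map, so retrieval amounts to iterating the network from a corrupted input. A hedged toy sketch; whether a given example actually becomes an attractor depends on training details:

```python
import torch

torch.manual_seed(0)
X = torch.randn(5, 20)                      # five "training images"
ae = torch.nn.Sequential(torch.nn.Linear(20, 256), torch.nn.ReLU(),
                         torch.nn.Linear(256, 20))
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
for _ in range(5000):                       # train to near-zero reconstruction loss
    opt.zero_grad()
    loss = ((ae(X) - X) ** 2).mean()
    loss.backward()
    opt.step()

z = X[0] + 0.1 * torch.randn(20)            # corrupted query
with torch.no_grad():
    for _ in range(100):                    # retrieval = iterating the map
        z = ae(z)
print(torch.norm(z - X[0]))                 # small if X[0] is an attractor
```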
no code implementations • ICLR 2019 • Adityanarayanan Radhakrishnan, Caroline Uhler, Mikhail Belkin
In this paper, we link memorization of images in deep convolutional autoencoders to downsampling through strided convolution.
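For reference, the downsampling mechanism in question is simply convolution with stride greater than 1, which reduces spatial resolution inside the network:

```python
import torch

# A stride-2 convolution halves each spatial dimension: an explicit
# dimensionality reduction built into the architecture.
x = torch.randn(1, 3, 32, 32)
conv = torch.nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1)
print(conv(x).shape)   # torch.Size([1, 8, 16, 16])
```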
no code implementations • ICML Workshop Deep Phenomena 2019 • Adityanarayanan Radhakrishnan, Karren Yang, Mikhail Belkin, Caroline Uhler
The ability of deep neural networks to generalize well in the overparameterized regime has become a subject of significant research interest.
no code implementations • 23 May 2017 • Adityanarayanan Radhakrishnan, Charles Durham, Ali Soylemezoglu, Caroline Uhler
Understanding how a complex machine learning model makes a classification decision is essential for its acceptance in sensitive areas such as health care.