no code implementations • 13 Mar 2025 • Dibyakanti Kumar, Samyak Jha, Anirbit Mukherjee
In this work, we establish that the Langevin Monte-Carlo algorithm can learn depth-2 neural nets of any size, for any data, and we give non-asymptotic convergence rates for it.
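For intuition only, the sketch below shows Langevin Monte-Carlo run on the squared-loss empirical risk of a small depth-2 tanh net. The architecture, step size `eta`, inverse temperature `beta`, and finite-difference gradient are illustrative assumptions, not the paper's exact setup or rates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n samples in d dimensions, scalar targets (illustrative).
n, d, width = 64, 5, 16
X = rng.standard_normal((n, d))
y = np.sin(X @ rng.standard_normal(d))

def unpack(theta):
    W = theta[:width * d].reshape(width, d)   # inner-layer weights
    a = theta[width * d:]                     # outer-layer weights
    return W, a

def loss(theta):
    W, a = unpack(theta)
    return 0.5 * np.mean((np.tanh(X @ W.T) @ a - y) ** 2)

def grad(theta, eps=1e-5):
    # Finite-difference gradient, to keep the sketch dependency-free.
    g = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta); e[i] = eps
        g[i] = (loss(theta + e) - loss(theta - e)) / (2 * eps)
    return g

# Langevin Monte-Carlo: a gradient step plus Gaussian noise scaled by 1/beta.
theta = rng.standard_normal(width * d + width) * 0.1
eta, beta = 1e-2, 1e4   # illustrative step size and inverse temperature
for k in range(500):
    noise = rng.standard_normal(theta.shape)
    theta = theta - eta * grad(theta) + np.sqrt(2 * eta / beta) * noise

print("final empirical risk:", loss(theta))
```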
1 code implementation • 30 Jul 2024 • Mohan Ren, Zhihao Fang, Keren Li, Anirbit Mukherjee
Physics-Informed Neural Networks (PINNs) are the general method that has evolved for this task, but their training is well known to be very unstable.
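As a rough illustration of the kind of objective involved (a generic sketch, not the PDEs or training scheme studied in this paper), a PINN for the 1D Poisson problem $u''(x) = f(x)$ with zero boundary conditions minimizes a PDE-residual loss plus a boundary loss:

```python
import math
import torch

torch.manual_seed(0)

# Small fully connected net approximating the PDE solution u(x).
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)

f = lambda x: -(math.pi ** 2) * torch.sin(math.pi * x)  # source term
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    x = torch.rand(128, 1, requires_grad=True)        # interior collocation points
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    residual = ((d2u - f(x)) ** 2).mean()             # PDE residual loss

    xb = torch.tensor([[0.0], [1.0]])                 # boundary points
    boundary = (net(xb) ** 2).mean()                  # enforces u(0) = u(1) = 0

    loss = residual + boundary
    opt.zero_grad(); loss.backward(); opt.step()
```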
1 code implementation • 12 Apr 2024 • Matteo Tucat, Anirbit Mukherjee
In this work, we instantiate a regularized form of the gradient clipping algorithm and prove that it can converge to the global minima of deep neural network loss functions provided that the net is of sufficient width.
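The paper's exact regularized clipping rule is not reproduced here; the sketch below shows standard norm-clipped gradient descent on a toy objective, with a hypothetical floor `delta` on the clipping factor standing in for the regularization (an assumption for illustration only).

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy least-squares objective standing in for a neural-net loss (illustrative).
A = rng.standard_normal((50, 10))
b = rng.standard_normal(50)
grad = lambda w: A.T @ (A @ w - b) / len(b)

def clipped_step(w, eta=0.1, gamma=1.0, delta=0.05):
    g = grad(w)
    norm = np.linalg.norm(g)
    # Standard clipping scales the step by min(1, gamma/||g||); the delta
    # floor below is a hypothetical regularizer keeping the effective step
    # size bounded away from zero (an assumption, not the paper's exact rule).
    scale = max(min(1.0, gamma / (norm + 1e-12)), delta)
    return w - eta * scale * g

w = rng.standard_normal(10)
for _ in range(500):
    w = clipped_step(w)
print("loss:", 0.5 * np.mean((A @ w - b) ** 2))
```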
no code implementations • 8 Oct 2023 • Dibyakanti Kumar, Anirbit Mukherjee
Physics-Informed Neural Networks (PINNs) have achieved increasingly impressive feats of numerically solving complicated PDEs, while offering an attractive trade-off between accuracy and speed of inference.
no code implementations • 7 Oct 2023 • Hongbo Zhu, Angelo Cangelosi, Procheta Sen, Anirbit Mukherjee
This data-efficiency manifests in LIPEx computing its explanation matrix around 53% faster than all-class LIME in classification experiments with text data.
no code implementations • 17 Sep 2023 • Pulkit Gopalani, Samyak Jha, Anirbit Mukherjee
In this note, we give a first-of-its-kind proof that SGD converges to the global minima of appropriately regularized logistic empirical risk of depth $2$ nets -- for arbitrary data and for any number of gates with adequately smooth and bounded activations such as sigmoid and tanh.
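A minimal sketch of the kind of object being analyzed: SGD on an $\ell_2$-regularized logistic loss of a depth-2 sigmoid net. The width, data, and regularization constant `lam` below are illustrative assumptions, not the paper's constants.

```python
import torch

torch.manual_seed(0)

# Depth-2 net with sigmoid gates (width, data, and lam are illustrative).
n, d, width, lam = 256, 10, 8, 1e-2
X = torch.randn(n, d)
y = (torch.randn(n) > 0).float()   # arbitrary binary labels

W = torch.randn(width, d, requires_grad=True)   # inner-layer weights
a = torch.randn(width, requires_grad=True)      # outer-layer weights

opt = torch.optim.SGD([W, a], lr=0.1)
loss_fn = torch.nn.BCEWithLogitsLoss()

for step in range(2000):
    idx = torch.randint(0, n, (32,))            # minibatch sampling
    logits = torch.sigmoid(X[idx] @ W.T) @ a
    # Logistic empirical risk plus an explicit l2 regularizer on the weights.
    loss = loss_fn(logits, y[idx]) + lam * (W.pow(2).sum() + a.pow(2).sum())
    opt.zero_grad(); loss.backward(); opt.step()
```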
no code implementations • 11 Aug 2023 • Anirbit Mukherjee, Amartya Roy
Deep Operator Networks are an increasingly popular paradigm for solving regression in infinite dimensions, and hence for solving families of PDEs in one shot.
no code implementations • 20 Oct 2022 • Pulkit Gopalani, Anirbit Mukherjee
In this note, we consider the appropriately regularized $\ell_2$-empirical risk of depth $2$ nets with any number of gates and show bounds on how the empirical loss evolves along SGD iterates on it -- for arbitrary data and for any activation that is adequately smooth and bounded, such as sigmoid or tanh.
1 code implementation • 23 May 2022 • Pulkit Gopalani, Sayar Karmakar, Dibyakanti Kumar, Anirbit Mukherjee
In recent times, machine learning methods have made significant advances toward becoming a useful tool for analyzing physical systems.
no code implementations • 26 Apr 2022 • Sayar Karmakar, Anirbit Mukherjee
while training a ReLU gate (in the realizable and in the binary classification setup) and for a variant of SGD.
1 code implementation • 1 Nov 2021 • Soham Dan, Anirbit Mukherjee, Avirup Das, Phanideep Gampa
Training various state-of-the-art neural networks on SVHN, CIFAR-10 and CIFAR-100, we demonstrate that our new proposal $S_{\rm rel}$, as opposed to the original definition, much more sharply detects the property of the weight updates preferring to make prediction changes within the same class as the sampled data.
no code implementations • NeurIPS Workshop DLDE 2021 • Pulkit Gopalani, Anirbit Mukherjee
DeepONets [1] are one of the most prominent ideas in this theme; they entail an optimization over a space of inner products of neural nets.
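A minimal sketch of the inner-product parameterization referenced here: the DeepONet output is an inner product between a "branch" net acting on samples of the input function and a "trunk" net acting on the query point. The widths and sensor count below are illustrative assumptions.

```python
import torch

torch.manual_seed(0)

m, p = 100, 32   # number of input-function sensor points, inner-product width

# Branch net: encodes the input function sampled at m fixed sensor locations.
branch = torch.nn.Sequential(
    torch.nn.Linear(m, 64), torch.nn.Tanh(), torch.nn.Linear(64, p))
# Trunk net: encodes the query location y at which the output is evaluated.
trunk = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, p))

def deeponet(u_samples, y):
    # Output G(u)(y) is the inner product of the two embeddings.
    return (branch(u_samples) * trunk(y)).sum(dim=-1)

u = torch.randn(16, m)        # a batch of 16 input functions, each sampled at m points
y = torch.rand(16, 1)         # one query point per input function
print(deeponet(u, y).shape)   # torch.Size([16])
```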
1 code implementation • 28 Apr 2021 • Anirbit Mukherjee
In chapter 2 we show new circuit complexity theorems for deep neural functions and prove classification theorems about these function spaces which in turn lead to exact algorithms for empirical risk minimization for depth 2 ReLU nets.
no code implementations • 8 May 2020 • Sayar Karmakar, Anirbit Mukherjee
In this work, we demonstrate provable guarantees on the training of a single ReLU gate in hitherto unexplored regimes.
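For intuition only, here is a minimal sketch of the realizable setup: labels are produced by a ground-truth ReLU gate, and a GLM-tron-style iterative update recovers the weights. This update rule and the data sizes are stand-in assumptions, not the paper's exact algorithm or regimes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Realizable setup: labels come from a ground-truth ReLU gate w_star.
d, n = 10, 1000
w_star = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = np.maximum(X @ w_star, 0.0)

w = np.zeros(d)
for t in range(200):
    preds = np.maximum(X @ w, 0.0)
    # GLM-tron-style update: average residual times input, without the ReLU
    # derivative (a stand-in, not the paper's exact algorithm).
    w = w - (1.0 / n) * X.T @ (preds - y)

print("distance to ground truth:", np.linalg.norm(w - w_star))
```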
1 code implementation • 4 May 2020 • Sayar Karmakar, Anirbit Mukherjee, Theodore Papamarkou
In this class of networks, we attempt to learn the network weights in the presence of a malicious oracle that makes stochastic, bounded and additive adversarial distortions to the true output during training.
no code implementations • ICLR 2019 • Soham De, Anirbit Mukherjee, Enayat Ullah
Through these experiments we demonstrate the interesting sensitivity that ADAM has to its momentum parameter $\beta_1$.
no code implementations • 8 Nov 2017 • Anirbit Mukherjee, Amitabh Basu
We use the method of sign-rank to show lower bounds that are exponential in the dimension for ReLU circuits ending in an LTF gate and of depth up to $O(n^{\xi})$ with $\xi < \frac{1}{8}$, with some restrictions on the weights in the bottommost layer.
no code implementations • 12 Aug 2017 • Akshay Rangamani, Anirbit Mukherjee, Amitabh Basu, Tejaswini Ganapathy, Ashish Arora, Sang Chin, Trac. D. Tran
This property holds independently of the loss function.
no code implementations • ICLR 2018 • Raman Arora, Amitabh Basu, Poorya Mianjy, Anirbit Mukherjee
In this paper we investigate the family of functions representable by deep neural networks (DNN) with rectified linear units (ReLU).