Search Results for author: Darshil Doshi

Found 3 papers, 1 paper with code

To grok or not to grok: Disentangling generalization and memorization on corrupted algorithmic datasets

1 code implementation • 19 Oct 2023 • Darshil Doshi, Aritra Das, Tianyu He, Andrey Gromov

Robust generalization is a major challenge in deep learning, particularly when the number of trainable parameters is very large.

Memorization

AutoInit: Automatic Initialization via Jacobian Tuning

no code implementations • 27 Jun 2022 • Tianyu He, Darshil Doshi, Andrey Gromov

Good initialization is essential for training Deep Neural Networks (DNNs).

Critical Initialization of Wide and Deep Neural Networks through Partial Jacobians: General Theory and Applications

no code implementations • 23 Nov 2021 • Darshil Doshi, Tianyu He, Andrey Gromov

We derive recurrence relations for the norms of partial Jacobians and utilize these relations to analyze criticality of deep fully connected neural networks with LayerNorm and/or residual connections.
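A minimal sketch, not the authors' code, of how the Frobenius norm of a partial Jacobian ∂h^l/∂h^{l0} for a deep fully connected network could be estimated numerically in PyTorch; the width, depth, nonlinearity, and layer indices below are illustrative assumptions, not values from the paper.

```python
# Illustrative estimate of the partial-Jacobian norm between two hidden layers
# of a plain tanh MLP. Hyperparameters are placeholders, not from the paper.
import torch

torch.manual_seed(0)

width, depth = 256, 10
layers = [torch.nn.Sequential(torch.nn.Linear(width, width), torch.nn.Tanh())
          for _ in range(depth)]

def forward_from(h, start, stop):
    """Propagate an activation h from layer index `start` up to (excluding) `stop`."""
    for layer in layers[start:stop]:
        h = layer(h)
    return h

x = torch.randn(width)
l0, l = 2, 8                      # measure the partial Jacobian between these layers
h_l0 = forward_from(x, 0, l0)     # activation entering layer l0

# Full Jacobian of h^l with respect to h^{l0}; at large width one would more
# likely use random-vector probing, but the explicit Jacobian keeps this short.
J = torch.autograd.functional.jacobian(lambda h: forward_from(h, l0, l), h_l0)
print("||J||_F^2 / width =", (J ** 2).sum().item() / width)
```

Tracking how this norm grows or shrinks with depth is one way to probe criticality empirically; the paper derives the corresponding behavior analytically via recurrence relations.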
