Search Results for author: Nikhil Ghosh

Found 8 papers, 2 papers with code

LoRA+: Efficient Low Rank Adaptation of Large Models

1 code implementation • 19 Feb 2024 • Soufiane Hayou, Nikhil Ghosh, Bin Yu

In this paper, we show that Low Rank Adaptation (LoRA) as originally introduced in Hu et al. (2021) leads to suboptimal finetuning of models with large width (embedding dimension).
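
As I read it, the remedy LoRA+ proposes for this suboptimality is to train the two adapter matrices with different learning rates, with a much larger rate for B than for A. A minimal PyTorch sketch of that scheme follows; the rank, learning rate, and ratio below are illustrative placeholders, not the paper's tuned values.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a rank-r LoRA adapter (B @ A)."""
    def __init__(self, base: nn.Linear, r: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():        # pretrained weights stay frozen
            p.requires_grad_(False)
        d_out, d_in = base.weight.shape
        self.A = nn.Parameter(torch.randn(r, d_in) / d_in ** 0.5)  # random init
        self.B = nn.Parameter(torch.zeros(d_out, r))               # zero init

    def forward(self, x):
        return self.base(x) + x @ self.A.T @ self.B.T

layer = LoRALinear(nn.Linear(1024, 1024))
lr_A, ratio = 1e-4, 16                  # ratio >> 1 is the LoRA+ ingredient
optimizer = torch.optim.AdamW([
    {"params": [layer.A], "lr": lr_A},          # smaller rate for A
    {"params": [layer.B], "lr": lr_A * ratio},  # larger rate for B
])
```

Setting ratio = 1 recovers standard LoRA with a single shared learning rate.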

The Effect of SGD Batch Size on Autoencoder Learning: Sparsity, Sharpness, and Feature Learning

no code implementations • 6 Aug 2023 • Nikhil Ghosh, Spencer Frei, Wooseok Ha, Bin Yu

On the other hand, for any batch size strictly smaller than the number of samples, SGD finds a global minimum which is sparse and nearly orthogonal to its initialization, showing that the randomness of stochastic gradients induces a qualitatively different type of "feature selection" in this setting.

Feature Selection
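
One quick way to poke at this claim numerically is a toy tied-weight ReLU autoencoder of my own construction (not the paper's setting or proof): train with full-batch gradient descent versus batch size 1 and compare how aligned each solution stays with its random initialization.

```python
import torch

torch.manual_seed(0)
n, d = 8, 64
X = torch.randn(n, d)                        # a few Gaussian training samples

def train(batch_size, steps=3000, lr=0.01):
    W = (torch.randn(d, d) / d ** 0.5).requires_grad_()
    W0 = W.detach().clone()
    opt = torch.optim.SGD([W], lr=lr)
    for _ in range(steps):
        idx = torch.randperm(n)[:batch_size]
        recon = torch.relu(X[idx] @ W) @ W.T  # tied-weight ReLU autoencoder
        loss = ((recon - X[idx]) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():                     # alignment with initialization
        return torch.cosine_similarity(W.flatten(), W0.flatten(), dim=0).item()

print("full batch, alignment with init:", train(batch_size=n))
print("batch size 1, alignment with init:", train(batch_size=1))
```

Whether and how large a gap shows up depends on the architecture and step size; the paper's analysis pins down the regime where the small-batch solution is sparse and nearly orthogonal to its initialization.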

The Power of External Memory in Increasing Predictive Model Capacity

no code implementations • 31 Jan 2023 • Cenk Baykal, Dylan J Cutler, Nishanth Dikkala, Nikhil Ghosh, Rina Panigrahy, Xin Wang

One way of introducing sparsity into deep networks is by attaching an external table of parameters that is sparsely looked up at different layers of the network.

Language Modelling
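
A minimal sketch of the general idea, not the authors' architecture: an external table of (key, value) parameters that a layer consults sparsely via a top-k lookup, so only a handful of table rows contribute to any forward pass. The names and the routing rule here are my own illustration.

```python
import torch
import torch.nn as nn

class ExternalMemory(nn.Module):
    def __init__(self, d_model: int, table_size: int = 4096, k: int = 4):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(table_size, d_model) / d_model ** 0.5)
        self.values = nn.Parameter(torch.zeros(table_size, d_model))
        self.k = k

    def forward(self, h):                        # h: (batch, d_model)
        scores = h @ self.keys.T                 # (batch, table_size)
        top, idx = scores.topk(self.k, dim=-1)   # consult only k table rows
        gate = top.softmax(dim=-1)               # (batch, k)
        return h + (gate.unsqueeze(-1) * self.values[idx]).sum(dim=1)

h = torch.randn(2, 256)
print(ExternalMemory(256)(h).shape)              # torch.Size([2, 256])
```

The appeal of this pattern is that the table adds parameters (capacity) without a matching increase in per-token compute, since only k of its rows are touched per input.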

A Universal Trade-off Between the Model Size, Test Loss, and Training Loss of Linear Predictors

no code implementations • 23 Jul 2022 • Nikhil Ghosh, Mikhail Belkin

Remarkably, while the Marchenko-Pastur analysis is far more precise near the interpolation peak, where the number of parameters is just enough to fit the training data, it coincides exactly with the distribution-independent bound as the level of overparametrization increases.
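
The interpolation peak itself is easy to reproduce numerically. Here is a toy NumPy illustration (a misspecified linear model with a varying number of features; my own setup, not the paper's exact assumptions): the test loss of the minimum-norm least-squares predictor spikes as the feature count p approaches the sample count n and falls again as p grows past it.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, n_test = 100, 400, 2000
w_star = rng.normal(size=d) / np.sqrt(d)            # true linear signal
X, Xt = rng.normal(size=(n, d)), rng.normal(size=(n_test, d))
y = X @ w_star + 0.1 * rng.normal(size=n)           # noisy labels
yt = Xt @ w_star

for p in (20, 50, 90, 100, 110, 200, 400):          # number of features used
    w = np.linalg.pinv(X[:, :p]) @ y                # minimum-norm least squares
    train_loss = np.mean((X[:, :p] @ w - y) ** 2)
    test_loss = np.mean((Xt[:, :p] @ w - yt) ** 2)
    print(f"p={p:3d}  train={train_loss:.4f}  test={test_loss:.4f}")
```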

Deconstructing Distributions: A Pointwise Framework of Learning

1 code implementation • 20 Feb 2022 • Gal Kaplun, Nikhil Ghosh, Saurabh Garg, Boaz Barak, Preetum Nakkiran

In this work, we propose a new approach: we measure the performance of a collection of models when evaluated on a single input point.
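
A sketch of what this looks like operationally, with toy models and data of my own choosing (the framework, not this particular setup, is the paper's contribution): train a collection of models on bootstrap resamples and record how each one treats one fixed input.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = (X[:, 0] + 0.5 * rng.normal(size=500) > 0).astype(int)
x_pt, y_pt = X[0], y[0]                 # the single input point under study

correct = []
for _ in range(50):                     # a collection of independently trained models
    idx = rng.choice(500, size=300, replace=True)
    clf = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
    correct.append(clf.predict(x_pt[None])[0] == y_pt)

print("fraction of models correct on this point:", np.mean(correct))
```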

Landmark Ordinal Embedding

no code implementations • NeurIPS 2019 • Nikhil Ghosh, Yuxin Chen, Yisong Yue

In this paper, we aim to learn a low-dimensional Euclidean representation from a set of constraints of the form "item j is closer to item i than item k".

Computational Efficiency
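
For intuition, here is a generic triplet-loss embedding sketch for constraints of that form. This naive gradient-descent baseline only shows the objective; the paper's landmark-based algorithm is designed to scale far better.

```python
import torch

torch.manual_seed(0)
n_items, dim = 50, 2
Z_true = torch.randn(n_items, dim)    # ground-truth positions (unknown in practice)

# Sample triplets (i, j, k) labeled so that j is the item closer to i.
t = torch.randint(0, n_items, (2000, 3))
d_ij = (Z_true[t[:, 0]] - Z_true[t[:, 1]]).norm(dim=1)
d_ik = (Z_true[t[:, 0]] - Z_true[t[:, 2]]).norm(dim=1)
swap = d_ij > d_ik
i = t[:, 0]
j = torch.where(swap, t[:, 2], t[:, 1])
k = torch.where(swap, t[:, 1], t[:, 2])

Z = torch.randn(n_items, dim, requires_grad=True)   # embedding to be learned
opt = torch.optim.Adam([Z], lr=0.05)
for _ in range(500):
    close = (Z[i] - Z[j]).norm(dim=1)
    far = (Z[i] - Z[k]).norm(dim=1)
    loss = torch.relu(close - far + 1.0).mean()     # hinge: want close < far
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    viol = ((Z[i] - Z[j]).norm(dim=1) >= (Z[i] - Z[k]).norm(dim=1)).float().mean()
print("fraction of constraints violated:", viol.item())
```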
