Search Results for author: Edward J. Hu

Found 2 papers, 2 papers with code

LoRA: Low-Rank Adaptation of Large Language Models

1 code implementation • 17 Jun 2021 • Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Weizhu Chen

We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks.

Language Modelling
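
As a rough illustration of the approach summarized in this entry, here is a minimal PyTorch sketch of a linear layer whose pre-trained weight stays frozen while a trainable rank-r update B·A is learned. The class name, dimensions, and hyperparameters (r, alpha) are illustrative choices, not taken from the paper or its released code.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen dense weight plus a trainable low-rank update (illustrative sketch)."""
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        # Stand-in for a pre-trained weight; frozen during fine-tuning.
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02,
                                   requires_grad=False)
        # Rank-r decomposition delta_W = B @ A; B starts at zero so the layer
        # initially behaves exactly like the frozen pre-trained layer.
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.02)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        frozen = x @ self.weight.T
        update = (x @ self.lora_A.T) @ self.lora_B.T
        return frozen + self.scaling * update

layer = LoRALinear(768, 768, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable params: {trainable} of {total}")  # 12,288 of 602,112
```

Only the two small matrices receive gradients, which is why the number of task-specific trainable parameters stays small relative to the frozen layer.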

Feature Learning in Infinite-Width Neural Networks

1 code implementation • 30 Nov 2020 • Greg Yang, Edward J. Hu

However, we show that the standard and NTK parametrizations of a neural network do not admit infinite-width limits that can learn features, which is crucial for pretraining and transfer learning such as with BERT.

Few-Shot Learning • Transfer Learning
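
The abstract's claim can be illustrated numerically: under the NTK parametrization, one gradient step moves the hidden representation of a fixed input by only about a 1/sqrt(width) relative amount, so feature movement vanishes as the width grows. The sketch below is an assumption-laden illustration (the two-layer architecture, widths, and learning rate are arbitrary choices), not an experiment from the paper.

```python
import torch

torch.manual_seed(0)

def feature_shift(width, lr=1.0, d=16):
    # Two-layer net in NTK parametrization: h = W x / sqrt(d), f = v . relu(h) / sqrt(width).
    x = torch.randn(d)
    W = torch.randn(width, d, requires_grad=True)
    v = torch.randn(width, requires_grad=True)

    h0 = (W @ x) / d**0.5                      # hidden features before the update
    f = (v @ torch.relu(h0)) / width**0.5
    f.backward()                               # gradient of the network output

    with torch.no_grad():
        W -= lr * W.grad                       # one SGD step on both layers
        v -= lr * v.grad
        h1 = (W @ x) / d**0.5                  # hidden features after the update

    return (h1 - h0).norm() / h0.norm()        # relative feature movement

for n in [256, 1024, 4096, 16384]:
    print(n, float(feature_shift(n)))          # shrinks roughly like 1/sqrt(n)
```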
