48 code implementations • ICLR 2022 • Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen
We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks.
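The idea can be illustrated with a minimal sketch, assuming PyTorch: the pre-trained weight is frozen and a trainable low-rank update is added in parallel. The class name `LoRALinear` and the `rank`/`alpha` parameters below are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable rank-decomposition update (illustrative sketch)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pre-trained weights stay frozen

        in_f, out_f = base.in_features, base.out_features
        # Rank-decomposition factors: A starts small and random, B starts at zero,
        # so the adapted layer initially matches the pre-trained one exactly.
        self.lora_A = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_f, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus the low-rank trainable path; only A and B receive gradients.
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling


if __name__ == "__main__":
    layer = LoRALinear(nn.Linear(768, 768), rank=8)
    y = layer(torch.randn(2, 768))
    print(y.shape)  # torch.Size([2, 768])
```

With `rank` much smaller than the layer width, the trainable parameter count per layer drops from `in_f * out_f` to `rank * (in_f + out_f)`, which is the source of the parameter savings the abstract describes.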
1 code implementation • 6 Sep 2019 • MohamadAli Torkamani, Shiv Shankar, Amirmohammad Rooshenas, Phillip Wallis
Most deep neural networks use simple, fixed activation functions, such as sigmoids or rectified linear units, regardless of domain or network structure.
no code implementations • 19 May 2019 • MohamadAli Torkamani, Phillip Wallis, Shiv Shankar, Amirmohammad Rooshenas
Most deep neural networks use simple, fixed activation functions, such as sigmoids or rectified linear units, regardless of domain or network structure.
no code implementations • 27 Sep 2018 • MohamadAli Torkamani, Phillip Wallis
Most deep neural networks use simple, fixed activation functions, such as sigmoids or rectified linear units, regardless of domain or network structure.