Search Results for author: Thomas Bachlechner

Found 1 paper, 1 paper with code

ReZero is All You Need: Fast Convergence at Large Depth

13 code implementations · 10 Mar 2020 · Thomas Bachlechner, Bodhisattwa Prasad Majumder, Huanru Henry Mao, Garrison W. Cottrell, Julian McAuley

Deep networks often suffer from vanishing or exploding gradients due to inefficient signal propagation, leading to long training times or convergence difficulties.

Language Modelling
