# When BERT Plays the Lottery, All Tickets Are Winning

1 May 2020 · Sai Prasanna, Anna Rogers, Anna Rumshisky

Much of the recent success in NLP is due to large Transformer-based models such as BERT (Devlin et al., 2019). However, these models have been shown to be reducible to a smaller number of self-attention heads and layers...
