Search Results for author: Caio C. T. Mendes

Found 1 paper, 1 paper with code

LiteTransformerSearch: Training-free Neural Architecture Search for Efficient Language Models

1 code implementation · 4 Mar 2022 · Mojan Javaheripi, Gustavo H. de Rosa, Subhabrata Mukherjee, Shital Shah, Tomasz L. Religa, Caio C. T. Mendes, Sebastien Bubeck, Farinaz Koushanfar, Debadeepta Dey

Results show that the perplexity of a 16-layer GPT-2 and Transformer-XL can be achieved with up to 1.5x and 2.5x faster runtime and 1.2x and 2.0x lower peak memory utilization, respectively.

Tasks: Decoder · Language Modelling · +1
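
The core idea of the paper's training-free search is to rank candidate decoder architectures with a zero-cost proxy instead of training each one; the paper reports that decoder parameter count correlates strongly with final perplexity. Below is a minimal, illustrative sketch of that ranking step. The function name, search-space values, and sample size are assumptions for illustration and are not the paper's exact setup.

```python
import itertools
import random

def decoder_param_count(n_layer, d_model, d_inner):
    """Approximate non-embedding (decoder) parameter count of a
    GPT-2-style transformer: attention + feed-forward weights per layer.
    (Illustrative proxy; the paper's accounting may differ in detail.)"""
    attn = 4 * d_model * d_model   # Q, K, V, and output projections
    ffn = 2 * d_model * d_inner    # the two feed-forward matrices
    return n_layer * (attn + ffn)

# A toy search space of decoder configurations (values are hypothetical).
space = list(itertools.product(
    [4, 8, 12, 16],            # n_layer
    [256, 512, 768, 1024],     # d_model
    [1024, 2048, 3072, 4096],  # d_inner
))

candidates = random.sample(space, 16)

# Rank candidates by the proxy alone -- no training or backprop involved.
# Higher decoder parameter count serves as a stand-in for lower perplexity.
ranked = sorted(candidates, key=lambda c: decoder_param_count(*c), reverse=True)
for cfg in ranked[:3]:
    print(cfg, decoder_param_count(*cfg))
```

In the full method, proxy scores like these would be traded off against measured latency and peak memory on the target hardware to pick efficient architectures, which is where the reported 1.5x-2.5x runtime and 1.2x-2.0x memory gains come from.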
