# Generating Long Sequences with Sparse Transformers

Transformers are powerful sequence models, but they require time and memory that grow quadratically with the sequence length.
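To make the quadratic growth concrete, here is a minimal sketch that counts the pairwise scores in one dense self-attention matrix. It assumes a single head in a single layer and 4-byte (float32) scores; the function names are hypothetical and chosen only for illustration.

```python
def attention_matrix_entries(seq_len: int) -> int:
    """Pairwise scores in one dense self-attention matrix (seq_len x seq_len)."""
    return seq_len * seq_len


def attention_matrix_bytes(seq_len: int, bytes_per_score: int = 4) -> int:
    """Memory for that matrix, assuming float32 scores (an assumption, not from the source)."""
    return attention_matrix_entries(seq_len) * bytes_per_score


if __name__ == "__main__":
    # Each doubling of the sequence length quadruples the entry count.
    for n in (1024, 2048, 4096):
        mib = attention_matrix_bytes(n) / 2**20
        print(f"seq_len={n}: {attention_matrix_entries(n)} scores, {mib:.0f} MiB")
```

Doubling the sequence length quadruples both the number of scores and the memory needed to hold them, which is the cost that grows out of reach for long sequences.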

SOTA for Image Generation on CIFAR-10 (NLL Test metric)
