no code implementations • 29 Sep 2021 • Matej Kosec, Sheng Fu, Mario Michael Krell
The shortest-pack-first histogram-packing (SPFHP) algorithm determines the packing order for the Wikipedia dataset of over 16M sequences in 0.02 seconds.
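As a rough illustration of the "shortest pack first" idea, the sketch below walks a sequence-length histogram from longest to shortest and places each sequence into the pack that currently has the most free space, opening a new pack when nothing fits. This is a simplified worst-fit heuristic written for this listing, not the authors' implementation: the function name `pack_worst_fit` and the parameter `max_len` are hypothetical, and because it handles sequences one at a time it does not reproduce the histogram-level speed quoted above.

```python
import heapq
from collections import Counter


def pack_worst_fit(lengths, max_len):
    """Group sequence lengths into packs whose total length is at most max_len.

    Simplified sketch: sequences are visited from longest to shortest and each
    one goes into the pack with the most remaining space (the "shortest" pack).
    """
    histogram = Counter(lengths)   # sequence length -> number of sequences
    heap = []                      # entries: (-remaining_space, pack_index)
    packs = []                     # packs[i] is a list of sequence lengths

    for length in sorted(histogram, reverse=True):
        if length > max_len:
            raise ValueError(f"sequence of length {length} exceeds max_len")
        for _ in range(histogram[length]):
            if heap and -heap[0][0] >= length:
                # The emptiest existing pack has room: reuse it.
                neg_remaining, index = heapq.heappop(heap)
                remaining = -neg_remaining - length
            else:
                # Not even the emptiest pack fits this sequence: open a new one.
                index = len(packs)
                packs.append([])
                remaining = max_len - length
            packs[index].append(length)
            heapq.heappush(heap, (-remaining, index))
    return packs


if __name__ == "__main__":
    for pack in pack_worst_fit([5, 4, 3, 2, 2], max_len=8):
        print(pack, "total =", sum(pack))
```

A max-heap keyed on remaining space makes the "most free space" lookup cheap. A histogram-level variant would process all sequences of one length together instead of looping over them individually.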
1 code implementation • NeurIPS 2021 • Mario Michael Krell, Matej Kosec, Sergio P. Perez, Andrew Fitzgibbon
We show in this paper that the variation in sequence lengths in common NLP datasets is such that up to 50% of all tokens can be padding.
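To make the padding claim concrete, the following minimal sketch computes the share of padding tokens when every sequence is padded to a common length. The function name `padding_fraction` and the example lengths are illustrative assumptions, not data from the paper.

```python
def padding_fraction(lengths, max_len=None):
    """Fraction of token slots that are padding when all sequences are
    padded to max_len (defaults to the longest sequence)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    if max_len is None:
        max_len = max(lengths)
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if any(length > max_len for length in lengths):
        raise ValueError("a sequence is longer than max_len")
    total_slots = len(lengths) * max_len   # tokens after padding
    real_tokens = sum(lengths)             # tokens before padding
    return 1 - real_tokens / total_slots


if __name__ == "__main__":
    # Hypothetical lengths: 1024 real tokens in 4 * 512 = 2048 slots.
    print(padding_fraction([128, 512, 64, 320], max_len=512))
```

With these made-up lengths, half of the slots are padding, which matches the order of magnitude the abstract describes.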