Transformers

Charformer is a Transformer model that learns a subword tokenization end-to-end as part of the model. Specifically, it uses a Gradient-Based Subword Tokenization (GBST) module that automatically learns latent subword representations from characters in a data-driven fashion. The resulting soft subword sequence is then passed through standard Transformer layers.

Source: Charformer: Fast Character Transformers via Gradient-based Subword Tokenization
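
Below is a minimal PyTorch sketch of the GBST idea, not the reference implementation: the block sizes, mean-pooling, single linear scoring layer, and downsampling rate are illustrative assumptions, and details from the paper such as block score calibration are omitted.

```python
# Minimal GBST-style sketch (illustrative assumptions, simplified from the paper).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GBSTSketch(nn.Module):
    def __init__(self, dim, block_sizes=(1, 2, 3, 4), downsample_rate=2):
        super().__init__()
        self.block_sizes = block_sizes
        self.downsample_rate = downsample_rate
        self.score = nn.Linear(dim, 1)  # scores each candidate block representation

    def forward(self, char_embeddings):
        # char_embeddings: (batch, seq_len, dim) character-level embeddings
        b, n, d = char_embeddings.shape
        candidates = []
        for size in self.block_sizes:
            # Pad so the sequence length is divisible by the block size.
            pad = (size - n % size) % size
            x = F.pad(char_embeddings, (0, 0, 0, pad))
            # Mean-pool non-overlapping blocks of `size` characters.
            pooled = x.view(b, -1, size, d).mean(dim=2)
            # Upsample back to character resolution so candidates align per position.
            upsampled = pooled.repeat_interleave(size, dim=1)[:, :n]
            candidates.append(upsampled)
        # (batch, seq_len, num_block_sizes, dim)
        stacked = torch.stack(candidates, dim=2)
        # Softmax over block sizes gives per-position weights: a "soft" subword choice.
        weights = F.softmax(self.score(stacked).squeeze(-1), dim=-1)
        mixed = (stacked * weights.unsqueeze(-1)).sum(dim=2)
        # Downsample to a shorter soft subword sequence before the Transformer layers.
        pad = (self.downsample_rate - n % self.downsample_rate) % self.downsample_rate
        mixed = F.pad(mixed, (0, 0, 0, pad))
        return mixed.view(b, -1, self.downsample_rate, d).mean(dim=2)
```

For example, a batch of character embeddings of shape (2, 128, 64) would come out as a (2, 64, 64) soft subword sequence with `downsample_rate=2`, which is then fed to ordinary Transformer layers.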


Tasks


Task                          Papers  Share
Decoder                            2  15.38%
NMT                                2  15.38%
Denoising                          1   7.69%
Image Denoising                    1   7.69%
Translation                        1   7.69%
Toxic Comment Classification       1   7.69%
Linguistic Acceptability           1   7.69%
Natural Language Inference         1   7.69%
Paraphrase Identification          1   7.69%
