BART is a denoising autoencoder for pretraining sequence-to-sequence models. It is trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text. It uses a standard Transformer-based seq2seq/NMT architecture with a bidirectional encoder (like BERT) and a left-to-right decoder (like GPT). This means the encoder's attention mask is fully visible, like BERT's, and the decoder's attention mask is causal, like GPT-2's.
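
The two training steps and the two attention masks described above can be sketched in a few lines of plain Python. This is a minimal illustration, not BART's actual implementation: the description allows an arbitrary noising function, and token masking is used here only as one simple choice. The names `MASK`, `token_mask_noise`, `encoder_mask`, and `decoder_mask` are hypothetical and do not come from any library.

```python
import random

MASK = "<mask>"  # hypothetical mask token, chosen for illustration


def token_mask_noise(tokens, mask_prob=0.3, seed=0):
    """Corrupt a token list by replacing each token with MASK
    independently with probability mask_prob (one possible noising function)."""
    rng = random.Random(seed)
    return [MASK if rng.random() < mask_prob else t for t in tokens]


def encoder_mask(n):
    """Fully visible mask: every position may attend to every position."""
    return [[1] * n for _ in range(n)]


def decoder_mask(n):
    """Causal mask: position i may attend only to positions j <= i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


original = ["the", "cat", "sat", "on", "the", "mat"]
corrupted = token_mask_noise(original)

# Training pair: the encoder reads the corrupted text under a fully
# visible mask; the decoder is trained to reproduce the original text
# left to right under a causal mask.
enc = encoder_mask(len(corrupted))
dec = decoder_mask(len(original))
```

In this sketch a row of the mask lists which positions a given token may attend to: every row of `enc` is all ones, while row `i` of `dec` has ones only up to column `i`.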
Source: BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

Task | Papers | Share |
---|---|---|
Retrieval | 124 | 12.73% |
Question Answering | 79 | 8.11% |
Language Modelling | 70 | 7.19% |
Text Generation | 64 | 6.57% |
Abstractive Text Summarization | 45 | 4.62% |
Sentence | 39 | 4.00% |
Text Summarization | 27 | 2.77% |
Information Retrieval | 21 | 2.16% |
Large Language Model | 20 | 2.05% |