AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model

no code implementations • 2 Aug 2022 • Saleh Soltan, Shankar Ananthakrishnan, Jack FitzGerald, Rahul Gupta, Wael Hamza, Haidar Khan, Charith Peris, Stephen Rawls, Andy Rosenbaum, Anna Rumshisky, Chandana Satya Prakash, Mukund Sridhar, Fabian Triefenbach, Apurv Verma, Gokhan Tur, Prem Natarajan

In this work, we demonstrate that multilingual large-scale sequence-to-sequence (seq2seq) models, pre-trained on a mixture of denoising and Causal Language Modeling (CLM) tasks, are more efficient few-shot learners than decoder-only models on various tasks.

Causal Language Modeling • Denoising • +3

STIL - Simultaneous Slot Filling, Translation, Intent Classification, and Language Identification: Initial Results using mBART on MultiATIS++

1 code implementation • Asian Chapter of the Association for Computational Linguistics 2020 • Jack FitzGerald

When no translation is performed, mBART's performance is comparable to the current state-of-the-art system (Cross-Lingual BERT by Xu et al. (2020)) for the languages tested, with better average intent classification accuracy (96.07% versus 95.50%) but worse average slot F1 (89.87% versus 90.81%).

Classification • Intent Classification • +4

