1 code implementation • 30 Dec 2022 • Tom Young, Yunan Chen, Yang You
Learning to predict masked tokens in a sequence has been shown to be a helpful pretraining objective for powerful language models such as BERT.
Datasets: LAMBADA, MMLU
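
The masked-token objective mentioned in the abstract snippet above can be illustrated with a minimal sketch. This is not the authors' code: the toy vocabulary size, masking rate, and model dimensions below are illustrative assumptions, and the loss is simply cross-entropy restricted to the masked positions.

```python
# Minimal sketch of masked-token pretraining (BERT-style masked language
# modeling). All sizes and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

VOCAB_SIZE = 1000   # assumed toy vocabulary
MASK_ID = 0         # assumed id reserved for the [MASK] token
IGNORE_ID = -100    # label value ignored by cross-entropy

class TinyMaskedLM(nn.Module):
    def __init__(self, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, VOCAB_SIZE)

    def forward(self, token_ids):
        # Bidirectional encoding: every position attends to the whole sequence.
        h = self.encoder(self.embed(token_ids))
        return self.lm_head(h)  # per-position logits over the vocabulary

def mask_tokens(token_ids, mask_prob=0.15):
    """Replace a random subset of tokens with [MASK]; only those positions
    contribute to the loss (all others get the ignore index)."""
    labels = token_ids.clone()
    mask = torch.rand(token_ids.shape) < mask_prob
    labels[~mask] = IGNORE_ID
    corrupted = token_ids.clone()
    corrupted[mask] = MASK_ID
    return corrupted, labels

# One illustrative training step on random token ids.
model = TinyMaskedLM()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
tokens = torch.randint(1, VOCAB_SIZE, (8, 32))   # batch of 8 sequences, length 32
inputs, labels = mask_tokens(tokens)
logits = model(inputs)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, VOCAB_SIZE), labels.reshape(-1), ignore_index=IGNORE_ID
)
loss.backward()
optimizer.step()
```

After training with this objective, the model yields a distribution over the vocabulary at each masked position conditioned on the bidirectional context, which is the setting the paper studies.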