Language Models

Pythia is a suite of decoder-only autoregressive language models ranging in size from 70M to 12B parameters, all trained on public data seen in the exact same order. The model architecture and hyperparameters largely follow GPT-3, with a few notable deviations based on recent advances in best practices for large-scale language modeling.

Source: Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
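
The "exact same order" property can be illustrated with a small sketch. This is not Pythia's actual data pipeline: the seed value, the dataset size, the batch size, and the helper names `data_order` and `batches` are all hypothetical, chosen only to show how a single fixed seed could give every model size an identical sequence of training batches.

```python
import random


def data_order(num_examples: int, seed: int) -> list[int]:
    """Return a deterministic permutation of example indices for a given seed."""
    rng = random.Random(seed)  # private RNG, so global random state is untouched
    order = list(range(num_examples))
    rng.shuffle(order)
    return order


def batches(order: list[int], batch_size: int):
    """Yield consecutive slices of the shuffled order as batches."""
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


# Hypothetical values for illustration only.
SHARED_SEED = 1234
NUM_EXAMPLES = 16
BATCH_SIZE = 4

# Only the two endpoint sizes named in the description are used here.
MODEL_SIZES = ["70M", "12B"]

# Every model size derives its batch schedule from the same seed,
# so all of them see the same examples in the same order.
schedules = {
    size: list(batches(data_order(NUM_EXAMPLES, SHARED_SEED), BATCH_SIZE))
    for size in MODEL_SIZES
}

if __name__ == "__main__":
    for size, schedule in schedules.items():
        print(size, schedule[0])  # first batch is identical across sizes
```

Because the order depends only on the seed and not on the model, comparing models of different sizes at the same training step compares them after exposure to the same data.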



