1 code implementation • 21 Jan 2024 • Katherine Crowson, Stefan Andreas Baumann, Alex Birch, Tanishq Mathew Abraham, Daniel Z. Kaplan, Enrico Shippole
We present the Hourglass Diffusion Transformer (HDiT), an image generative model that scales linearly with pixel count, enabling high-resolution training (e.g. $1024 \times 1024$) directly in pixel space.
1 code implementation • 25 Oct 2023 • Shayne Longpre, Robert Mahari, Anthony Chen, Naana Obeng-Marnu, Damien Sileo, William Brannon, Niklas Muennighoff, Nathan Khazam, Jad Kabbara, Kartik Perisetla, Xinyi Wu, Enrico Shippole, Kurt Bollacker, Tongshuang Wu, Luis Villa, Sandy Pentland, Sara Hooker
The race to train language models on vast, diverse, and inconsistently documented datasets has raised pressing concerns about the legal and ethical risks for practitioners.
5 code implementations • 31 Aug 2023 • Bowen Peng, Jeffrey Quesnelle, Honglu Fan, Enrico Shippole
Rotary Position Embeddings (RoPE) have been shown to effectively encode positional information in transformer-based language models.
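RoPE encodes position by rotating each consecutive pair of query/key features through an angle proportional to the token's position, so attention scores depend only on relative offsets. A minimal NumPy sketch of this idea (the function name `rope` and the `(seq_len, dim)` layout are illustrative choices, not from the paper):

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Apply Rotary Position Embeddings to a (seq_len, dim) array.

    Each consecutive feature pair (x[2i], x[2i+1]) is rotated by
    angle position * base^(-2i/dim); dim must be even.
    """
    seq_len, dim = x.shape
    half = dim // 2
    # Per-pair rotation frequencies, decaying geometrically with pair index
    freqs = base ** (-2.0 * np.arange(half) / dim)
    angles = positions[:, None] * freqs[None, :]   # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

A key property this construction gives attention: the dot product between a rotated query at position $m$ and a rotated key at position $n$ depends only on $m - n$, which is what shifting both positions by the same offset preserves.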