1 code implementation • 10 Jun 2024 • Peize Sun, Yi Jiang, Shoufa Chen, Shilong Zhang, Bingyue Peng, Ping Luo, Zehuan Yuan
(3) A text-conditional image generation model with 775M parameters, trained in two stages on LAION-COCO and high-aesthetic-quality images, demonstrating competitive visual quality and text alignment.
Ranked #15 on Image Generation on ImageNet 256x256
1 code implementation • 3 Apr 2024 • Keyu Tian, Yi Jiang, Zehuan Yuan, Bingyue Peng, LiWei Wang
We present Visual AutoRegressive modeling (VAR), a new generation paradigm that redefines autoregressive learning on images as coarse-to-fine "next-scale prediction" or "next-resolution prediction", diverging from the standard raster-scan "next-token prediction".
Ranked #10 on Image Generation on ImageNet 256x256
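The contrast between raster-scan "next-token prediction" and coarse-to-fine "next-scale prediction" in the VAR entry can be illustrated with a minimal sketch of the two generation orders. This is not the authors' implementation: the function names, the 4x4 grid, and the scale schedule `[1, 2, 4]` are hypothetical, chosen only to show how many positions each autoregressive step covers.

```python
def raster_scan_steps(height, width):
    """Raster-scan order: one step per token, left to right, top to bottom."""
    return [[(r, c)] for r in range(height) for c in range(width)]


def next_scale_steps(side_lengths):
    """Coarse-to-fine order: one step per scale, each covering a full s x s token map."""
    return [[(r, c) for r in range(s) for c in range(s)] for s in side_lengths]


# Hypothetical settings, for illustration only.
raster = raster_scan_steps(4, 4)
coarse_to_fine = next_scale_steps([1, 2, 4])

print(len(raster))                              # 16 steps, one token each
print([len(step) for step in coarse_to_fine])   # [1, 4, 16]: 3 steps, one token map each
```

Under these assumptions, the raster-scan order needs 16 sequential steps for a 4x4 grid, while the coarse-to-fine order needs 3 steps, each producing every position of one resolution at once.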