Through extensive experiments, we demonstrate that they outperform existing distillation techniques for diffusion models in one- and few-step sampling, achieving the new state-of-the-art FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 for one-step generation.
Ranked #9 on Image Generation on ImageNet 64x64
This is in stark contrast to state-of-the-art generative image models, which produce samples in a number of seconds or minutes.
Contrastive models like CLIP have been shown to learn robust representations of images that capture both semantics and style.
Ranked #27 on Text-to-Image Generation on COCO (using extra training data)
Diffusion models have recently been shown to generate high-quality synthetic images, especially when paired with a guidance technique to trade off diversity for fidelity.
Ranked #32 on Text-to-Image Generation on COCO (using extra training data)
Finally, we find that classifier guidance combines well with upsampling diffusion models, further improving FID to 3.94 on ImageNet 256$\times$256 and 3.85 on ImageNet 512$\times$512.
Ranked #1 on Image Generation on LSUN Bedroom 256 x 256 (FD metric)
Denoising diffusion probabilistic models (DDPM) are a class of generative models which have recently been shown to produce excellent samples.
Ranked #4 on Image Generation on CIFAR-10 (FD metric)
no code implementations • 28 Oct 2020 • Tom Henighan, Jared Kaplan, Mor Katz, Mark Chen, Christopher Hesse, Jacob Jackson, Heewoo Jun, Tom B. Brown, Prafulla Dhariwal, Scott Gray, Chris Hallacy, Benjamin Mann, Alec Radford, Aditya Ramesh, Nick Ryder, Daniel M. Ziegler, John Schulman, Dario Amodei, Sam McCandlish
The optimal model size also depends on the compute budget through a power-law, with exponents that are nearly universal across all data domains.
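A power-law dependence of optimal model size on compute appears as a straight line in log-log space, so the exponent can be estimated with a simple least-squares fit. The following is a minimal sketch, assuming the form N_opt = a * C^b. The function name `fit_power_law` and the constants a = 2.0 and b = 0.7 are hypothetical, and the data points are synthetic values generated from those constants purely for illustration; they are not results from the paper.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by ordinary least squares on (log x, log y)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Slope of the regression line in log-log space is the exponent b.
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    b = sxy / sxx
    # Intercept in log space is log(a).
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic, illustrative data only (hypothetical constants a=2.0, b=0.7).
compute_budgets = [1e3, 1e4, 1e5, 1e6]
optimal_sizes = [2.0 * c ** 0.7 for c in compute_budgets]

a, b = fit_power_law(compute_budgets, optimal_sizes)
print(f"fitted prefactor a = {a:.3f}, exponent b = {b:.3f}")
```

With real measurements the points would scatter around the line, and the fitted exponent would carry an uncertainty that this sketch does not estimate.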
Inspired by progress in unsupervised representation learning for natural language, we examine whether similar models can learn useful representations for images.
Ranked #14 on Image Classification on STL-10 (using extra training data)
40 code implementations • • Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do.
Flow-based generative models (Dinh et al., 2014) are conceptually attractive due to tractability of the exact log-likelihood, tractability of exact latent-variable inference, and parallelizability of both training and synthesis.
Ranked #10 on Image Generation on CelebA 256x256
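The exact log-likelihood mentioned in the flow-based entry above follows from the change-of-variables formula: log p(x) = log p_z(f(x)) + log |det df/dx|. The following is a minimal sketch, assuming a one-dimensional affine map with a standard normal base distribution. It stands in for the invertible network and is not the architecture of any particular model; all function names are hypothetical.

```python
import math


def standard_normal_logpdf(z):
    """Log-density of N(0, 1) at z."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


def affine_flow_forward(x, scale, shift):
    """Map data x to latent z = scale * x + shift.

    Returns z and log|dz/dx|, the log-determinant of the (1x1) Jacobian.
    """
    z = scale * x + shift
    return z, math.log(abs(scale))


def affine_flow_inverse(z, scale, shift):
    """Exact inverse of the forward map, used for sampling."""
    return (z - shift) / scale


def log_likelihood(x, scale, shift):
    """Exact log p(x) via the change-of-variables formula."""
    z, log_det = affine_flow_forward(x, scale, shift)
    return standard_normal_logpdf(z) + log_det


x = 1.5
z, _ = affine_flow_forward(x, scale=2.0, shift=-1.0)
print("latent:", z)
print("reconstructed x:", affine_flow_inverse(z, scale=2.0, shift=-1.0))
print("log p(x):", log_likelihood(x, scale=2.0, shift=-1.0))
```

Both inference (x to z) and synthesis (z to x) are single closed-form evaluations here, which is the tractability property the abstract refers to.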
In this paper, we introduce a system called GamePad that can be used to explore the application of machine learning methods to theorem proving in the Coq proof assistant.
We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective function using stochastic gradient ascent.
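The excerpt above does not spell out the surrogate objective. One widely used form is the clipped probability-ratio objective, which limits how far a single update can move the policy from the one that collected the data. The following is a minimal sketch of that form, assuming per-sample log-probabilities and advantage estimates are already available. The function name, the value epsilon = 0.2, and the sample inputs are illustrative assumptions, and the code only evaluates the objective; it does not perform the gradient ascent step.

```python
import math


def clipped_surrogate(new_log_probs, old_log_probs, advantages, epsilon=0.2):
    """Mean clipped surrogate objective over a batch of samples.

    For each sample: min(r * A, clip(r, 1 - eps, 1 + eps) * A),
    where r = pi_new(a|s) / pi_old(a|s).
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)
        clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
        # Taking the minimum gives a pessimistic bound on the improvement.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)


# Illustrative inputs: probability ratios of 1.5 and 0.5.
new_lp = [math.log(1.5), math.log(0.5)]
old_lp = [0.0, 0.0]
adv = [1.0, -1.0]
print(clipped_surrogate(new_lp, old_lp, adv))
```

In the first sample the ratio 1.5 is clipped to 1.2, so the positive advantage cannot be exploited beyond the trust region; in the second, the unclipped term -0.5 is larger than the clipped term -0.8, so the minimum keeps the more pessimistic value.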
Combining parameter noise with traditional RL methods makes it possible to get the best of both worlds.
Representation learning seeks to expose certain aspects of observed data in a learned representation that's amenable to downstream tasks like classification.