Search Results for author: Diederik P. Kingma

Found 28 papers, 18 papers with code

Understanding the Diffusion Objective as a Weighted Integral of ELBOs

no code implementations • 1 Mar 2023 • Diederik P. Kingma, Ruiqi Gao

Diffusion models in the literature are optimized with various objectives that are special cases of a weighted loss, where the weighting function specifies the weight per noise level.

Data Augmentation
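The weighted-loss view can be sketched as follows. The cosine noise schedule, the constant weighting `w`, and the toy "denoiser" are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(0)

def weighted_diffusion_loss(x, eps_model, w, n_levels=100):
    """Monte Carlo estimate of a weighted denoising loss:
    E_{t, eps}[ w(t) * || eps_hat(z_t, t) - eps ||^2 ],
    where z_t = alpha(t) * x + sigma(t) * eps."""
    losses = []
    for t in rng.uniform(0.0, 1.0, size=n_levels):
        alpha, sigma = np.cos(np.pi * t / 2), np.sin(np.pi * t / 2)  # toy schedule
        eps = rng.standard_normal(x.shape)
        z_t = alpha * x + sigma * eps
        eps_hat = eps_model(z_t, t)
        losses.append(w(t) * np.sum((eps_hat - eps) ** 2))
    return np.mean(losses)

# Hypothetical "denoiser" that just echoes its input; real models are neural nets.
loss = weighted_diffusion_loss(np.ones(4), lambda z, t: z, w=lambda t: 1.0)
```

Choosing a different weighting function `w` recovers different published diffusion objectives as special cases.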

On Distillation of Guided Diffusion Models

1 code implementation • 6 Oct 2022 • Chenlin Meng, Robin Rombach, Ruiqi Gao, Diederik P. Kingma, Stefano Ermon, Jonathan Ho, Tim Salimans

For standard diffusion models trained in pixel space, our approach can generate images visually comparable to those of the original model using as few as 4 sampling steps on ImageNet 64x64 and CIFAR-10, achieving FID/IS scores comparable to the original model while being up to 256 times faster to sample from.

Denoising Image Generation +1

Variational Diffusion Models

4 code implementations • 1 Jul 2021 • Diederik P. Kingma, Tim Salimans, Ben Poole, Jonathan Ho

In addition, we show that the continuous-time VLB is invariant to the noise schedule, except for the signal-to-noise ratio at its endpoints.

Ranked #1 on Image Generation on CIFAR-10 (bits/dimension metric)

Density Estimation Image Generation
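The signal-to-noise ratio in question can be sketched numerically; the variance-preserving cosine schedule below is an illustrative assumption:

```python
import numpy as np

def snr(t):
    """Signal-to-noise ratio SNR(t) = alpha(t)^2 / sigma(t)^2 for an assumed
    variance-preserving cosine schedule (alpha(t)^2 + sigma(t)^2 = 1)."""
    alpha_sq = np.cos(np.pi * t / 2) ** 2
    return alpha_sq / (1.0 - alpha_sq)

# SNR falls monotonically from nearly clean data (t ~ 0) to nearly pure noise
# (t ~ 1); per the paper, only its values at the two endpoints affect the
# continuous-time VLB.
t = np.linspace(0.01, 0.99, 99)
vals = snr(t)
```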

How to Train Your Energy-Based Models

2 code implementations • 9 Jan 2021 • Yang Song, Diederik P. Kingma

Energy-Based Models (EBMs), also known as non-normalized probabilistic models, specify probability density or mass functions up to an unknown normalizing constant.
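A minimal sketch of the "unknown normalizing constant" point, using a toy quadratic energy (an assumption chosen so the true constant is known):

```python
import numpy as np

def energy(x):
    """A toy 1-D energy; the model density is p(x) = exp(-energy(x)) / Z,
    with Z unknown in general."""
    return 0.5 * x ** 2  # corresponds to a standard Gaussian

# In 1-D we can approximate the normalizing constant by simple quadrature;
# in high dimensions this is intractable, which is what makes EBM training hard.
xs = np.linspace(-10.0, 10.0, 20001)
dx = xs[1] - xs[0]
Z = np.sum(np.exp(-energy(xs))) * dx
```

Here Z should approach sqrt(2*pi), the Gaussian normalizer, confirming the density interpretation.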

Learning Energy-Based Models by Diffusion Recovery Likelihood

2 code implementations • ICLR 2021 • Ruiqi Gao, Yang Song, Ben Poole, Ying Nian Wu, Diederik P. Kingma

Inspired by recent progress on diffusion probabilistic models, we present a diffusion recovery likelihood method to tractably learn and sample from a sequence of EBMs trained on increasingly noisy versions of a dataset.

Image Generation

Score-Based Generative Modeling through Stochastic Differential Equations

8 code implementations • ICLR 2021 • Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, Ben Poole

Combined with multiple architectural improvements, we achieve record-breaking performance for unconditional image generation on CIFAR-10 with an Inception score of 9.89 and FID of 2.20, a competitive likelihood of 2.99 bits/dim, and demonstrate high fidelity generation of 1024 x 1024 images for the first time from a score-based generative model.

Colorization Image Inpainting +1
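The reverse-time SDE sampler can be sketched with Euler-Maruyama on a toy VP process whose data distribution is N(0, 1), so the true score is known in closed form (score(x, t) = -x); the constant beta and step count are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

beta = 1.0
n_steps, dt = 100, 0.01
x = rng.standard_normal(20000)  # start from the prior N(0, 1)

# Integrate the reverse-time SDE backward with Euler-Maruyama:
# dx = [f(x, t) - g(t)^2 * score(x, t)] dt + g(t) dW,
# with f(x, t) = -0.5 * beta * x and g(t) = sqrt(beta).
for _ in range(n_steps):
    drift = -0.5 * beta * x - beta * (-x)  # f(x, t) - g^2 * score(x, t)
    x = x + drift * (-dt) + np.sqrt(beta * dt) * rng.standard_normal(x.shape)
```

With the exact score, the sample population should stay close to the data distribution N(0, 1); real score-based models replace the closed-form score with a learned network.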

On Linear Identifiability of Learned Representations

no code implementations • 1 Jul 2020 • Geoffrey Roeder, Luke Metz, Diederik P. Kingma

Identifiability is a desirable property of a statistical model: it implies that the true model parameters may be estimated to any desired precision, given sufficient computational resources and data.

Representation Learning

ICE-BeeM: Identifiable Conditional Energy-Based Deep Models Based on Nonlinear ICA

1 code implementation • NeurIPS 2020 • Ilyes Khemakhem, Ricardo Pio Monti, Diederik P. Kingma, Aapo Hyvärinen

We consider the identifiability theory of probabilistic models and establish sufficient conditions under which the representations learned by a very broad family of conditional energy-based models are unique in function space, up to a simple transformation.

Transfer Learning

Variational Autoencoders and Nonlinear ICA: A Unifying Framework

2 code implementations • 10 Jul 2019 • Ilyes Khemakhem, Diederik P. Kingma, Ricardo Pio Monti, Aapo Hyvärinen

We address this issue by showing that for a broad family of deep latent-variable models, identification of the true joint distribution over observed and latent variables is actually possible up to very simple transformations, thus achieving a principled and powerful form of disentanglement.


An Introduction to Variational Autoencoders

6 code implementations • 6 Jun 2019 • Diederik P. Kingma, Max Welling

Variational autoencoders provide a principled framework for learning deep latent-variable models and corresponding inference models.
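A minimal ELBO sketch under stated assumptions: a Gaussian encoder, a unit-variance Gaussian likelihood, and a stand-in `decode` function in place of a trained decoder network:

```python
import numpy as np

rng = np.random.default_rng(0)

def elbo(x, mu, log_var, decode, n_samples=64):
    """Monte Carlo ELBO = E_q[log p(x|z)] - KL(q(z|x) || N(0, I)) for a
    Gaussian encoder q(z|x) = N(mu, diag(exp(log_var)))."""
    sigma = np.exp(0.5 * log_var)
    kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)  # closed form
    recon = 0.0
    for _ in range(n_samples):
        z = mu + sigma * rng.standard_normal(mu.shape)  # reparameterized sample
        recon += -0.5 * np.sum((x - decode(z)) ** 2)    # log N(x | decode(z), I), up to a constant
    return recon / n_samples - kl

# Identity "decoder" for illustration only.
value = elbo(x=np.zeros(2), mu=np.zeros(2), log_var=np.zeros(2), decode=lambda z: z)
```

Training a VAE amounts to maximizing this bound jointly over encoder and decoder parameters.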

Glow: Generative Flow with Invertible 1x1 Convolutions

27 code implementations • NeurIPS 2018 • Diederik P. Kingma, Prafulla Dhariwal

Flow-based generative models (Dinh et al., 2014) are conceptually attractive due to tractability of the exact log-likelihood, tractability of exact latent-variable inference, and parallelizability of both training and synthesis.

Image Generation
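The invertible 1x1 convolution at the heart of Glow can be sketched directly: over channels it is a shared c x c matrix multiply per pixel, with an exact inverse and a cheap log-determinant (the random `W` below stands in for the learned weight):

```python
import numpy as np

rng = np.random.default_rng(0)

c, h, w = 3, 4, 4
W = rng.standard_normal((c, c))     # in Glow, W is learned (initialized orthogonal)
x = rng.standard_normal((c, h, w))

y = np.einsum('ij,jhw->ihw', W, x)                     # forward 1x1 "convolution"
x_rec = np.einsum('ij,jhw->ihw', np.linalg.inv(W), y)  # exact inverse

# Contribution to the log-likelihood: h * w * log|det W|.
log_det = h * w * np.log(abs(np.linalg.det(W)))
```

Exact invertibility plus a tractable log-determinant is what makes the exact log-likelihood of flow-based models computable.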

Learning Sparse Neural Networks through L_0 Regularization

no code implementations • ICLR 2018 • Christos Louizos, Max Welling, Diederik P. Kingma

We further propose the "hard concrete" distribution for the gates, which is obtained by "stretching" a binary concrete distribution and then transforming its samples with a hard-sigmoid.

Model Selection

Learning Sparse Neural Networks through $L_0$ Regularization

4 code implementations • 4 Dec 2017 • Christos Louizos, Max Welling, Diederik P. Kingma

We further propose the "hard concrete" distribution for the gates, which is obtained by "stretching" a binary concrete distribution and then transforming its samples with a hard-sigmoid.

Model Selection
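The stretch-then-clip construction of the hard concrete gate can be sketched as below; the hyperparameter values follow common defaults and are assumptions here:

```python
import numpy as np

rng = np.random.default_rng(0)

def hard_concrete(log_alpha, beta=2/3, gamma=-0.1, zeta=1.1, size=10000):
    """Sample hard concrete gates: draw a binary concrete sample on (0, 1),
    stretch it to (gamma, zeta), then clip back to [0, 1] so exact zeros
    and ones occur with nonzero probability."""
    u = rng.uniform(1e-6, 1 - 1e-6, size=size)
    s = 1.0 / (1.0 + np.exp(-(np.log(u) - np.log(1 - u) + log_alpha) / beta))
    s_bar = s * (zeta - gamma) + gamma  # "stretching"
    return np.clip(s_bar, 0.0, 1.0)    # hard-sigmoid

z = hard_concrete(log_alpha=0.0)
```

The point mass at exactly zero is what lets a relaxed L_0 penalty prune weights outright rather than merely shrinking them.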

PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications

7 code implementations • 19 Jan 2017 • Tim Salimans, Andrej Karpathy, Xi Chen, Diederik P. Kingma

1) We use a discretized logistic mixture likelihood on the pixels, rather than a 256-way softmax, which we find to speed up training.

Image Generation
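The discretized logistic likelihood can be sketched as the logistic CDF mass on each pixel bin; the single-component version and the parameter values below are simplifying assumptions (PixelCNN++ uses a mixture):

```python
import numpy as np

def discretized_logistic_probs(mu, s, levels=256):
    """P(x = k) for k in {0, ..., levels-1}: the logistic CDF mass on
    [k - 0.5, k + 0.5], with the edge bins extended to cover the tails."""
    sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))
    k = np.arange(levels)
    upper = np.where(k == levels - 1, np.inf, (k + 0.5 - mu) / s)
    lower = np.where(k == 0, -np.inf, (k - 0.5 - mu) / s)
    return sigmoid(upper) - sigmoid(lower)

p = discretized_logistic_probs(mu=127.0, s=5.0)
```

Because neighboring intensities share probability mass smoothly, this parameterization trains faster than an unordered 256-way softmax.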

Variational Lossy Autoencoder

no code implementations • 8 Nov 2016 • Xi Chen, Diederik P. Kingma, Tim Salimans, Yan Duan, Prafulla Dhariwal, John Schulman, Ilya Sutskever, Pieter Abbeel

Representation learning seeks to expose certain aspects of observed data in a learned representation that's amenable to downstream tasks like classification.

Density Estimation Image Generation +1

Improving Variational Inference with Inverse Autoregressive Flow

8 code implementations • 15 Jun 2016 • Diederik P. Kingma, Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, Max Welling

The framework of normalizing flows provides a general strategy for flexible variational inference of posteriors over latent variables.

Variational Inference
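One inverse autoregressive flow step can be sketched as z' = mu(z) + sigma(z) * z, where mu and sigma for dimension i depend only on z with index below i; the strictly lower-triangular matrix below stands in for the autoregressive network (an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(0)

d = 4
z = rng.standard_normal(d)

# Strictly lower-triangular "network": output i sees only inputs < i.
M = np.tril(rng.standard_normal((d, d)), k=-1)
mu = M @ z
sigma = np.exp(0.1 * (M @ z))  # positive scales, autoregressive in z

z_new = mu + sigma * z
# The Jacobian of this map is triangular, so its log-determinant is simply:
log_det = np.sum(np.log(sigma))
```

Stacking such steps yields a flexible posterior whose density remains cheap to evaluate, which is the point of the IAF construction.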

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

9 code implementations • NeurIPS 2016 • Tim Salimans, Diederik P. Kingma

We present weight normalization: a reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction.

Image Classification reinforcement-learning +1
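The reparameterization itself is a one-liner: w = (g / ||v||) * v, so the norm of w equals the scalar g while v controls only the direction. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Weight normalization: decouple length and direction of a weight vector.
v = rng.standard_normal(10)  # direction parameters
g = 2.5                      # length parameter
w = (g / np.linalg.norm(v)) * v
```

During training, gradients are taken with respect to g and v rather than w, which improves the conditioning of the optimization.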

Variational Dropout and the Local Reparameterization Trick

11 code implementations • NeurIPS 2015 • Diederik P. Kingma, Tim Salimans, Max Welling

Our method allows inference of more flexibly parameterized posteriors; specifically, we propose variational dropout, a generalization of Gaussian dropout where the dropout rates are learned, often leading to better models.

Bayesian Inference
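The local reparameterization trick behind this paper can be sketched as follows: instead of sampling a Gaussian weight matrix, sample the layer's pre-activations directly from their implied Gaussian (the shapes and the noise scale `alpha` below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def local_reparam_layer(A, theta, alpha):
    """Sample B = A @ W with weight noise W_ij ~ N(theta_ij, alpha * theta_ij^2),
    drawn per activation: B_mj ~ N((A @ theta)_mj, (A^2 @ (alpha * theta^2))_mj).
    This yields lower-variance gradient estimates than sampling W itself."""
    mean = A @ theta
    var = (A ** 2) @ (alpha * theta ** 2)
    return mean + np.sqrt(var) * rng.standard_normal(mean.shape)

B = local_reparam_layer(A=rng.standard_normal((8, 5)),
                        theta=rng.standard_normal((5, 3)),
                        alpha=0.1)
```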

Note on Equivalence Between Recurrent Neural Network Time Series Models and Variational Bayesian Models

no code implementations • 29 Apr 2015 • Jascha Sohl-Dickstein, Diederik P. Kingma

We observe that the standard log likelihood training objective for a Recurrent Neural Network (RNN) model of time series data is equivalent to a variational Bayesian training objective, given the proper choice of generative and inference models.

Time Series Analysis

Adam: A Method for Stochastic Optimization

77 code implementations • 22 Dec 2014 • Diederik P. Kingma, Jimmy Ba

We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments.

Stochastic Optimization
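The Adam update can be sketched in a few lines, using the paper's default hyperparameters; the quadratic test objective is an illustrative choice:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (m) and
    its square (v), bias-corrected, then a per-parameter scaled step."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)  # bias correction for the first moment
    v_hat = v / (1 - beta2 ** t)  # bias correction for the second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(theta) = ||theta||^2, whose gradient is 2 * theta.
theta = np.array([5.0, -3.0])
m = v = np.zeros_like(theta)
for t in range(1, 5001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.01)
```

The bias-correction terms matter early in training, when the moving averages are still dominated by their zero initialization.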

Markov Chain Monte Carlo and Variational Inference: Bridging the Gap

no code implementations • 23 Oct 2014 • Tim Salimans, Diederik P. Kingma, Max Welling

Recent advances in stochastic gradient variational inference have made it possible to perform variational Bayesian inference with posterior approximations containing auxiliary random variables.

Bayesian Inference Variational Inference

Semi-Supervised Learning with Deep Generative Models

17 code implementations • NeurIPS 2014 • Diederik P. Kingma, Danilo J. Rezende, Shakir Mohamed, Max Welling

The ever-increasing size of modern data sets combined with the difficulty of obtaining label information has made semi-supervised learning one of the problems of significant practical importance in modern data analysis.

Bayesian Inference

Efficient Gradient-Based Inference through Transformations between Bayes Nets and Neural Nets

no code implementations • 3 Feb 2014 • Diederik P. Kingma, Max Welling

Hierarchical Bayesian networks and neural networks with stochastic hidden units are commonly perceived as two separate types of models.

Auto-Encoding Variational Bayes

129 code implementations • 20 Dec 2013 • Diederik P. Kingma, Max Welling

First, we show that a reparameterization of the variational lower bound yields a lower bound estimator that can be straightforwardly optimized using standard stochastic gradient methods.

Image Clustering Variational Inference
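The reparameterization idea can be sketched on a toy objective where the exact gradient is known; the specific objective E[z^2] is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

def reparam_grad(mu, sigma, n=100000):
    """Estimate d/dmu E_{z ~ N(mu, sigma^2)}[z^2] with the reparameterization
    trick: write z = mu + sigma * eps with eps ~ N(0, 1), so the gradient
    passes through the sample: d(z^2)/dmu = 2 * z."""
    eps = rng.standard_normal(n)
    z = mu + sigma * eps
    return np.mean(2 * z)  # Monte Carlo estimate of the exact gradient 2 * mu

g = reparam_grad(mu=1.5, sigma=0.5)
```

Because expectation and differentiation commute after the change of variables, standard stochastic gradient methods apply directly to the variational lower bound.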

Fast Gradient-Based Inference with Continuous Latent Variable Models in Auxiliary Form

no code implementations • 4 Jun 2013 • Diederik P. Kingma

We propose a technique for increasing the efficiency of gradient-based inference and learning in Bayesian networks with multiple layers of continuous latent variables.
