no code implementations • 18 Mar 2024 • Axel Sauer, Frederic Boesel, Tim Dockhorn, Andreas Blattmann, Patrick Esser, Robin Rombach
Distillation methods, like the recently introduced adversarial diffusion distillation (ADD), aim to shift the model from many-shot to single-step inference, albeit at the cost of expensive and difficult optimization due to its reliance on a fixed pretrained DINOv2 discriminator.
1 code implementation • 5 Mar 2024 • Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Müller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, Dustin Podell, Tim Dockhorn, Zion English, Kyle Lacey, Alex Goodwin, Yannik Marek, Robin Rombach
Rectified flow is a recent generative model formulation that connects data and noise in a straight line.
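The straight-line formulation can be made concrete with a toy sketch (NumPy only; the data and noise arrays here are random stand-ins, not samples from a trained model): the path linearly interpolates between a data point and a noise sample, so its velocity along the path is the constant difference between the two endpoints.

```python
import numpy as np

rng = np.random.default_rng(0)
x0 = rng.normal(size=(4, 2))    # toy "data" samples
eps = rng.normal(size=(4, 2))   # noise samples

def interpolate(x0, eps, t):
    """Straight-line (rectified flow) path between data and noise."""
    return (1.0 - t) * x0 + t * eps

# the velocity along the path is constant: d/dt x_t = eps - x0
t = 0.3
xt = interpolate(x0, eps, t)
velocity = eps - x0

# finite-difference check that the path really is straight
dt = 1e-6
fd = (interpolate(x0, eps, t + dt) - xt) / dt
assert np.allclose(fd, velocity, atol=1e-4)
```

Because the velocity is constant in time, the regression target for the model is the same at every point on the path, which is part of the formulation's appeal.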
2 code implementations • 2023 • Andreas Blattmann, Tim Dockhorn, Sumith Kulal, Daniel Mendelevitch, Maciej Kilian, Dominik Lorenz, Yam Levi, Zion English, Vikram Voleti, Adam Letts, Varun Jampani, Robin Rombach
We then explore the impact of finetuning our base model on high-quality data and train a text-to-video model that is competitive with closed-source video generation models.
3 code implementations • 4 Jul 2023 • Dustin Podell, Zion English, Kyle Lacey, Andreas Blattmann, Tim Dockhorn, Jonas Müller, Joe Penna, Robin Rombach
We present SDXL, a latent diffusion model for text-to-image synthesis.
2 code implementations • CVPR 2023 • Andreas Blattmann, Robin Rombach, Huan Ling, Tim Dockhorn, Seung Wook Kim, Sanja Fidler, Karsten Kreis
We first pre-train an LDM on images only; then, we turn the image generator into a video generator by introducing a temporal dimension to the latent space diffusion model and fine-tuning on encoded image sequences, i.e., videos.
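A common way to implement this kind of image-to-video lifting is an axis reshuffle: the pretrained spatial layers process frames as a larger batch of independent images, while the newly inserted temporal layers regroup the frame axis so information can mix across time. A minimal shape-only sketch (the dimensions and layer behavior here are illustrative, not the paper's exact architecture):

```python
import numpy as np

b, t, c, h, w = 2, 8, 4, 16, 16            # batch, frames, channels, spatial dims
video_latents = np.zeros((b, t, c, h, w))  # encoded video in latent space

# spatial layers of the pretrained image LDM see frames as independent images
as_images = video_latents.reshape(b * t, c, h, w)

# interleaved temporal layers regroup the frame axis to operate across time
as_video = as_images.reshape(b, t, c, h, w)

assert as_images.shape == (16, 4, 16, 16)
assert as_video.shape == (2, 8, 4, 16, 16)
```

The reshape is free (no data is copied or altered), which is why the pretrained spatial weights can be reused unchanged while only the temporal layers are trained from scratch.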
Ranked #5 on Text-to-Video Generation on MSR-VTT (CLIP-FID metric)
no code implementations • 25 Nov 2022 • Karsten Kreis, Tim Dockhorn, Zihao Li, Ellen Zhong
The state-of-the-art method cryoDRGN uses a Variational Autoencoder (VAE) framework to learn a continuous distribution of protein structures from single particle cryo-EM imaging data.
1 code implementation • 18 Oct 2022 • Tim Dockhorn, Tianshi Cao, Arash Vahdat, Karsten Kreis
While modern machine learning models rely on increasingly large training datasets, data is often limited in privacy-sensitive domains.
1 code implementation • 11 Oct 2022 • Tim Dockhorn, Arash Vahdat, Karsten Kreis
Synthesis amounts to solving a differential equation (DE) defined by the learnt model.
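What "solving a differential equation defined by the learnt model" looks like in practice can be sketched with a simple fixed-step Euler integrator; the drift function below is a toy stand-in for the trained network, and the step counts and time endpoints are illustrative only:

```python
import numpy as np

def vector_field(x, t):
    """Toy stand-in for the learned model; a real sampler calls a neural net."""
    return -x

def euler_sample(x_init, t_start=1.0, t_end=0.0, n_steps=100):
    """Integrate the generative ODE from noise (t_start) toward data (t_end)."""
    x, t = x_init, t_start
    dt = (t_end - t_start) / n_steps
    for _ in range(n_steps):
        x = x + dt * vector_field(x, t)
        t = t + dt
    return x

rng = np.random.default_rng(0)
x = euler_sample(rng.normal(size=(4, 2)))
assert np.all(np.isfinite(x))
```

Framing sampling as DE solving is what lets higher-order or adaptive solvers reduce the number of model evaluations, which is the angle this line of work exploits.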
Ranked #5 on Image Generation on AFHQV2
1 code implementation • ICLR 2022 • Tim Dockhorn, Arash Vahdat, Karsten Kreis
SGMs rely on a diffusion process that gradually perturbs the data towards a tractable distribution, while the generative model learns to denoise.
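The gradual perturbation toward a tractable distribution can be sketched with a variance-preserving Gaussian kernel (a standard choice in score-based generative models; the schedule value below is illustrative): as the signal coefficient shrinks, the marginal approaches a standard normal.

```python
import numpy as np

rng = np.random.default_rng(0)
x0 = rng.normal(size=(200, 2))   # toy "clean data"

def perturb(x0, alpha_bar, rng):
    """Variance-preserving perturbation kernel q(x_t | x_0)."""
    eps = rng.normal(size=x0.shape)
    xt = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return xt, eps

# with alpha_bar near 0, almost all signal is destroyed and the
# marginal is close to the tractable standard normal prior
xt, eps = perturb(x0, alpha_bar=1e-4, rng=rng)
assert abs(np.std(xt) - 1.0) < 0.2
```

The generative model is then trained to predict the injected noise (or equivalently the score) at each perturbation level, which is the "learns to denoise" half of the sentence above.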
Ranked #24 on Image Generation on CIFAR-10
no code implementations • NeurIPS 2021 • Tim Dockhorn, YaoLiang Yu, Eyyüb Sari, Mahdi Zolnouri, Vahid Partovi Nia
BinaryConnect (BC) and its many variations have become the de facto standard for neural network quantization.
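The core BinaryConnect mechanic can be sketched in a few lines: the forward pass uses only the signs of the weights, while a real-valued copy is kept and updated with the gradient (the straight-through trick), then clipped. The gradient values and learning rate below are illustrative, not from any trained network:

```python
import numpy as np

def binarize(w):
    """BinaryConnect forward pass keeps only the signs of the weights."""
    return np.where(w >= 0, 1.0, -1.0)

# real-valued "latent" weights are retained for the update step
w_real = np.array([0.3, -0.7, 0.05, -0.2])
w_bin = binarize(w_real)

# straight-through estimator: the gradient computed with w_bin in the
# forward pass is applied directly to w_real, which is then clipped
grad = np.array([0.1, -0.2, 0.4, 0.0])
lr = 0.1
w_real = np.clip(w_real - lr * grad, -1.0, 1.0)

assert set(np.unique(w_bin)) <= {-1.0, 1.0}
```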
1 code implementation • 16 Jun 2020 • Tim Dockhorn, James A. Ritchie, Yao-Liang Yu, Iain Murray
Density deconvolution is the task of estimating a probability density function given only noise-corrupted samples.
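The deconvolution setup can be illustrated with a toy additive-Gaussian noise model (the distributions and parameters here are made up for illustration): the estimator only ever sees the corrupted samples, whose variance is inflated by the noise, and must recover the density of the latent variable.

```python
import numpy as np

rng = np.random.default_rng(0)

# latent samples from the density we actually want to estimate
x = rng.normal(loc=2.0, scale=0.5, size=10_000)

# we only observe noise-corrupted versions (known noise model assumed here)
noise = rng.normal(loc=0.0, scale=1.0, size=x.shape)
y = x + noise

# for independent additive noise: Var[y] = Var[x] + Var[noise]
assert abs(np.var(y) - (0.25 + 1.0)) < 0.1
```

Naively fitting a density to `y` would recover the convolved (wider) distribution, which is exactly what deconvolution methods must avoid.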
1 code implementation • 15 Apr 2019 • Tim Dockhorn
Can neural networks learn to solve partial differential equations (PDEs)?
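The training signal such approaches typically minimize is the residual of the PDE itself. A minimal sketch of that idea (not the paper's method): for the 1D heat equation u_t = u_xx, a finite-difference check shows the residual vanishing on a known closed-form solution, which is exactly the quantity a network-parameterized candidate would be trained to drive to zero.

```python
import numpy as np

def u(x, t):
    """Closed-form solution of the 1D heat equation u_t = u_xx."""
    return np.exp(-t) * np.sin(x)

x = np.linspace(0.1, 3.0, 50)
t = 0.5
h = 1e-4

# finite-difference approximations of the derivatives
u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / h**2

# the PDE residual vanishes (up to discretization error) for the true solution
assert np.max(np.abs(u_t - u_xx)) < 1e-4
```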