Search Results for author: Karsten Kreis

Found 31 papers, 14 papers with code

Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models

no code implementations 21 Dec 2023 Huan Ling, Seung Wook Kim, Antonio Torralba, Sanja Fidler, Karsten Kreis

We also propose a motion amplification mechanism as well as a new autoregressive synthesis scheme to generate and combine multiple 4D sequences for longer generation.

Synthetic Data Generation Video Generation

A Unified Approach for Text- and Image-guided 4D Scene Generation

no code implementations 28 Nov 2023 Yufeng Zheng, Xueting Li, Koki Nagano, Sifei Liu, Karsten Kreis, Otmar Hilliges, Shalini De Mello

Large-scale diffusion generative models are greatly simplifying image, video and 3D asset creation from user-provided text prompts and images.

Scene Generation

WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space

no code implementations 22 Nov 2023 Katja Schwarz, Seung Wook Kim, Jun Gao, Sanja Fidler, Andreas Geiger, Karsten Kreis

Then, we train a diffusion model in the 3D-aware latent space, thereby enabling synthesis of high-quality 3D-consistent image samples, outperforming recent state-of-the-art GAN-based methods.

3D-Aware Image Synthesis Depth Estimation +2

TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models

no code implementations ICCV 2023 Tianshi Cao, Karsten Kreis, Sanja Fidler, Nicholas Sharp, Kangxue Yin

We present TexFusion (Texture Diffusion), a new method to synthesize textures for given 3D geometries, using large-scale text-guided image diffusion models.

Denoising Texture Synthesis

DreamTeacher: Pretraining Image Backbones with Deep Generative Models

no code implementations ICCV 2023 Daiqing Li, Huan Ling, Amlan Kar, David Acuna, Seung Wook Kim, Karsten Kreis, Antonio Torralba, Sanja Fidler

In this work, we introduce a self-supervised feature representation learning framework DreamTeacher that utilizes generative networks for pre-training downstream image backbones.

Knowledge Distillation Representation Learning

NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models

no code implementations CVPR 2023 Seung Wook Kim, Bradley Brown, Kangxue Yin, Karsten Kreis, Katja Schwarz, Daiqing Li, Robin Rombach, Antonio Torralba, Sanja Fidler

We first train a scene auto-encoder to express a set of image and pose pairs as a neural field, represented as density and feature voxel grids that can be projected to produce novel views of the scene.

Scene Generation

Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models

2 code implementations CVPR 2023 Andreas Blattmann, Robin Rombach, Huan Ling, Tim Dockhorn, Seung Wook Kim, Sanja Fidler, Karsten Kreis

We first pre-train an LDM on images only; then, we turn the image generator into a video generator by introducing a temporal dimension to the latent space diffusion model and fine-tuning on encoded image sequences, i.e., videos.

Ranked #5 on Text-to-Video Generation on MSR-VTT (CLIP-FID metric)

Image Generation Text-to-Video Generation +3

Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion

no code implementations CVPR 2023 Davis Rempe, Zhengyi Luo, Xue Bin Peng, Ye Yuan, Kris Kitani, Karsten Kreis, Sanja Fidler, Or Litany

We introduce a method for generating realistic pedestrian trajectories and full-body animations that can be controlled to meet user-defined goals.

Collision Avoidance

Score-based Diffusion Models in Function Space

no code implementations 14 Feb 2023 Jae Hyun Lim, Nikola B. Kovachki, Ricardo Baptista, Christopher Beckham, Kamyar Azizzadenesheli, Jean Kossaifi, Vikram Voleti, Jiaming Song, Karsten Kreis, Jan Kautz, Christopher Pal, Arash Vahdat, Anima Anandkumar

They consist of a forward process that perturbs input data with Gaussian white noise and a reverse process that learns a score function to generate samples by denoising.

Denoising

Latent Space Diffusion Models of Cryo-EM Structures

no code implementations 25 Nov 2022 Karsten Kreis, Tim Dockhorn, Zihao Li, Ellen Zhong

The state-of-the-art method cryoDRGN uses a Variational Autoencoder (VAE) framework to learn a continuous distribution of protein structures from single particle cryo-EM imaging data.

Magic3D: High-Resolution Text-to-3D Content Creation

1 code implementation CVPR 2023 Chen-Hsuan Lin, Jun Gao, Luming Tang, Towaki Takikawa, Xiaohui Zeng, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, Tsung-Yi Lin

DreamFusion has recently demonstrated the utility of a pre-trained text-to-image diffusion model to optimize Neural Radiance Fields (NeRF), achieving remarkable text-to-3D synthesis results.

Text to 3D Vocal Bursts Intensity Prediction

Differentially Private Diffusion Models

1 code implementation 18 Oct 2022 Tim Dockhorn, Tianshi Cao, Arash Vahdat, Karsten Kreis

While modern machine learning models rely on increasingly large training datasets, data is often limited in privacy-sensitive domains.

Image Generation

LION: Latent Point Diffusion Models for 3D Shape Generation

2 code implementations 12 Oct 2022 Xiaohui Zeng, Arash Vahdat, Francis Williams, Zan Gojcic, Or Litany, Sanja Fidler, Karsten Kreis

To advance 3D DDMs and make them useful for digital artists, we require (i) high generation quality, (ii) flexibility for manipulation and applications such as conditional synthesis and shape interpolation, and (iii) the ability to output smooth surfaces or meshes.

3D Shape Generation Denoising +2

GENIE: Higher-Order Denoising Diffusion Solvers

1 code implementation 11 Oct 2022 Tim Dockhorn, Arash Vahdat, Karsten Kreis

Synthesis amounts to solving a differential equation (DE) defined by the learnt model.

Denoising Image Generation

Causal Scene BERT: Improving object detection by searching for challenging groups of data

no code implementations 8 Feb 2022 Cinjon Resnick, Or Litany, Amlan Kar, Karsten Kreis, James Lucas, Kyunghyun Cho, Sanja Fidler

Our main contribution is a pseudo-automatic method to discover such groups in foresight by performing causal interventions on simulated scenes.

Autonomous Vehicles object-detection +1

Tackling the Generative Learning Trilemma with Denoising Diffusion GANs

5 code implementations ICLR 2022 Zhisheng Xiao, Karsten Kreis, Arash Vahdat

To the best of our knowledge, denoising diffusion GAN is the first model that reduces sampling cost in diffusion models to an extent that allows them to be applied to real-world applications inexpensively.

Image Generation

Score-Based Generative Modeling with Critically-Damped Langevin Diffusion

1 code implementation ICLR 2022 Tim Dockhorn, Arash Vahdat, Karsten Kreis

SGMs rely on a diffusion process that gradually perturbs the data towards a tractable distribution, while the generative model learns to denoise.

Image Generation

Don’t Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence

no code implementations NeurIPS 2021 Tianshi Cao, Alex Bie, Arash Vahdat, Sanja Fidler, Karsten Kreis

Generative models trained with privacy constraints on private data can sidestep this challenge, providing indirect access to private data instead.

EditGAN: High-Precision Semantic Image Editing

1 code implementation NeurIPS 2021 Huan Ling, Karsten Kreis, Daiqing Li, Seung Wook Kim, Antonio Torralba, Sanja Fidler

EditGAN builds on a GAN framework that jointly models images and their semantic segmentations, requiring only a handful of labeled examples, making it a scalable tool for editing.

Segmentation Semantic Segmentation +1

Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence

1 code implementation 1 Nov 2021 Tianshi Cao, Alex Bie, Arash Vahdat, Sanja Fidler, Karsten Kreis

Generative models trained with privacy constraints on private data can sidestep this challenge, providing indirect access to private data instead.

ATISS: Autoregressive Transformers for Indoor Scene Synthesis

1 code implementation NeurIPS 2021 Despoina Paschalidou, Amlan Kar, Maria Shugrina, Karsten Kreis, Andreas Geiger, Sanja Fidler

The ability to synthesize realistic and diverse indoor furniture layouts automatically or based on partial input unlocks many applications, from better interactive 3D tools to data synthesis for training and simulation.

2D Semantic Segmentation task 1 (8 classes) 3D Semantic Scene Completion +1

Causal Scene BERT: Improving object detection by searching for challenging groups

no code implementations 29 Sep 2021 Cinjon Resnick, Or Litany, Amlan Kar, Karsten Kreis, James Lucas, Kyunghyun Cho, Sanja Fidler

We verify that the prioritized groups found via intervention are challenging for the object detector and show that retraining with data collected from these groups helps inordinately compared to adding more IID data.

Autonomous Vehicles object-detection +1

Score-based Generative Modeling in Latent Space

1 code implementation NeurIPS 2021 Arash Vahdat, Karsten Kreis, Jan Kautz

Moving from data to latent space allows us to train more expressive generative models, apply SGMs to non-continuous data, and learn smoother SGMs in a smaller space, resulting in fewer network evaluations and faster sampling.

Ranked #3 on Image Generation on CIFAR-10 (FD metric)

Image Generation

Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes

2 code implementations CVPR 2021 Towaki Takikawa, Joey Litalien, Kangxue Yin, Karsten Kreis, Charles Loop, Derek Nowrouzezahrai, Alec Jacobson, Morgan McGuire, Sanja Fidler

We introduce an efficient neural representation that, for the first time, enables real-time rendering of high-fidelity neural SDFs, while achieving state-of-the-art geometry reconstruction quality.

Differentially Private Generative Models Through Optimal Transport

no code implementations 1 Jan 2021 Tianshi Cao, Alex Bie, Karsten Kreis, Sanja Fidler

Generative models trained with privacy constraints on private data can sidestep this challenge and provide indirect access to the private data instead.

Variational Amodal Object Completion

no code implementations NeurIPS 2020 Huan Ling, David Acuna, Karsten Kreis, Seung Wook Kim, Sanja Fidler

In images of complex scenes, objects are often occluding each other which makes perception tasks such as object detection and tracking, or robotic control tasks such as planning, challenging.

Object object-detection +1

VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models

1 code implementation ICLR 2021 Zhisheng Xiao, Karsten Kreis, Jan Kautz, Arash Vahdat

VAEBM captures the overall mode structure of the data distribution using a state-of-the-art VAE and it relies on its EBM component to explicitly exclude non-data-like regions from the model and refine the image samples.

Image Generation Out-of-Distribution Detection
