TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Image-to-Image Translation	ADE20K Labels-to-Photos	DP-SIMS (ConvNext-L)	mIoU	54.3	# 1
Image-to-Image Translation	ADE20K Labels-to-Photos	DP-SIMS (ConvNext-L)	FID	22.7	# 1
Image-to-Image Translation	Cityscapes Labels-to-Photo	DP-SIMS (ConvNext-L)	mIoU	76.3	# 1
Image-to-Image Translation	Cityscapes Labels-to-Photo	DP-SIMS (ConvNext-L)	FID	38.2	# 1
Image-to-Image Translation	COCO-Stuff Labels-to-Photos	DP-SIMS (ConvNext-XL)	FID	13.3	# 1
Image-to-Image Translation	COCO-Stuff Labels-to-Photos	DP-SIMS (ConvNext-L)	FID	13.6	# 2

Unlocking Pre-trained Image Backbones for Semantic Image Synthesis

20 Dec 2023 · Tariq Berrada, Jakob Verbeek, Camille Couprie, Karteek Alahari

Semantic image synthesis, i.e., generating images from user-provided semantic label maps, is an important conditional image generation task because it allows control over both the content and the spatial layout of generated images. Although diffusion models have pushed the state of the art in generative image modeling, the iterative nature of their inference process makes them computationally demanding. Other approaches such as GANs are more efficient, as they need only a single feed-forward pass for generation, but their image quality tends to suffer on large and diverse datasets. In this work, we propose a new class of GAN discriminators for semantic image synthesis that enables the generation of highly realistic images by exploiting feature backbone networks pre-trained for tasks such as image classification. We also introduce a new generator architecture that offers better context modeling and uses cross-attention to inject noise into latent variables, leading to more diverse generated images. Our model, which we dub DP-SIMS, achieves state-of-the-art results in terms of image quality and consistency with the input label maps on ADE-20K, COCO-Stuff, and Cityscapes, surpassing recent diffusion models while requiring two orders of magnitude less compute for inference.
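
To make the idea of injecting noise through cross-attention more concrete, here is a minimal NumPy sketch of generic cross-attention in which latent tokens act as queries and noise tokens act as keys and values. It is an illustration only, not the DP-SIMS generator: the function name `cross_attention_inject`, the residual update, the random projection matrices, and all shapes are assumptions made for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention_inject(latents, noise_tokens, w_q, w_k, w_v):
    """Hypothetical helper: update latents by attending to noise tokens.

    latents:      (num_latents, dim) array, used as queries.
    noise_tokens: (num_noise, dim) array, used as keys and values.
    w_q, w_k, w_v: (dim, dim) projection matrices (random here, learned in practice).
    """
    q = latents @ w_q
    k = noise_tokens @ w_k
    v = noise_tokens @ w_v
    # Scaled dot-product attention: one row of weights per latent token.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)
    # Residual update (an assumption of this sketch).
    return latents + attn @ v, attn


rng = np.random.default_rng(0)
dim = 8
latents = rng.normal(size=(16, dim))
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) / np.sqrt(dim) for _ in range(3))

# Two different noise draws applied to the same latents.
out_a, attn_a = cross_attention_inject(latents, rng.normal(size=(4, dim)), w_q, w_k, w_v)
out_b, _ = cross_attention_inject(latents, rng.normal(size=(4, dim)), w_q, w_k, w_v)
```

In this toy setting the two noise draws produce different outputs for identical latents, which is the mechanism by which noise can diversify generated images for a fixed label map.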

Code

No code implementations yet.

Tasks

Conditional Image Generation

Image Classification

Image Generation

Image-to-Image Translation

Datasets

Cityscapes

ADE20K

COCO-Stuff

Results from the Paper

Ranked #1 on Image-to-Image Translation on ADE20K Labels-to-Photos
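
For readers unfamiliar with the mIoU metric reported for this paper, the following is a minimal pure-Python sketch of mean intersection-over-union between a predicted and a reference label map. In semantic image synthesis the prediction typically comes from a pre-trained segmentation network applied to the generated image, with the input label map as the reference; that protocol, the function name `mean_iou`, and the `ignore_index` parameter are assumptions of this sketch rather than details taken from this page. Leaderboard values such as 54.3 are percentages, whereas the function returns a fraction.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean IoU over classes that appear in pred or target.

    pred, target: flat sequences of integer class labels of equal length.
    Pixels whose target label equals ignore_index are skipped.
    Predicted labels are assumed to lie in [0, num_classes).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            # Pixel counted once in both intersection and union of class t.
            intersection[t] += 1
            union[t] += 1
        else:
            # Mismatch enlarges the union of both classes involved.
            union[t] += 1
            union[p] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with two classes and four pixels.
score = mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 1], num_classes=2)
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so `score` is their average, 7/12.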

Methods

Diffusion
