Paint Transformer: Feed Forward Neural Painting with Stroke Prediction

Neural painting refers to the procedure of producing a series of strokes for a given image and non-photo-realistically recreating it using neural networks. While reinforcement learning (RL) based agents can generate a stroke sequence step by step for this task, it is not easy to train a stable RL agent. On the other hand, stroke optimization methods search for a set of stroke parameters iteratively in a large search space; such low efficiency significantly limits their prevalence and practicality. Different from previous methods, in this paper, we formulate the task as a set prediction problem and propose a novel Transformer-based framework, dubbed Paint Transformer, to predict the parameters of a stroke set with a feed-forward network. This way, our model can generate a set of strokes in parallel and obtain the final painting of size 512 × 512 in near real time. More importantly, since there is no dataset available for training the Paint Transformer, we devise a self-training pipeline such that it can be trained without any off-the-shelf dataset while still achieving excellent generalization capability. Experiments demonstrate that our method achieves better painting performance than previous ones with cheaper training and inference costs. Code and models are available.
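
To illustrate the feed-forward, set-prediction formulation described in the abstract, the sketch below shows how a Transformer decoder could map a fixed set of learned stroke queries, conditioned on the target image and the current canvas, to stroke parameters in a single forward pass. This is a minimal illustration under assumed choices: the class name `StrokeSetPredictor`, the layer sizes, and the eight-value stroke parameterization (position, size, rotation, color) are hypothetical and not the authors' released implementation.

```python
# Minimal sketch of a feed-forward stroke-set predictor in the spirit of
# Paint Transformer. All names, dimensions, and the stroke parameterization
# are illustrative assumptions, not the paper's actual architecture.
import torch
import torch.nn as nn


class StrokeSetPredictor(nn.Module):
    def __init__(self, d_model=256, n_heads=8, n_layers=3,
                 n_strokes=8, n_params=8, patch=32):
        super().__init__()
        # Shared CNN encoder turns the target image and the current canvas
        # into grids of feature tokens (downsampled by 8 via three stride-2 convs).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, d_model, 3, stride=2, padding=1),
        )
        tokens = (patch // 8) ** 2  # feature positions per image
        self.pos_emb = nn.Parameter(torch.randn(2 * tokens, d_model) * 0.02)
        # Learned stroke queries are decoded into a *set* of strokes at once,
        # with no step-by-step rollout as in RL-based agents.
        self.queries = nn.Parameter(torch.randn(n_strokes, d_model) * 0.02)
        layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, n_layers)
        self.param_head = nn.Linear(d_model, n_params)  # stroke geometry + color
        self.conf_head = nn.Linear(d_model, 1)          # keep-or-drop confidence

    def forward(self, target, canvas):
        b = target.size(0)
        feats = torch.cat([self.encoder(target), self.encoder(canvas)], dim=2)
        memory = feats.flatten(2).transpose(1, 2) + self.pos_emb  # (B, 2*tokens, d)
        queries = self.queries.unsqueeze(0).expand(b, -1, -1)
        decoded = self.decoder(queries, memory)                   # (B, n_strokes, d)
        params = torch.sigmoid(self.param_head(decoded))          # normalized to [0, 1]
        confidence = torch.sigmoid(self.conf_head(decoded))       # per-stroke validity
        return params, confidence


if __name__ == "__main__":
    model = StrokeSetPredictor()
    target = torch.rand(2, 3, 32, 32)   # image patch to reproduce
    canvas = torch.rand(2, 3, 32, 32)   # current painting state
    params, conf = model(target, canvas)
    print(params.shape, conf.shape)     # (2, 8, 8) strokes, (2, 8, 1) confidences
```

Because the strokes come out as a set in one forward pass, a differentiable renderer can place them on the canvas in parallel, which is what allows near-real-time painting compared with sequential RL agents or iterative stroke optimization.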

