PointRend

Last updated on Feb 19, 2021

PointRend (R101-FPN, 3×)

Parameters 79 Million
FLOPs 300 Billion
File Size 302.63 MB
Training Data MS COCO
Training Resources 8 NVIDIA V100 GPUs
Training Time

Training Techniques SGD with Momentum, Random Horizontal Flip, Weight Decay
Architecture PointRend, Mask R-CNN, FPN, ResNet
Max Iter 270000
Momentum 0.9
lr sched
Weight Decay 0.0001
FLOPs Input No 100
Backbone Layers 101
Output Resolution 224×224
PointRend (R50-FPN, 1×)

Parameters 60 Million
Backbone Layers 50
File Size 229.95 MB
Training Data MS COCO
Training Resources 8 NVIDIA V100 GPUs
Training Time

Training Techniques SGD with Momentum, Random Horizontal Flip, Weight Decay
Architecture PointRend, Mask R-CNN, FPN, ResNet
ID 164254221
Max Iter 90000
Momentum 0.9
lr sched
Weight Decay 0.0001
Output Resolution 224×224
PointRend (R50-FPN, 1×, Cityscapes)

Parameters 56 Million
FLOPs 464 Billion
File Size 214.44 MB
Training Data Cityscapes
Training Resources 8 NVIDIA V100 GPUs
Training Time

Training Techniques SGD with Momentum, Random Horizontal Flip, Weight Decay
Architecture PointRend, Mask R-CNN, FPN, ResNet
ID 164255101
LR 0.01
Max Iter 24000
Momentum 0.9
lr sched
Weight Decay 0.0001
FLOPs Input No 100
Backbone Layers 50
Output Resolution 224×224
PointRend (R50-FPN, 3×)

Parameters 60 Million
Backbone Layers 50
File Size 229.95 MB
Training Data MS COCO
Training Resources 8 NVIDIA V100 GPUs
Training Time

Training Techniques SGD with Momentum, Random Horizontal Flip, Weight Decay
Architecture PointRend, Mask R-CNN, FPN, ResNet
ID 164955410
Max Iter 270000
Momentum 0.9
lr sched
Weight Decay 0.0001
Output Resolution 224×224
PointRend (X101-FPN, 3×)

Parameters 123 Million
Backbone Layers 101
File Size 471.77 MB
Training Data MS COCO
Training Resources 8 NVIDIA V100 GPUs
Training Time

Training Techniques SGD with Momentum, Random Horizontal Flip, Weight Decay
Architecture PointRend, Mask R-CNN, FPN, ResNeXt
Max Iter 270000
Momentum 0.9
lr sched
Weight Decay 0.0001
Output Resolution 224×224
SemanticFPN + PointRend (R101-FPN)

Parameters 48 Million
Backbone Layers 101
File Size 182.36 MB
Training Data Cityscapes
Training Resources 8 NVIDIA V100 GPUs
Training Time

Training Techniques SGD with Momentum, Random Horizontal Flip, Weight Decay
Architecture PointRend, Mask R-CNN, FPN, SemanticFPN, ResNet
ID 202576688
LR 0.01
Max Iter 65000
Momentum 0.9
Weight Decay 0.0001
Output Resolution 1024×2048
README.md

Summary

PointRend is a module for image segmentation tasks, such as instance and semantic segmentation, that treats segmentation as an image rendering problem in order to efficiently "render" high-quality label maps. It uses a subdivision strategy to adaptively select a non-uniform set of points at which to compute labels. PointRend can be incorporated into popular meta-architectures for both instance segmentation (e.g. Mask R-CNN) and semantic segmentation (e.g. FCN). Its subdivision strategy efficiently computes high-resolution segmentation maps using an order of magnitude fewer floating-point operations than direct, dense computation.

The instance segmentation models listed here build on Mask R-CNN, which extends Faster R-CNN. Faster R-CNN was not designed for pixel-to-pixel alignment between network inputs and outputs. This is evident in how RoIPool, the de facto core operation for attending to instances, performs coarse spatial quantization for feature extraction. To fix the misalignment, Mask R-CNN uses a simple, quantization-free layer, called RoIAlign, that faithfully preserves exact spatial locations.
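The subdivision idea can be illustrated with a small, self-contained toy. This is a sketch under stated assumptions, not the detectron2 implementation: it assumes a binary mask stored as a grid of foreground probabilities, uses nearest-neighbour upsampling for brevity, treats the cells whose probability is closest to 0.5 as the "uncertain" ones, and replaces the learned point head with a hypothetical `point_predictor` callback. The names `upsample_2x`, `most_uncertain_points` and `subdivide` are illustrative only.

```python
from typing import Callable, List, Tuple

Grid = List[List[float]]  # 2D grid of foreground probabilities in [0, 1]


def upsample_2x(grid: Grid) -> Grid:
    """Nearest-neighbour 2x upsampling (a simplification chosen for this sketch)."""
    out: Grid = []
    for row in grid:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))  # separate copy so rows can be edited independently
    return out


def most_uncertain_points(grid: Grid, num_points: int) -> List[Tuple[int, int]]:
    """Return the (row, col) cells whose probability is closest to 0.5."""
    scored = [
        (abs(p - 0.5), r, c)
        for r, row in enumerate(grid)
        for c, p in enumerate(row)
    ]
    scored.sort()  # smallest distance to 0.5 first; ties broken by position
    return [(r, c) for _, r, c in scored[:num_points]]


def subdivide(
    coarse: Grid,
    point_predictor: Callable[[float, float], float],
    steps: int,
    num_points: int,
) -> Grid:
    """Refine a coarse grid by re-predicting only the most uncertain cells."""
    grid = coarse
    for _ in range(steps):
        grid = upsample_2x(grid)
        height, width = len(grid), len(grid[0])
        for r, c in most_uncertain_points(grid, num_points):
            # Normalised coordinates of the cell centre, each in (0, 1).
            y, x = (r + 0.5) / height, (c + 0.5) / width
            grid[r][c] = point_predictor(y, x)
    return grid


if __name__ == "__main__":
    # Hypothetical stand-in for a point head: a disc centred in the image.
    def disc(y: float, x: float) -> float:
        return 1.0 if (x - 0.5) ** 2 + (y - 0.5) ** 2 < 0.16 else 0.0

    refined = subdivide([[0.1, 0.6], [0.4, 0.9]], disc, steps=3, num_points=8)
    for row in refined:
        print("".join("#" if p >= 0.5 else "." for p in row))
```

In this toy, each step evaluates the predictor at only `num_points` cells instead of every cell of the upsampled grid, which is where the savings over dense prediction come from.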

Quick start and visualization

This Colab Notebook tutorial contains examples of PointRend usage and visualizations of its point sampling stages.

Training

To train a model with 8 GPUs, run:

cd /path/to/detectron2/projects/PointRend
python train_net.py --config-file configs/InstanceSegmentation/pointrend_rcnn_R_50_FPN_1x_coco.yaml --num-gpus 8

Evaluation

Model evaluation can be done similarly:

cd /path/to/detectron2/projects/PointRend
python train_net.py --config-file configs/InstanceSegmentation/pointrend_rcnn_R_50_FPN_1x_coco.yaml --eval-only MODEL.WEIGHTS /path/to/model_checkpoint
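When the training and evaluation commands are launched from a script, it can help to assemble them in one place. The following is a minimal sketch that uses only the flags shown above (`--config-file`, `--num-gpus`, `--eval-only`, `MODEL.WEIGHTS`). The helpers `build_command` and `run_in_project` are hypothetical names introduced here for illustration, and the paths are the same placeholders used in the commands above.

```python
import subprocess
from typing import List, Optional

# Config path taken from the commands above.
CONFIG = "configs/InstanceSegmentation/pointrend_rcnn_R_50_FPN_1x_coco.yaml"


def build_command(
    config_file: str,
    num_gpus: Optional[int] = None,
    eval_weights: Optional[str] = None,
) -> List[str]:
    """Assemble a train_net.py invocation as an argument list (hypothetical helper)."""
    cmd = ["python", "train_net.py", "--config-file", config_file]
    if num_gpus is not None:
        cmd += ["--num-gpus", str(num_gpus)]
    if eval_weights is not None:
        # Evaluation mode: load the given checkpoint instead of training.
        cmd += ["--eval-only", "MODEL.WEIGHTS", eval_weights]
    return cmd


def run_in_project(cmd: List[str], project_dir: str) -> None:
    """Run the command from the PointRend project directory (placeholder path)."""
    subprocess.run(cmd, cwd=project_dir, check=True)


if __name__ == "__main__":
    print(" ".join(build_command(CONFIG, num_gpus=8)))
    print(" ".join(build_command(CONFIG, eval_weights="/path/to/model_checkpoint")))
```

Passing the arguments as a list avoids shell quoting problems when the checkpoint path contains spaces.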

Citation

@InProceedings{kirillov2019pointrend,
  title={{PointRend}: Image Segmentation as Rendering},
  author={Alexander Kirillov and Yuxin Wu and Kaiming He and Ross Girshick},
  journal={ArXiv:1912.08193},
  year={2019}
}

Results

Instance Segmentation on COCO minival

Model                                  Mask AP
PointRend (X101-FPN, 3×)               41.1
PointRend (R101-FPN, 3×)               40.1
PointRend (R50-FPN, 3×)                38.3
PointRend (R50-FPN, 1×)                36.2

Semantic Segmentation on Cityscapes val

Model                                  mIoU
SemanticFPN + PointRend (R101-FPN)     78.9

Instance Segmentation on Cityscapes val

Model                                  Mask AP
PointRend (R50-FPN, 1×, Cityscapes)    35.9
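
For readers unfamiliar with the mIoU metric reported for the semantic segmentation model, the sketch below shows the basic calculation on flat lists of per-pixel class labels. It is a simplified illustration, not the Cityscapes evaluation script: it assumes a single image, does not handle ignored or void labels, and skips classes that appear in neither the prediction nor the ground truth. The function name `mean_iou` is introduced here for illustration, and the result is a fraction rather than a percentage.

```python
from typing import List, Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union across classes for per-pixel labels."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")
    ious: List[float] = []
    for cls in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
        union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Class 0: IoU 1/2, class 1: IoU 2/3, so the mean is 7/12.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```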