Semantic Segmentation

6002 papers with code • 147 benchmarks • 332 datasets

Semantic segmentation is a computer vision task that assigns a class label to every pixel in an image, producing a dense, pixel-wise segmentation map. Example benchmarks for this task include Cityscapes, PASCAL VOC, and ADE20K. Models are usually evaluated with the mean Intersection-over-Union (mean IoU) and pixel accuracy metrics.
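
The two metrics named above can both be derived from a per-class confusion matrix. The following is a minimal sketch in plain Python, not any benchmark's official evaluation script. The function names and the toy label maps are illustrative assumptions, and benchmarks differ in details such as ignored labels and whether IoU is accumulated per image or over the whole dataset.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels; rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """Fraction of pixels whose predicted class equals the ground-truth class."""
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def mean_iou(matrix):
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # class c pixels predicted as something else
        fp = sum(matrix[r][c] for r in range(n)) - tp   # other pixels predicted as class c
        union = tp + fp + fn
        if union > 0:  # assumption: skip classes absent from both label maps
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x3 label maps with three classes (made-up values for illustration).
ground_truth = [[0, 0, 1],
                [1, 2, 2]]
prediction = [[0, 1, 1],
              [1, 2, 0]]

cm = confusion_matrix(ground_truth, prediction, num_classes=3)
acc = pixel_accuracy(cm)   # 4 of 6 pixels are correct
miou = mean_iou(cm)        # per-class IoU: 1/3, 2/3, 1/2
print(f"pixel accuracy = {acc:.3f}, mean IoU = {miou:.3f}")
```

On this toy input the pixel accuracy is 4/6 while the mean IoU is 0.5. The gap shows why the two metrics are reported together: mean IoU weights every class equally, so errors on a rarely occurring class lower it more than they lower pixel accuracy.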

(Image credit: CSAILVision)

Most implemented papers

U-Net: Convolutional Networks for Biomedical Image Segmentation

labmlai/annotated_deep_learning_paper_implementations 18 May 2015

There is large consent that successful training of deep networks requires many thousand annotated training samples.

Deep Residual Learning for Image Recognition

tensorflow/models CVPR 2016

Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.

Mask R-CNN

tensorflow/models ICCV 2017

Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance.

MobileNetV2: Inverted Residuals and Linear Bottlenecks

tensorflow/models CVPR 2018

In this paper we describe a new mobile architecture, MobileNetV2, that improves the state of the art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes.

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

google-research/vision_transformer ICLR 2021

While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited.

MMDetection: Open MMLab Detection Toolbox and Benchmark

open-mmlab/mmdetection 17 Jun 2019

In this paper, we introduce the various features of this toolbox.

FCOS: Fully Convolutional One-Stage Object Detection

tianzhi0549/FCOS ICCV 2019

By eliminating the predefined set of anchor boxes, FCOS completely avoids the complicated computation related to anchor boxes such as calculating overlapping during training.

Feature Pyramid Networks for Object Detection

PaddlePaddle/PaddleOCR CVPR 2017

Feature pyramids are a basic component in recognition systems for detecting objects at different scales.

Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation

tensorflow/models ECCV 2018

Spatial pyramid pooling networks are able to encode multi-scale contextual information by probing the incoming features with filters or pooling operations at multiple rates and multiple effective fields-of-view, while encoder-decoder networks can capture sharper object boundaries by gradually recovering the spatial information.
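
Probing features with filters at multiple rates can be illustrated with a 1-D atrous (dilated) convolution. This is a minimal sketch under stated assumptions, not the paper's implementation: the function name, signal, and kernel are hypothetical, it covers only the atrous part and not the depthwise-separable factorization, and like most deep learning libraries it computes a cross-correlation (the kernel is not flipped).

```python
def atrous_conv1d(signal, kernel, rate):
    """Valid (unpadded) 1-D atrous convolution: kernel taps sit `rate` samples apart."""
    span = (len(kernel) - 1) * rate + 1  # effective field-of-view of one output value
    outputs = []
    for start in range(len(signal) - span + 1):
        outputs.append(sum(w * signal[start + i * rate] for i, w in enumerate(kernel)))
    return outputs


signal = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # made-up input feature row
kernel = [1, 0, -1]                    # same three weights reused at every rate

# Larger rates widen the field-of-view (3, 5, 7 samples) without adding weights.
responses = {rate: atrous_conv1d(signal, kernel, rate) for rate in (1, 2, 3)}
for rate, values in responses.items():
    print(rate, values)
```

With the same three weights, rate 1 compares samples two positions apart, rate 2 four apart, and rate 3 six apart, which is how a fixed-size filter gathers context at several scales.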