Semantic Segmentation

5156 papers with code • 125 benchmarks • 311 datasets

Semantic Segmentation is a computer vision task in which each pixel of an image is assigned to a class or object category. The output is a dense, pixel-wise segmentation map with one label per pixel. Example benchmarks for this task include Cityscapes, PASCAL VOC and ADE20K. Models are usually evaluated with the mean Intersection-over-Union (mIoU) and pixel accuracy metrics.
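
The two metrics named above can both be derived from a per-class confusion matrix. The following is a minimal sketch in plain Python, not the evaluation code of any particular benchmark. The function names, the nested-list image layout, and the `ignore_index=255` convention for unlabeled pixels are assumptions made for illustration; individual benchmarks may differ, for example in how they treat classes that never occur.

```python
def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count pixels per (ground-truth class, predicted class) pair.

    pred and target are 2-D nested lists of integer class ids.
    Rows of the result index the ground truth, columns the prediction.
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if t == ignore_index:  # assumed convention for unlabeled pixels
                continue
            cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of labeled pixels whose predicted class is correct."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[c][c] for c in range(len(cm)))
    return correct / total if total else 0.0


def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # class c pixels predicted as something else
        fp = sum(cm[r][c] for r in range(n)) - tp   # other pixels predicted as class c
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x3 "image" with three classes.
target = [[0, 0, 1], [1, 2, 2]]
pred = [[0, 1, 1], [1, 2, 0]]
cm = confusion_matrix(pred, target, num_classes=3)
print(pixel_accuracy(cm), mean_iou(cm))
```

In the toy example, four of six pixels are correct, and the per-class IoUs are 1/3, 2/3 and 1/2. Mean IoU weights every class equally, so it penalizes errors on rare classes more than pixel accuracy does.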

(Image credit: CSAILVision)

Benchmarks

These leaderboards are used to track progress in Semantic Segmentation.

Dataset	Best Model
ADE20K	ONE-PEACE
NYU Depth v2	OmniVec
Cityscapes test	VLTSeg
ADE20K val	BEiT-3
Cityscapes val	SERNet-Former
PASCAL Context	PlainSeg (EVA-02-L)
S3DIS	PTv3 + PPT
S3DIS Area5	OmniVec
PASCAL VOC 2012 test	DeepLabv3+ (Xception-65-JFT)
SUN-RGBD	TokenFusion (S)
DensePASS	Trans4PASS+ (multi-scale)
ScanNet	PTv3 + PPT
PASCAL VOC 2012 val	EfficientNet-L2+NAS-FPN (single scale test, with self-training)
DADA-seg	MMUDA
Stanford2D3D Panoramic	SFSS-MMSI (RGB+HHA)
ImageNet-S	TEC (ViT-B/16, 224x224, SSL+FT, mmseg)
LaRS	SWIM^2 (Mask2Former)
CamVid	SERNet-Former
COCO-Stuff test	EVA
iSAID	SegNeXt-L
Semantic3D	Feature Geometric Net
ISPRS Potsdam	AerialFormer-B
Trans10K	Trans4Trans (M)
Dark Zurich	Refign (HRDA)
KITTI-360	CMNeXt (RGB-D-E-LiDAR)
MCubeS	MMSFormer (RGB-A-D-N)
DeLiVER	CMNeXt (RGB-D-E-LiDAR)
UrbanLF	CMNeXt (RGB-LF80)
LIP val	Hulk (Finetune, ViT-L)
ScanNetV2	CMX
GTAV-to-Cityscapes Labels	MIC
Nighttime Driving	TADP
LoveDA	ViT-G12X4
EventScape	CMX (B4)
FMB Dataset	MMSFormer (RGB-Infrared)
ISPRS Vaihingen	LSKNet-S
SpaceNet 1	MAE+MTP (ViT-L)
ZJU-RGB-P	ShareCMP (B4 RGB-FP)
INRIA Aerial Image Labeling	UANet (PVT-V2-B2)
LLRGBD-synthetic	SMMCL (SegNeXt-B)
UPLight	ShareCMP (B2 RGB-FP)
MCubeS (P)	MMSFormer (RGB-A-D)
SpectralWaste	CMX (RGB-HYPER)
DDD17	CMNeXt
DSEC	CMNeXt
KITTI Semantic Segmentation	RPVNet [xu2021rpvnet]
SkyScapes-Dense	SkyScapesNet-Dense
FoodSeg103	FoodSAM
SYNTHIA-to-Cityscapes	HRDA + PiPa
SynPASS	Trans4PASS+
SELMA	CMX
Pothole Mix	Baseline - DeepLabv3+
DELIVER	CMNeXt (RGB-D-E-LiDAR)
Mapillary val	AO-SegNet
MS COCO	OneFormer (InternImage-H, emb_dim=1024, single-scale)
Stanford2D3D - RGBD	CMX (SegFormer-B4)
Event-based Segmentation Dataset	Bimodal SegNet
GAMUS	TIMF
ACDC Scribbles	ScribFormer
ShapeNet	PatchFormer
UAVid	LSKNet-S
BIG	PSPNet + CascadePSP
PETRAW	NCC Next
Hypersim	MultiMAE (ViT-B)
Structured3D	SFSS-MMSI (RGB+Depth+Normal)
Matterport3D	SFSS-MMSI (RGB+Depth)
CC3M-TagMask	TTD (TCL)
PASCAL VOC 2011 test	Plugin network
RELLIS-3D Dataset	GA-Nav
PASTIS	Exchanger+Mask2Former
SIFT-flow	RBE2E
Stanford2D3D Panoramic - RGBD	CBFC
Toronto-3D L002	SCF-Net
Montgomery County X-ray Set	UNETR + SS-CXR
dacl10k v1 testdev	FPN EfficientNet-B4 w/ Aux loss
SYNTHIA-CVPR’16	SSMA
Freiburg Forest	SSMA
38-Cloud	Cloud-Net+
PASCAL VOC 2007	GALDNet
SkyScapes-Lane	SkyScapesNet-Lane
Kvasir-Instrument	UNet
Graz-02	VOLO-D5
Cleargrasp (Novel)	Cleargrasp
Cityscapes	SPFNet34M
Endoscapes	MoCo V2 Surg SSL - DeepLabv3+ head
HERA RFI Detection	Nearest Latent Neighbours
LOFAR RFI Detection	Nearest Latent Neighbours
BDD	FasterSeg
COCO-Stuff	Deeplab v2
Cam2BEV	uNetXST
ApolloScape	ERFNet-IntRA-KD (ours)
DroneDeploy	DLv3+ (Xception65)
ManipalUAVid	UVid-Net
Cityscapes VIPriors subset	EfficientSeg
SBCoseg	Dice loss + IS-Triplet loss
PASCAL VOC 2010 test	SIW
PASCAL VOC 2012	DLDL-8s+CRF
COCO-Stuff full	SegFormer-B5 (Single Scale)
PASCAL VOC 2011	DLDL-8s+CRF
AIRS	ICT-Net
WildDash	SIW
OpenEDS	RITnet
SYNTHIA	CGA-Net
PASCAL VOC	SegCLIP
UTFPR-SBD3	EPYNET
DIVA-HisDB	U-Net
ATLANTIS	Erfani et al.
PH2	MFSNet
ISIC 2017	MFSNet
HAM10000	MFSNet
Mila Simulated Floods	FloodTransformer (Ours)
SWIMSEG	ACLNet
SWINSEG	ACLNet
SWINySEG	ACLNet
MixedWM38	WaferSegClassNet
BDD100K val	NiseNet
PASTIS-R	Late Fusion
Cityscapes 3D	TaskPrompter
FLAIR (French Land cover from Aerospace ImageRy)	U-Net baseline
RUGD	GA-Nav
dacl10k v1 testfinal	FPN EfficientNet-B4
SemanticPOSS	TFNet
COCO-Stuff-27	DiffSeg (512)
Forward-Looking Sonar Marine Debris Datasets	Unet+RN34
STARE	UNet

Libraries

Use these libraries to find Semantic Segmentation models and implementations.

Library	Papers	GitHub stars
PaddlePaddle/PaddleSeg	53	8,216
rwightman/pytorch-image-models	33	29,603
osmr/imgclsmob	30	2,916
open-mmlab/mmsegmentation	19	7,353

See all 39 libraries.

Subtasks

Weakly-Supervised Semantic Segmentation

Scene Segmentation

Semi-Supervised Semantic Segmentation

Real-Time Semantic Segmentation

3D Part Segmentation

Unsupervised Semantic Segmentation

Road Segmentation

One-Shot Segmentation

Bird's-Eye View Semantic Segmentation

Crack Segmentation

UNET Segmentation

Universal Segmentation

Class-Incremental Semantic Segmentation

Polyp Segmentation

Vision-Language Segmentation

4D Spatio Temporal Semantic Segmentation

Histopathological Segmentation

Attentive segmentation networks

Text-Line Extraction

Aerial Video Semantic Segmentation

Amodal Panoptic Segmentation

Robust BEV Map Segmentation

Most implemented papers

SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

PaddlePaddle/PaddleSeg • 2 Nov 2015

We show that SegNet provides good performance with competitive inference time and more efficient inference memory-wise as compared to other architectures.

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

microsoft/Swin-Transformer • ICCV 2021

This paper presents a new vision Transformer, called Swin Transformer, that capably serves as a general-purpose backbone for computer vision.

Pyramid Scene Parsing Network

hszhao/PSPNet • CVPR 2017

Scene parsing is challenging for unrestricted open vocabulary and diverse scenes.

PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

yanx27/Pointnet_Pointnet2_pytorch • NeurIPS 2017

By exploiting metric space distances, our network is able to learn local features with increasing contextual scales.

Searching for MobileNetV3

tensorflow/models • ICCV 2019

We achieve new state of the art results for mobile classification, detection and segmentation.

Fully Convolutional Networks for Semantic Segmentation

pochih/fcn-pytorch • CVPR 2015

Convolutional networks are powerful visual models that yield hierarchies of features.

ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation

PaddlePaddle/PaddleSeg • 7 Jun 2016

The ability to perform pixel-wise semantic segmentation in real-time is of paramount importance in mobile applications.

Masked Autoencoders Are Scalable Vision Learners

facebookresearch/mae • CVPR 2022

Our MAE approach is simple: we mask random patches of the input image and reconstruct the missing pixels.
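
The random patch masking described above can be sketched as follows. This is an illustrative plain-Python version, not the code from facebookresearch/mae. The function name, the `seed` argument, and the default `mask_ratio=0.75` are assumptions made for this example, as is the 14 x 14 patch grid in the usage lines.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into a visible set and a masked set.

    mask_ratio is an assumed default for this sketch; the visible patches
    would be fed to the encoder and the masked ones reconstructed.
    """
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)  # random permutation of patch indices
    num_keep = int(num_patches * (1 - mask_ratio))
    keep = sorted(order[:num_keep])     # patches the encoder sees
    masked = sorted(order[num_keep:])   # patches whose pixels are reconstructed
    return keep, masked


# Hypothetical 14 x 14 grid of patches (196 in total).
keep, masked = random_patch_mask(14 * 14, seed=0)
print(len(keep), len(masked))
```

Every patch index lands in exactly one of the two lists, so a reconstruction loss can be restricted to the masked set.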

YOLACT: Real-time Instance Segmentation

dbolya/yolact • ICCV 2019

Then we produce instance masks by linearly combining the prototypes with the mask coefficients.
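
The linear combination mentioned above can be sketched in a few lines. This is an illustrative plain-Python version, not the code from dbolya/yolact. It assumes K prototype maps stored as an H x W x K nested list and one length-K coefficient vector per instance. The function name is hypothetical, and the sigmoid applied after the combination is an assumption of this sketch.

```python
import math


def assemble_masks(prototypes, coefficients):
    """Combine prototype maps into one soft mask per instance.

    prototypes:   H x W x K nested lists (K prototype values per pixel)
    coefficients: N x K nested lists (one coefficient vector per instance)
    Returns N masks, each an H x W nested list of values in (0, 1).
    """
    masks = []
    for coeff in coefficients:
        mask = []
        for row in prototypes:
            out_row = []
            for pixel in row:
                # Linear combination of the K prototype values at this pixel.
                logit = sum(p * c for p, c in zip(pixel, coeff))
                out_row.append(1.0 / (1.0 + math.exp(-logit)))  # assumed sigmoid
            mask.append(out_row)
        masks.append(mask)
    return masks


# One row of two pixels, two prototypes, two instances.
prototypes = [[[1.0, 0.0], [0.0, 1.0]]]
coefficients = [[0.0, 0.0], [2.0, -2.0]]
masks = assemble_masks(prototypes, coefficients)
print(masks)
```

With nested lists replaced by arrays, the same computation is a single matrix product between the flattened prototypes and the transposed coefficient matrix.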

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

tensorflow/models • 2 Jun 2016

ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields-of-views, thus capturing objects as well as image context at multiple scales.
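
The idea of probing one feature layer at several sampling rates can be illustrated in one dimension. This is a simplified plain-Python sketch and not the DeepLab implementation: it works on a 1-D signal instead of a 2-D feature map, reuses one kernel for all rates (in ASPP each branch has its own filters and the branch outputs are fused), and uses the small rates (1, 2, 4) purely as an example. All function names are hypothetical.

```python
def atrous_conv1d(signal, kernel, rate):
    """Dilated ("atrous") 1-D convolution with zero padding.

    Kernel taps are spaced `rate` samples apart, so a larger rate covers a
    wider field of view with the same number of weights. The output has the
    same length as the input.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # tap position for this rate
            if 0 <= j < len(signal):   # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def aspp_1d(signal, kernel, rates=(1, 2, 4)):
    """Apply the kernel at several rates; returns one response per rate."""
    return [atrous_conv1d(signal, kernel, r) for r in rates]


# A single spike shows how far each rate "reaches".
signal = [0, 0, 0, 0, 1, 0, 0, 0, 0]
for rate, response in zip((1, 2, 4), aspp_1d(signal, [1.0, 1.0, 1.0])):
    print(rate, response)
```

For the spike input, rate 1 responds at positions 3 to 5, rate 2 at positions 2, 4 and 6, and rate 4 at positions 0, 4 and 8, which shows the field of view widening with the rate.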
