Scene Parsing
75 papers with code • 2 benchmarks • 4 datasets
Scene parsing segments and parses an image into regions associated with semantic categories, such as sky, road, person, and bed.
Latest papers
BORM: Bayesian Object Relation Model for Indoor Scene Recognition
First, we utilize an improved object model (IOM) as a baseline that enriches object knowledge by introducing a scene-parsing algorithm pretrained on the ADE20K dataset, whose rich object categories are relevant to indoor scenes.
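One way such object knowledge is typically extracted downstream of a scene parser: the per-pixel semantic label map is reduced to an object-occurrence vector describing the scene. A minimal sketch of that post-processing step (the pretrained parser itself is assumed; the class names here are hypothetical):

```python
import numpy as np

def object_distribution(label_map, num_classes):
    """Turn a per-pixel semantic label map (e.g. output of a scene
    parser pretrained on ADE20K) into a normalized object-occurrence
    vector that summarizes which objects fill the scene."""
    counts = np.bincount(label_map.ravel(), minlength=num_classes)
    return counts / counts.sum()

# Toy 4x4 label map with three hypothetical classes: 0=wall, 1=bed, 2=lamp
labels = np.array([[0, 0, 1, 1],
                   [0, 1, 1, 1],
                   [0, 1, 1, 2],
                   [0, 0, 2, 2]])
dist = object_distribution(labels, num_classes=3)
```

The resulting vector sums to one and can be fed to a scene-recognition head as a compact object-based scene descriptor.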
Global Aggregation then Local Distribution for Scene Parsing
Modelling long-range contextual relationships is critical for pixel-wise prediction tasks such as semantic segmentation.
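The core of long-range context modelling is letting every spatial position aggregate features from all others. A simplified non-local (self-attention) operation illustrates the idea; this is a sketch of the general mechanism, not the paper's exact global-aggregation/local-distribution module:

```python
import numpy as np

def global_context(feat):
    """Aggregate long-range context: each position attends to every
    other position via a softmax over pairwise feature affinities.
    feat: (C, H, W) feature map."""
    C, H, W = feat.shape
    x = feat.reshape(C, H * W)               # flatten spatial dims
    attn = x.T @ x                           # (HW, HW) pairwise affinities
    attn = np.exp(attn - attn.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)  # softmax over all positions
    out = x @ attn.T                         # globally aggregated features
    return out.reshape(C, H, W)
```

Because the affinity matrix is HW x HW, the cost is quadratic in image size, which is why fast scene parsers approximate or restrict this step.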
Resource Efficient Mountainous Skyline Extraction using Shallow Learning
We present a novel mountainous skyline detection approach in which we adapt a shallow learning method to learn a set of filters that discriminate between edges belonging to the sky-mountain boundary and those arising from other regions.
Part-aware Panoptic Segmentation
In this work, we introduce the new scene understanding task of Part-aware Panoptic Segmentation (PPS), which aims to understand a scene at multiple levels of abstraction, and unifies the tasks of scene parsing and part parsing.
Fast and Accurate Scene Parsing via Bi-direction Alignment Networks
Motivated by this, we propose a novel network by aligning two-path information into each other through a learned flow field.
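Aligning one feature path to another through a learned flow field boils down to bilinear warping: each position samples the source map at an offset given by the flow. A self-contained sketch of that warping step (the network that predicts the flow is assumed):

```python
import numpy as np

def warp(feat, flow):
    """Warp a feature map by a per-pixel flow field with bilinear
    sampling -- the basic operation behind flow-based feature alignment.
    feat: (C, H, W) features; flow: (2, H, W) giving (dy, dx) offsets."""
    C, H, W = feat.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    sy = np.clip(ys + flow[0], 0, H - 1)     # sample coordinates,
    sx = np.clip(xs + flow[1], 0, W - 1)     # clamped to the border
    y0, x0 = np.floor(sy).astype(int), np.floor(sx).astype(int)
    y1, x1 = np.minimum(y0 + 1, H - 1), np.minimum(x0 + 1, W - 1)
    wy, wx = sy - y0, sx - x0                # bilinear weights
    return ((1 - wy) * (1 - wx) * feat[:, y0, x0]
            + (1 - wy) * wx * feat[:, y0, x1]
            + wy * (1 - wx) * feat[:, y1, x0]
            + wy * wx * feat[:, y1, x1])
```

A zero flow field returns the input unchanged; in a bi-directional design, each path is warped toward the other before fusion.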
Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning
We further propose a novel geometry solving approach with formal language and symbolic reasoning, called Interpretable Geometry Problem Solver (Inter-GPS).
Editable Free-viewpoint Video Using a Layered Neural Representation
Such a layered representation supports full perception and realistic manipulation of the dynamic scene while still supporting a free viewing experience over a wide range of viewpoints.
3D-to-2D Distillation for Indoor Scene Parsing
First, we distill 3D knowledge from a pretrained 3D network to supervise a 2D network, which learns to simulate 3D features from 2D features during training, so the 2D network can run inference without requiring 3D data.
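The supervision signal in this kind of feature distillation is commonly a simple L2 penalty pulling the student's features toward the teacher's. A minimal sketch, assuming the 2D and 3D features have already been projected into a common shape (the projection itself is part of the method and is omitted here):

```python
import numpy as np

def distill_loss(feat_2d, feat_3d):
    """L2 distillation loss: push the 2D network's features toward
    the (aligned) features of a pretrained, frozen 3D network."""
    return np.mean((feat_2d - feat_3d) ** 2)
```

During training this term is added to the usual segmentation loss; at test time the 3D branch is dropped entirely.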
Evidential fully convolutional network for semantic segmentation
We propose a hybrid architecture composed of a fully convolutional network (FCN) and a Dempster-Shafer layer for image semantic segmentation.
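The fusion step such a Dempster-Shafer layer builds on is Dempster's rule of combination, which merges two mass functions over subsets of a frame of discernment and renormalizes away the conflicting mass. An illustrative pure-Python implementation of the rule (not the paper's learned layer):

```python
def dempster_combine(m1, m2):
    """Dempster's rule of combination. m1, m2 map frozensets of
    hypotheses (e.g. semantic classes) to belief masses; conflicting
    mass (empty intersections) is discarded and the rest renormalized."""
    combined = {}
    conflict = 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + ma * mb
            else:
                conflict += ma * mb
    norm = 1.0 - conflict
    return {s: v / norm for s, v in combined.items()}

# Two sources of evidence over the hypothetical classes {sky, road}
m1 = {frozenset({"sky"}): 0.6, frozenset({"sky", "road"}): 0.4}
m2 = {frozenset({"sky"}): 0.5, frozenset({"road"}): 0.3,
      frozenset({"sky", "road"}): 0.2}
fused = dempster_combine(m1, m2)
```

Unlike a softmax, mass on non-singleton sets lets the model express "sky or road, not sure which", which is the appeal of evidential segmentation.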
AttaNet: Attention-Augmented Network for Fast and Accurate Scene Parsing
In this paper, we propose a new model, called Attention-Augmented Network (AttaNet), to capture both global context and multilevel semantics while keeping the efficiency high.