Scene Understanding

278 papers with code • 4 benchmarks • 39 datasets

Scene understanding is the task of having a machine interpret what a scene contains. For instance, the iPhone has a feature that helps visually impaired users take a photo by describing what the camera sees. This is an example of scene understanding.



Most implemented papers

SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

PaddlePaddle/PaddleSeg 2 Nov 2015

We show that SegNet provides good performance with competitive inference time and more efficient inference memory-wise as compared to other architectures.

Microsoft COCO: Common Objects in Context

PaddlePaddle/PaddleDetection 1 May 2014

We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding.

Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding

alexgkendall/caffe-segnet 9 Nov 2015

Semantic segmentation is an important tool for visual scene understanding and a meaningful measure of uncertainty is essential for decision making.
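As a rough illustration of what a per-pixel uncertainty measure can look like, the sketch below draws several stochastic predictions for one pixel with dropout left on at test time, then reports the mean class probabilities and their variance. This is a toy under stated assumptions, not the authors' implementation: `stochastic_forward` is a hypothetical stand-in for a real network pass, dropout is applied directly to the class scores for brevity, and the sample count and dropout rate are arbitrary.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def stochastic_forward(logits, drop_prob, rng):
    """Hypothetical stand-in for one network pass with dropout kept on.

    Assumption: dropout is applied to the class scores themselves,
    which a real model would not do; it only makes the sketch short.
    """
    keep = 1.0 - drop_prob
    noisy = [x / keep if rng.random() < keep else 0.0 for x in logits]
    return softmax(noisy)


def mc_predict(logits, n_samples=50, drop_prob=0.5, seed=0):
    """Average several stochastic passes for one pixel.

    Returns the predicted label, the mean class probabilities, and the
    per-class variance across samples (the uncertainty estimate).
    """
    rng = random.Random(seed)
    samples = [stochastic_forward(logits, drop_prob, rng) for _ in range(n_samples)]
    n_classes = len(logits)
    mean = [sum(s[c] for s in samples) / n_samples for c in range(n_classes)]
    var = [
        sum((s[c] - mean[c]) ** 2 for s in samples) / n_samples
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=lambda c: mean[c])
    return label, mean, var


# Example class scores for a single pixel (made-up values).
label, mean, var = mc_predict([4.0, 1.0, 0.5])
print(label, [round(p, 3) for p in mean], [round(v, 3) for v in var])
```

A pixel whose samples disagree gets a high variance, which a downstream decision step could treat as low confidence.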

Unified Perceptual Parsing for Scene Understanding

CSAILVision/unifiedparsing ECCV 2018

In this paper, we study a new task called Unified Perceptual Parsing, which requires the machine vision systems to recognize as many visual concepts as possible from a given image.

Digging Into Self-Supervised Monocular Depth Estimation

nianticlabs/monodepth2 4 Jun 2018

Per-pixel ground-truth depth data is challenging to acquire at scale.

LinkNet: Exploiting Encoder Representations for Efficient Semantic Segmentation

qubvel/segmentation_models 14 Jun 2017

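Existing algorithms, though accurate, do not use the parameters of the neural network efficiently. As a result they are huge in terms of parameters and number of operations, and hence slow too.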

ResUNet-a: a deep learning framework for semantic segmentation of remotely sensed data

Nguyendat-bit/U-net 1 Apr 2019

Scene understanding of high resolution aerial images is of great importance for the task of automated monitoring in various remote sensing applications.

Spatial As Deep: Spatial CNN for Traffic Scene Understanding

XingangPan/SCNN 17 Dec 2017

Although CNN has shown strong capability to extract semantics from raw pixels, its capacity to capture spatial relationships of pixels across rows and columns of an image is not fully explored.

DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames

facebookresearch/habitat-api ICLR 2020

We leverage this scaling to train an agent for 2.5 Billion steps of experience (the equivalent of 80 years of human experience) -- over 6 months of GPU-time training in under 3 days of wall-clock time with 64 GPUs.
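The figures in this summary can be sanity-checked with a few lines of arithmetic. The sketch below takes the step count, the 80 years, the 64 GPUs, and the 3-day bound from the text above; the 365.25-day year and the 30-day month are assumptions made only for the conversion.

```python
# Figures quoted in the summary above.
steps = 2.5e9          # steps of experience
human_years = 80       # stated human-experience equivalent
gpus = 64              # number of GPUs
wall_clock_days = 3    # "under 3 days", so this is an upper bound

# Assumption: a 365.25-day year for the conversion to seconds.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Implied amount of human experience per agent step.
seconds_per_step = human_years * SECONDS_PER_YEAR / steps

# Total GPU time if all GPUs run for the full wall-clock bound.
gpu_days = gpus * wall_clock_days
gpu_months = gpu_days / 30.0  # assumption: 30-day month

print(f"{seconds_per_step:.2f} s of experience per step")
print(f"{gpu_days} GPU-days, about {gpu_months:.1f} GPU-months")
```

Under these assumptions the 80-year figure corresponds to roughly one second of experience per step, and 64 GPUs over at most 3 days give at most 192 GPU-days, which is consistent with the "over 6 months of GPU-time" claim.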