53 papers with code • 3 benchmarks • 4 datasets
Scene segmentation is the task of splitting a scene into its various object components.
Image adapted from "Temporally coherent 4D reconstruction of complex dynamic scenes".
We view this work as a notable step towards building a simple procedure to harness unlabeled video sequences and extra images to surpass state-of-the-art performance on core computer vision tasks.
Specifically, we append two types of attention modules on top of a traditional dilated FCN, which model the semantic interdependencies in the spatial and channel dimensions respectively.
Ranked #7 on Semantic Segmentation on COCO-Stuff test
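As a rough illustration of the spatial and channel attention modules mentioned in the entry above, the sketch below computes self-attention over positions and over channels of a single feature map. It is a simplified NumPy sketch, not the authors' implementation: the learned convolutions that would normally produce queries, keys and values are omitted, and the function names and the `scale` parameter are assumptions made for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def spatial_attention(feat, scale=1.0):
    """Each position aggregates features from all positions (hypothetical helper)."""
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)            # (C, N) with N = H * W
    attn = softmax(flat.T @ flat, axis=-1)   # (N, N), every row sums to 1
    out = flat @ attn.T                      # out[:, i] = sum_j attn[i, j] * flat[:, j]
    return (scale * out + flat).reshape(c, h, w)  # residual connection


def channel_attention(feat, scale=1.0):
    """Each channel aggregates features from all channels (hypothetical helper)."""
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)            # (C, N)
    attn = softmax(flat @ flat.T, axis=-1)   # (C, C), every row sums to 1
    out = attn @ flat                        # (C, N)
    return (scale * out + flat).reshape(c, h, w)  # residual connection


if __name__ == "__main__":
    feat = np.random.default_rng(0).normal(size=(4, 3, 5))
    # One simple way to combine the two branches is an element-wise sum.
    fused = spatial_attention(feat) + channel_attention(feat)
    print(fused.shape)
```

In a trained network `scale` would typically be a learned weight; here it is a plain float so the residual path can be checked by setting it to zero.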
We show that SegNet provides good performance with competitive inference time and more memory-efficient inference than other architectures.
Ranked #3 on Scene Segmentation on SUN-RGBD
The computation cost and memory footprint of voxel-based models grow cubically with the input resolution, making it memory-prohibitive to scale up the resolution.
Ranked #1 on 3D Semantic Segmentation on S3DIS (mIoU metric)
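The cubic growth noted in the entry above can be checked with a little arithmetic. The sketch below assumes a dense voxel grid of 32-bit floats with 64 feature channels; both figures are illustrative assumptions, not values taken from the paper.

```python
def voxel_grid_bytes(resolution, channels=1, bytes_per_value=4):
    """Memory of a dense resolution^3 grid (assumes a dense float32 layout)."""
    return resolution ** 3 * channels * bytes_per_value


if __name__ == "__main__":
    for r in (32, 64, 128, 256):
        mib = voxel_grid_bytes(r, channels=64) / 2 ** 20
        print(f"resolution {r:>3}: {mib:8.0f} MiB")
```

Doubling the resolution multiplies the footprint by eight, so a single 64-channel feature grid goes from 8 MiB at resolution 32 to 4096 MiB at resolution 256 under these assumptions.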
By viewing the indices as a function of the feature map, we introduce the concept of "learning to index", and present a novel index-guided encoder-decoder framework where indices are self-learned adaptively from data and are used to guide the downsampling and upsampling stages, without extra training supervision.
Ranked #1 on Scene Segmentation on SUN-RGBD
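To make the idea of indices guiding both downsampling and upsampling concrete, the sketch below uses the fixed special case of 2x2 max pooling, where the index map is a one-hot mask marking the maximum in each block. This is only an illustration under that assumption: in the approach described above the index map would be predicted from the feature map by a learned module, and the function names here are hypothetical.

```python
import numpy as np


def index_map_from_max(x):
    """One-hot map marking the max of each 2x2 block (fixed rule, not learned)."""
    h, w = x.shape
    # Group the four values of every 2x2 block along the last axis.
    blocks = x.reshape(h // 2, 2, w // 2, 2).transpose(0, 2, 1, 3)
    blocks = blocks.reshape(h // 2, w // 2, 4)
    onehot = np.eye(4)[blocks.argmax(axis=-1)]           # (h/2, w/2, 4)
    # Scatter the one-hot vectors back to their spatial positions.
    onehot = onehot.reshape(h // 2, w // 2, 2, 2).transpose(0, 2, 1, 3)
    return onehot.reshape(h, w)


def indexed_downsample(x, index_map):
    """Weight features by the index map, then sum each 2x2 block."""
    h, w = x.shape
    return (x * index_map).reshape(h // 2, 2, w // 2, 2).sum(axis=(1, 3))


def indexed_upsample(y, index_map):
    """Nearest-neighbour upsample, then gate with the same index map."""
    return np.repeat(np.repeat(y, 2, axis=0), 2, axis=1) * index_map


if __name__ == "__main__":
    x = np.array([[1, 2, 0, 0],
                  [3, 4, 0, 5],
                  [9, 1, 2, 2],
                  [0, 0, 2, 8]], dtype=float)
    m = index_map_from_max(x)
    down = indexed_downsample(x, m)   # [[4, 5], [9, 8]]
    up = indexed_upsample(down, m)    # maxima restored at their original positions
    print(down)
    print(up)
```

Replacing `index_map_from_max` with a small network that outputs soft weights per block would turn this fixed rule into a learned one, which is the direction the entry describes.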