MFNet: Multi-Feature Fusion Network for Real-Time Semantic Segmentation in Road Scenes
Although high-accuracy networks have been applied to semantic segmentation, their inference speeds remain slow. Real-time applications demand a trade-off between accuracy and speed. To address this problem, we propose the Multi-Feature Fusion Network (MFNet), which is capable of efficient real-time prediction. MFNet adopts three branches (attention, semantic, and spatial information) to capture low-level and high-level features. Additionally, MFNet employs asymmetric factorized (AF) blocks to extract local and long-range features. As a result, without any pre-training or post-processing, MFNet, using only 1.34 M parameters, achieves 72.1% mean intersection over union (mIoU) on the Cityscapes test set at a speed of 116 frames per second (FPS) on 512×1024 high-resolution input with a single Titan Xp graphics card. Our network’s performance stands out from other state-of-the-art networks on four datasets (Cityscapes, CamVid, KITTI, and Gatech).
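For readers unfamiliar with the reported metric, the following is a minimal sketch of how mIoU is commonly computed from a confusion matrix. It is not the authors' evaluation code: the function names `confusion_matrix` and `mean_iou` are hypothetical, and the ignore label of 255 is an assumption based on the usual Cityscapes convention for void pixels. Benchmark scores are normally accumulated over the whole test set rather than per image.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Count (ground-truth class, predicted class) pairs over flattened label maps.

    ignore_index=255 is an assumed void label; pixels carrying it are skipped.
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        cm[t][p] += 1  # rows: ground truth, columns: prediction
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN), skipping classes with no pixels."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                         # class c pixels predicted as another class
        fp = sum(cm[r][c] for r in range(n)) - tp    # other-class pixels predicted as c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes; the last pixel is void and is ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
cm = confusion_matrix(pred, target, num_classes=3)
print(f"mIoU = {mean_iou(cm):.4f}")
```

In this toy example the per-class IoUs are 1/2, 2/3, and 1, so the mean is about 0.722.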