Real-time semantic segmentation is of significant importance for mobile and
robotics-related applications. We propose a computationally efficient
segmentation network, which we term ShuffleSeg...
The proposed architecture is
based on grouped convolution and channel shuffling in its encoder to improve
performance. An ablation study compares different decoding methods,
including Skip architecture, UNet, and Dilation Frontend. Interesting insights
on the speed and accuracy tradeoff are discussed. It is shown that the skip
architecture as the decoding method provides the best compromise for the goal
of real-time performance, while providing adequate accuracy by utilizing
higher resolution feature maps for a more accurate segmentation. ShuffleSeg is
evaluated on CityScapes and compared against state-of-the-art real-time
segmentation networks. It achieves a 2x reduction in GFLOPs while providing an
on-par mean intersection over union of 58.3% on the CityScapes test set.
ShuffleSeg runs at 15.7 frames per second on an NVIDIA Jetson TX2, which gives
it great potential for real-time applications.
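
As an illustration of the channel shuffling used in the encoder, the following
is a minimal sketch of the channel shuffle permutation as it is commonly
defined for grouped convolutions: channels are viewed as a (groups, channels
per group) grid, the grid is transposed, and the result is flattened. The
function names and the list-based representation are assumptions made for
illustration, not the ShuffleSeg implementation.

```python
def channel_shuffle_order(num_channels, groups):
    """Return the channel permutation produced by a channel shuffle.

    Hypothetical helper for illustration: channels are viewed as a
    (groups, channels_per_group) grid, which is transposed and flattened.
    """
    if num_channels % groups != 0:
        raise ValueError("num_channels must be divisible by groups")
    per_group = num_channels // groups
    # Walk the grid column by column: take the i-th channel of every group
    # before moving on to the (i+1)-th channel of every group.
    return [g * per_group + i for i in range(per_group) for g in range(groups)]


def channel_shuffle(channels, groups):
    """Reorder a list of per-channel feature maps (any objects) by shuffling."""
    order = channel_shuffle_order(len(channels), groups)
    return [channels[i] for i in order]


if __name__ == "__main__":
    # Six channels in two groups: (a0, a1, a2) and (b0, b1, b2).
    print(channel_shuffle(["a0", "a1", "a2", "b0", "b1", "b2"], groups=2))
```

After the shuffle, each consecutive block of channels contains channels from
every original group, so the next grouped convolution can mix information
across groups.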