One-Stage Object Detection Models

YOLOP is a panoptic driving perception network that handles traffic object detection, drivable area segmentation, and lane detection simultaneously. It is composed of one encoder for feature extraction and three decoders that handle the specific tasks. It can be thought of as a lightweight version of Tesla's HydraNet model for self-driving cars.

A lightweight CNN from Scaled-YOLOv4 is used as the encoder to extract features from the image. These feature maps are then fed to the three decoders to complete their respective tasks. The detection decoder is based on YOLOv4, the best-performing single-stage detection network at the time, for two main reasons: (1) a single-stage detection network is faster than a two-stage one, and (2) the grid-based prediction mechanism of a single-stage detector relates more closely to the other two semantic segmentation tasks, whereas instance segmentation is usually paired with a region-based detector, as in Mask R-CNN. The feature maps output by the encoder incorporate semantic features at different levels and scales, and the segmentation branches use these feature maps to complete pixel-wise semantic prediction.
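The shared-encoder, multi-decoder layout can be summarized in a few lines of PyTorch. The sketch below is illustrative only: the module names (`YOLOPSketch`, `SegHead`), channel sizes, and the toy backbone are hypothetical placeholders, not the actual YOLOP implementation, which uses a CSPDarknet backbone with an SPP+FPN neck and YOLOv4-style detection heads.

```python
import torch
import torch.nn as nn

class SegHead(nn.Module):
    """Illustrative segmentation decoder: projects shared features to
    per-class logits and upsamples back to the input resolution."""
    def __init__(self, in_ch: int, num_classes: int):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, num_classes, kernel_size=1)
        self.up = nn.Upsample(scale_factor=8, mode="bilinear", align_corners=False)

    def forward(self, feat):
        return self.up(self.conv(feat))

class YOLOPSketch(nn.Module):
    """One shared encoder feeding three task-specific decoders, as in YOLOP.
    `encoder` and `detect_head` stand in for the real backbone/neck and
    YOLOv4-style grid head; only the wiring is the point here."""
    def __init__(self, encoder: nn.Module, detect_head: nn.Module, feat_ch: int = 256):
        super().__init__()
        self.encoder = encoder
        self.detect_head = detect_head                    # traffic object detection
        self.da_head = SegHead(feat_ch, num_classes=2)    # drivable area segmentation
        self.lane_head = SegHead(feat_ch, num_classes=2)  # lane line segmentation

    def forward(self, img):
        feat = self.encoder(img)      # a single forward pass through the shared encoder
        return (
            self.detect_head(feat),   # grid-based box predictions
            self.da_head(feat),       # pixel-wise drivable-area map
            self.lane_head(feat),     # pixel-wise lane map
        )

# Hypothetical stand-ins so the sketch runs end to end.
encoder = nn.Sequential(
    nn.Conv2d(3, 256, kernel_size=3, stride=8, padding=1),  # toy stride-8 backbone
    nn.ReLU(),
)
detect_head = nn.Conv2d(256, 3 * 85, kernel_size=1)  # 3 anchors x (4 box + 1 obj + 80 cls)

model = YOLOPSketch(encoder, detect_head)
dets, da_map, lane_map = model(torch.randn(1, 3, 384, 640))
```

The design choice the sketch highlights is that all three tasks share one feature-extraction pass, so the added decoders cost relatively little compared with running three separate networks.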

Source: YOLOP: You Only Look Once for Panoptic Driving Perception

Tasks

Task                    Papers  Share
Object Detection        2       28.57%
Autonomous Vehicles     1       14.29%
Autonomous Driving      1       14.29%
Drivable Area Detection 1       14.29%
Lane Detection          1       14.29%
Multi-Task Learning     1       14.29%
