Recent CNN based object detectors, either one-stage methods like YOLO, SSD, and RetinaNet, or two-stage detectors like Faster R-CNN, R-FCN and FPN, are usually trying to directly finetune from ImageNet pre-trained models designed for the task of image classification. However, there has been little work discussing the backbone feature extractor specifically designed for the task of object detection... (read more)
PDF