Object Detection Models

Sparse R-CNN is a purely sparse method for object detection in images: it neither enumerates object candidates over all (dense) image grid locations nor has object queries interact with global (dense) image features.

As shown in the figure, object candidates are given as a small, fixed set of learnable bounding boxes, each represented by 4-d coordinates. For the COCO dataset, for example, 100 boxes and 400 parameters are needed in total, in place of the boxes a Region Proposal Network (RPN) predicts from hundreds of thousands of candidates. These sparse candidates are used as proposal boxes to extract Region of Interest (RoI) features via RoIPool or RoIAlign.
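The parameter count and the RoI feature extraction step can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the names `proposal_boxes`, `to_corners`, and `roi_max_pool` are hypothetical, the boxes are ordinary floats rather than trainable parameters, the normalized (cx, cy, w, h) layout and the whole-image starting value are assumptions, and the pooling is a simplified RoIPool over a toy single-channel feature map.

```python
import math

NUM_PROPOSALS = 100  # the figure quoted above for COCO

# Each proposal is a normalized (cx, cy, w, h) box. In a real detector these
# would be trainable parameters; here they are plain floats. Starting every
# box at the whole image is an assumption of this sketch.
proposal_boxes = [[0.5, 0.5, 1.0, 1.0] for _ in range(NUM_PROPOSALS)]
num_parameters = sum(len(box) for box in proposal_boxes)  # 100 boxes * 4 coords


def to_corners(box, width, height):
    """Convert normalized (cx, cy, w, h) to (x0, y0, x1, y1) in feature-map cells."""
    cx, cy, w, h = box
    x0 = max(0.0, (cx - w / 2) * width)
    y0 = max(0.0, (cy - h / 2) * height)
    x1 = min(float(width), (cx + w / 2) * width)
    y1 = min(float(height), (cy + h / 2) * height)
    return x0, y0, x1, y1


def roi_max_pool(feature_map, box, out_size=2):
    """Simplified RoIPool: take the max inside each bin of an out_size x out_size grid."""
    height, width = len(feature_map), len(feature_map[0])
    x0, y0, x1, y1 = to_corners(box, width, height)
    bin_w = (x1 - x0) / out_size
    bin_h = (y1 - y0) / out_size
    pooled = []
    for i in range(out_size):
        # Rows covered by bin i, clamped so every bin holds at least one cell.
        r0 = min(math.floor(y0 + i * bin_h), height - 1)
        r1 = min(max(r0 + 1, math.ceil(y0 + (i + 1) * bin_h)), height)
        row = []
        for j in range(out_size):
            # Columns covered by bin j, clamped the same way.
            c0 = min(math.floor(x0 + j * bin_w), width - 1)
            c1 = min(max(c0 + 1, math.ceil(x0 + (j + 1) * bin_w)), width)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# Toy 4x4 single-channel feature map whose cell value is row * 4 + column.
feature_map = [[r * 4 + c for c in range(4)] for r in range(4)]

# One fixed-size RoI feature per proposal box.
roi_features = [roi_max_pool(feature_map, box) for box in proposal_boxes]
```

Every proposal yields a feature of the same fixed size no matter how large its box is, which is what lets a small set of boxes stand in for dense candidates. RoIAlign would replace the integer bin boundaries used here with bilinear sampling at fractional positions.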

Source: Sparse R-CNN: End-to-End Object Detection with Learnable Proposals
