Adaptive Training Sample Selection, or ATSS, is a method to automatically select positive and negative samples according to the statistical characteristics of objects. It bridges the gap between anchor-based and anchor-free detectors.
For each ground-truth box $g$ in the image, we first collect its candidate positive samples: on each pyramid level, we select the $k$ anchor boxes whose centers are closest to the center of $g$ in terms of L2 distance. With $\mathcal{L}$ feature pyramid levels, the ground-truth box $g$ therefore has $k\times\mathcal{L}$ candidate positive samples. We then compute the IoU between these candidates and $g$, denoted $\mathcal{D}_g$, together with its mean $m_g$ and standard deviation $v_g$. The IoU threshold for this ground-truth is set adaptively to $t_g=m_g+v_g$. Finally, the candidates whose IoU with $g$ is greater than or equal to $t_g$ are kept as the final positive samples.
Notably, ATSS also requires the center of a positive sample to lie inside the corresponding ground-truth box. If an anchor box ends up assigned to multiple ground-truth boxes, it is matched to the one with the highest IoU. All remaining anchor boxes are treated as negative samples. A minimal sketch of the whole procedure is given below.
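The following NumPy sketch illustrates the assignment procedure described above. It is written for illustration under stated assumptions, not taken from the official implementation: the function names `atss_assign` and `pairwise_iou`, the `(x1, y1, x2, y2)` box format, and the per-level anchor list are choices of this sketch, while `k=9` follows the value reported in the paper.

```python
import numpy as np


def atss_assign(anchors_per_level, gt_boxes, k=9):
    """Sketch of ATSS positive-sample selection (illustrative, not official).

    anchors_per_level: list of (N_l, 4) arrays of anchors per pyramid level,
        in (x1, y1, x2, y2) format.
    gt_boxes: (M, 4) array of ground-truth boxes, same format.
    k: number of closest anchors selected per level.
    Returns: (N,) array with the assigned ground-truth index per anchor,
        or -1 for negative anchors.
    """
    anchors = np.concatenate(anchors_per_level, axis=0)        # (N, 4)
    anchor_centers = (anchors[:, :2] + anchors[:, 2:]) / 2     # (N, 2)
    gt_centers = (gt_boxes[:, :2] + gt_boxes[:, 2:]) / 2       # (M, 2)

    ious = pairwise_iou(anchors, gt_boxes)                     # (N, M)
    is_pos = np.zeros_like(ious, dtype=bool)

    # Offsets mapping per-level indices back into the concatenated array.
    level_offsets = np.cumsum([0] + [len(a) for a in anchors_per_level])

    for g in range(len(gt_boxes)):
        # 1) Candidates: the k anchors per level whose centers are closest to g.
        candidate_idxs = []
        for lvl in range(len(anchors_per_level)):
            start, end = level_offsets[lvl], level_offsets[lvl + 1]
            d = np.linalg.norm(anchor_centers[start:end] - gt_centers[g], axis=1)
            topk = np.argsort(d)[: min(k, end - start)]
            candidate_idxs.append(topk + start)
        candidate_idxs = np.concatenate(candidate_idxs)

        # 2) Adaptive IoU threshold t_g = m_g + v_g over the candidates.
        cand_ious = ious[candidate_idxs, g]
        t_g = cand_ious.mean() + cand_ious.std()

        # 3) Keep candidates with IoU >= t_g whose center lies inside g.
        x1, y1, x2, y2 = gt_boxes[g]
        cx = anchor_centers[candidate_idxs, 0]
        cy = anchor_centers[candidate_idxs, 1]
        inside = (cx >= x1) & (cx <= x2) & (cy >= y1) & (cy <= y2)
        keep = (cand_ious >= t_g) & inside
        is_pos[candidate_idxs[keep], g] = True

    # Anchors positive for several ground truths keep the highest-IoU one.
    assigned = np.where(is_pos.any(axis=1),
                        np.where(is_pos, ious, -np.inf).argmax(axis=1),
                        -1)
    return assigned


def pairwise_iou(a, b):
    """IoU between every box in a (N, 4) and every box in b (M, 4)."""
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    lt = np.maximum(a[:, None, :2], b[None, :, :2])
    rb = np.minimum(a[:, None, 2:], b[None, :, 2:])
    wh = np.clip(rb - lt, 0, None)
    inter = wh[..., 0] * wh[..., 1]
    return inter / (area_a[:, None] + area_b[None, :] - inter + 1e-9)
```

Returning a per-anchor ground-truth index (with `-1` for negatives) folds the highest-IoU tie-breaking from the paragraph above directly into the result, which mirrors how many detection codebases expose assignment outputs to the loss computation.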
Source: Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection
Task | Papers | Share
---|---|---
Object Detection | 12 | 31.58%
Object | 6 | 15.79%
Instance Segmentation | 3 | 7.89%
Semantic Segmentation | 3 | 7.89%
Real-time Instance Segmentation | 2 | 5.26%
Decoder | 2 | 5.26%
Dense Object Detection | 2 | 5.26%
Quantization | 1 | 2.63%
Classification | 1 | 2.63%