Grid Sensitive is a trick for object detection introduced by YOLOv4. When we decode the coordinates $x$ and $y$ of the bounding box center, in the original YOLOv3 we obtain them by
$$ \begin{aligned} x &= s \cdot\left(g_{x}+\sigma\left(p_{x}\right)\right) \\ y &= s \cdot\left(g_{y}+\sigma\left(p_{y}\right)\right) \end{aligned} $$
where $\sigma$ is the sigmoid function, $p_{x}$ and $p_{y}$ are the network's raw predictions, $g_{x}$ and $g_{y}$ are integer grid indices, and $s$ is a scale factor. Since the sigmoid never reaches 0 or 1, $x$ and $y$ can never be exactly equal to $s \cdot g_{x}$ or $s \cdot\left(g_{x}+1\right)$. This makes it difficult to predict the centers of bounding boxes that lie exactly on a grid boundary. We can address this problem by changing the equations to
$$ \begin{aligned} x &= s \cdot\left(g_{x}+\alpha \cdot \sigma\left(p_{x}\right)-(\alpha-1) / 2\right) \\ y &= s \cdot\left(g_{y}+\alpha \cdot \sigma\left(p_{y}\right)-(\alpha-1) / 2\right) \end{aligned} $$
where $\alpha > 1$ stretches the sigmoid's output range so that the offset can reach, and slightly exceed, the cell boundaries. This makes it easier for the model to predict bounding box centers located exactly on a grid boundary. The extra FLOPs added by Grid Sensitive are negligible.
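The two decoding rules above can be compared with a minimal sketch. The function and variable names are illustrative, and $\alpha = 1.05$ is an assumed value for demonstration (the paper does not fix a single $\alpha$ here); with $p = 0$ both rules agree, but only the Grid Sensitive rule can place the center on or past the boundary $s \cdot g$ at finite $p$:

```python
import math

def sigmoid(p):
    """Logistic sigmoid; its output lies strictly inside (0, 1)."""
    return 1.0 / (1.0 + math.exp(-p))

def decode_yolov3(g, p, s):
    """Original YOLOv3 decoding: center = s * (g + sigmoid(p)).
    Because sigmoid(p) never reaches 0 or 1, the center can never
    fall exactly on the grid boundary s*g or s*(g+1)."""
    return s * (g + sigmoid(p))

def decode_grid_sensitive(g, p, s, alpha=1.05):
    """Grid Sensitive decoding: the sigmoid output is scaled by alpha
    and re-centered, so the offset spans (-(alpha-1)/2, (alpha+1)/2)
    and the center can cross the grid boundary at finite p.
    alpha=1.05 is an assumed illustrative value."""
    return s * (g + alpha * sigmoid(p) - (alpha - 1) / 2)
```

For example, with grid index `g = 3`, stride `s = 32`, and a moderately negative logit `p = -4`, plain YOLOv3 decoding stays strictly above the boundary `96.0`, while Grid Sensitive decoding drops below it; at `p = 0` both place the center in the middle of the cell.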
Source: YOLOv4: Optimal Speed and Accuracy of Object Detection
Task | Papers | Share |
---|---|---|
Object Detection | 53 | 27.04% |
Object | 29 | 14.80% |
Real-Time Object Detection | 9 | 4.59% |
Autonomous Driving | 5 | 2.55% |
Semantic Segmentation | 5 | 2.55% |
Domain Adaptation | 3 | 1.53% |
Image Classification | 3 | 1.53% |
Traffic Sign Detection | 3 | 1.53% |
Object Tracking | 3 | 1.53% |