While convolutions are known to be invariant to (discrete) translations, scaling continues to be a challenge and most image recognition networks are not invariant to them. To explore these effects, we have created the Scaled and Translated Image Recognition (STIR) dataset. This dataset contains objects of size $s \in [17, 64]$, each randomly placed in a $64 \times 64$ pixel image.
Paper | Code | Results | Date | Stars |
---|