no code implementations • 25 Apr 2020 • Amine Kechaou, Manuel Martinez, Monica Haurilet, Rainer Stiefelhagen
At each iteration, our decoder focuses on the relevant parts of the image using an attention mechanism, and then estimates the object's class and the bounding box coordinates.