Cap2Det: Learning to Amplify Weak Caption Supervision for Object Detection

ICCV 2019 · Keren Ye, Mingda Zhang, Adriana Kovashka, Wei Li, Danfeng Qin, Jesse Berent

Learning to localize and name object instances is a fundamental problem in vision, but state-of-the-art approaches rely on expensive bounding box supervision. While weakly supervised detection (WSOD) methods relax the need for boxes to that of image-level annotations, even cheaper supervision is naturally available in the form of unstructured textual descriptions that users may freely provide when uploading image content...
