The ReferIt dataset contains 130,525 expressions for referring to 96,654 objects in 19,894 images of natural scenes.
Source: BiLingUNet: Image Segmentation by Modulating Top-Down and Bottom-Up Visual Processing with Referring Expressions