A VQA model that combines two ideas: probabilistic neural symbolic program execution for reasoning, and a deep neural network with 3D generative representations of objects for robust visual scene parsing.
Source: 3D-Aware Visual Question Answering about Parts, Poses and Occlusions

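As a rough illustration of what probabilistic symbolic program execution can look like, the sketch below runs a tiny question program over uncertain scene-parser outputs. It is not the paper's implementation: the `ParsedObject` class, the operator names (`scene`, `filter_category`, `filter_occluded`, `exist`, `count`), the example probabilities, and the assumption that objects are independent are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from math import prod
from typing import Dict, List


@dataclass
class ParsedObject:
    """One object hypothesis from a hypothetical scene parser."""
    p_exists: float                # confidence that the object is really present
    p_category: Dict[str, float]   # distribution over category labels
    p_occluded: float              # probability that the object is occluded


def scene(objects: List[ParsedObject]) -> List[float]:
    # Start the program with one weight per object: its existence probability.
    return [o.p_exists for o in objects]


def filter_category(objects, weights, category):
    # Soft filter: scale each weight by the probability of the requested category.
    return [w * o.p_category.get(category, 0.0) for o, w in zip(objects, weights)]


def filter_occluded(objects, weights):
    # Soft filter: scale each weight by the probability of being occluded.
    return [w * o.p_occluded for o, w in zip(objects, weights)]


def exist(weights):
    # Probability that at least one object survives the filters,
    # assuming (for this sketch only) that objects are independent.
    return 1.0 - prod(1.0 - w for w in weights)


def count(weights):
    # Expected number of objects that survive the filters.
    return sum(weights)


# Made-up parser output for a two-object scene.
objects = [
    ParsedObject(0.9, {"bus": 0.8, "car": 0.2}, 0.5),
    ParsedObject(0.6, {"bus": 0.1, "car": 0.9}, 0.0),
]

# Program for "Is there an occluded bus?"
bus_weights = filter_category(objects, scene(objects), "bus")
p_occluded_bus = exist(filter_occluded(objects, bus_weights))

# Program for "How many buses are there?"
expected_buses = count(bus_weights)

print(f"P(occluded bus exists) = {p_occluded_bus:.2f}")
print(f"Expected number of buses = {expected_buses:.2f}")
```

Each operator passes soft weights instead of hard object sets, so uncertainty in the scene parser carries through to the answer rather than being discarded by an early threshold.
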
Task | Papers | Share |
---|---|---|
Question Answering | 1 | 33.33% |
Visual Question Answering | 1 | 33.33% |
Visual Question Answering (VQA) | 1 | 33.33% |