no code implementations • 25 Mar 2017 • Guillem Collell, Teddy Zhang, Marie-Francine Moens
Integrating visual and linguistic information into a single multimodal representation is an unsolved problem with wide-reaching applications to both natural language processing and computer vision.