1 Jun 2020 • Benet Oriol, Jordi Luque, Ferran Diego, Xavier Giro-i-Nieto
In this work, we propose an effective approach for training unique embedding representations by combining three simultaneous modalities: images, spoken narratives, and textual narratives.
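The abstract does not say how the three modalities are combined, so the following is only an illustrative sketch of one common way to train a shared embedding space, not the authors' method. It assumes each modality has already been encoded into fixed-size vectors by some encoder, and it sums a symmetric InfoNCE-style contrastive loss over the three modality pairs. The function names, the temperature value, and the choice of loss are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def pairwise_contrastive_loss(a, b, temperature=0.1):
    """Symmetric InfoNCE-style loss (an assumed choice, not from the source).

    Row i of `a` and row i of `b` are treated as a matching pair; every
    other row in the batch acts as a negative.
    """
    logits = l2_normalize(a) @ l2_normalize(b).T / temperature

    def cross_entropy_on_diagonal(scores):
        # Numerically stable log-softmax over each row.
        scores = scores - scores.max(axis=1, keepdims=True)
        log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the a-to-b and b-to-a retrieval directions.
    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(logits.T))


def trimodal_loss(image_emb, speech_emb, text_emb, temperature=0.1):
    """Hypothetical combined objective: sum of the three pairwise losses."""
    return (pairwise_contrastive_loss(image_emb, speech_emb, temperature)
            + pairwise_contrastive_loss(image_emb, text_emb, temperature)
            + pairwise_contrastive_loss(speech_emb, text_emb, temperature))
```

Under these assumptions, a batch whose image, speech, and text embeddings agree row by row yields a lower loss than a batch in which one modality is out of order, which is the behaviour a shared embedding space is meant to reward.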