ICMR 2018

Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval

niluthpol/multimodal_vtt

Constructing a joint representation that is invariant across different modalities (e.g., video, language) is important in many multimedia applications.

VIDEO RETRIEVAL