The Video-based Multimodal Summarization with Multimodal Output (VMSMO) corpus consists of 184,920 document-summary pairs: 180,000 for training, 2,460 for validation, and 2,460 for testing. The task is to generate an appropriate textual summary of an article and to choose a proper cover frame from the video accompanying it.

Source: https://github.com/yingtaomj/VMSMO
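
The split sizes in the description can be sanity-checked with a few lines of Python. This is only a minimal sketch: it assumes the validation and test splits hold 2,460 pairs each (the reading consistent with the stated total of 184,920), and the names `SPLITS`, `TOTAL_PAIRS`, and `split_fractions` are hypothetical, not part of the VMSMO repository.

```python
# Split sizes taken from the dataset description.
# Assumption: validation and test contain 2,460 pairs each.
SPLITS = {"train": 180_000, "validation": 2_460, "test": 2_460}
TOTAL_PAIRS = 184_920  # total stated in the description


def split_fractions(splits):
    """Return each split's share of the total number of pairs."""
    total = sum(splits.values())
    return {name: count / total for name, count in splits.items()}


# The three splits should add up to the stated corpus size.
assert sum(SPLITS.values()) == TOTAL_PAIRS

fractions = split_fractions(SPLITS)
for name, share in fractions.items():
    print(f"{name}: {SPLITS[name]:,} pairs ({share:.2%})")
```

Under that assumption the training split makes up the large majority of the corpus, with the validation and test splits equal in size.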

License


  • Unknown
