no code implementations • ACM Multimedia Asia 2022 • Ruichao Fan, Hanli Wang, Jinjing Gu, and Xianhui Liu
Because no ground-truth topic information is available, a pre-trained BERT model is used to mine topics from the visual content and the annotated stories.
Ranked #2 on Visual Storytelling on VIST