Search Results for author: Wentian Zhao

Found 9 papers, 1 paper with code

Multi-modal Dependency Tree for Video Captioning

no code implementations NeurIPS 2021 Wentian Zhao, Xinxiao Wu, Jiebo Luo

To this end, we propose a novel video captioning method that generates a sentence by first constructing a multi-modal dependency tree and then traversing the constructed tree, where the syntactic structure and semantic relationship in the sentence are represented by the tree topology.

Dependency Parsing Text Generation +1

Boosting Entity-aware Image Captioning with Multi-modal Knowledge Graph

no code implementations 26 Jul 2021 Wentian Zhao, Yao Hu, HeDa Wang, Xinxiao Wu, Jiebo Luo

Entity-aware image captioning aims to describe named entities and events related to the image by utilizing the background knowledge in the associated article.

Graph Attention Image Captioning

MemCap: Memorizing Style Knowledge for Image Captioning

1 code implementation AAAI 2020 Wentian Zhao, Xinxiao Wu, Xiaoxun Zhang

Generating stylized captions for images is a challenging task since it requires not only describing the content of the image accurately but also expressing the desired linguistic style appropriately.

Image Captioning Language Modelling

Improve CAM with Auto-adapted Segmentation and Co-supervised Augmentation

no code implementations 17 Nov 2019 Ziyi Kou, Guofeng Cui, Shaojie Wang, Wentian Zhao, Chenliang Xu

In this paper, we propose a confidence segmentation (ConfSeg) module that builds a confidence score for each pixel in CAM without introducing additional hyper-parameters.

Weakly-Supervised Object Localization

Weakly Supervised Localization Using Background Images

no code implementations 9 Sep 2019 Ziyi Kou, Wentian Zhao, Guofeng Cui, Shaojie Wang

Weakly Supervised Object Localization (WSOL) methods usually rely on fully convolutional networks in order to obtain class activation maps (CAMs) of targeted labels.

Weakly-Supervised Object Localization

Relational Reasoning using Prior Knowledge for Visual Captioning

no code implementations 4 Jun 2019 Jingyi Hou, Xinxiao Wu, Yayun Qi, Wentian Zhao, Jiebo Luo, Yunde Jia

Extensive experiments on the MS-COCO image captioning benchmark and the MSVD video captioning benchmark validate the superiority of our method on leveraging prior commonsense knowledge to enhance relational reasoning for visual captioning.

Image Captioning object-detection +3

GAN-EM: GAN based EM learning framework

no code implementations 2 Dec 2018 Wentian Zhao, Shaojie Wang, Zhihuai Xie, Jing Shi, Chenliang Xu

To overcome this limitation, we propose a GAN based EM learning framework that can maximize the likelihood of images and estimate the latent variables with only the constraint of L-Lipschitz continuity.

Dimensionality Reduction General Classification +1

How to Make a BLT Sandwich? Learning to Reason towards Understanding Web Instructional Videos

no code implementations 2 Dec 2018 Shaojie Wang, Wentian Zhao, Ziyi Kou, Chenliang Xu

Furthermore, we study multiple modalities including description and transcripts for the purpose of boosting video understanding.

Logical Reasoning Question Answering +1
