no code implementations • 10 Mar 2022 • Junjie Shen, Ningfei Wang, Ziwen Wan, Yunpeng Luo, Takami Sato, Zhisheng Hu, Xinyang Zhang, Shengjian Guo, Zhenyu Zhong, Kang Li, Ziming Zhao, Chunming Qiao, Qi Alfred Chen
In this paper, we perform the first systematization of knowledge of this growing semantic AD AI security research space.
Then, we build a BERT-based language model to extract language context and propose an Adaptive-Attention (AA) module on top of a transformer decoder to adaptively measure the contribution of visual and language cues before making decisions for word prediction.
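The idea of weighing visual against language cues can be illustrated with a small sketch. This is not the authors' AA module: it assumes one common formulation in which the language signal is appended as an extra attention candidate next to the visual region features, so that its attention weight (called `beta` here) indicates how much the prediction leans on language context. The function names, the scaled dot-product scoring, and the plain-list vectors are all illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def adaptive_attention(query, visual_feats, language_feat):
    """Attend over visual regions plus one language candidate (hypothetical sketch).

    query:         decoder hidden state, list of floats
    visual_feats:  list of region feature vectors, same dimension as query
    language_feat: one language-context vector, same dimension as query

    Returns (context, beta), where beta is the attention weight on the
    language candidate and 1 - beta is the total weight on visual regions.
    """
    candidates = visual_feats + [language_feat]
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, c) / scale for c in candidates])
    dim = len(query)
    # Weighted sum of all candidates, visual and language alike.
    context = [sum(w * c[i] for w, c in zip(weights, candidates)) for i in range(dim)]
    beta = weights[-1]
    return context, beta
```

When the query aligns with the language vector, `beta` approaches 1 and the context is dominated by language; when it aligns with a visual region, `beta` shrinks and that region dominates instead.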
Descriptive region features extracted by object detection networks have played an important role in the recent advancements of image captioning.
The latter contains a Global Adaptive Controller that can adaptively fuse the global information into the decoder to guide the caption generation.
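As a rough illustration of adaptively fusing global information into a decoder state, the following sketch uses an element-wise sigmoid gate. It is an assumption-laden toy, not the Global Adaptive Controller described in the paper: the name `fuse_global`, the per-dimension gate weights `gate_w_h` and `gate_w_g`, and the residual form `h + gate * g` are all hypothetical choices made for brevity.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse_global(hidden, global_feat, gate_w_h, gate_w_g):
    """Gated fusion of a global feature vector into a decoder hidden state.

    All four arguments are equal-length lists of floats. The gate weights
    are hypothetical per-dimension parameters; a real model would typically
    learn full weight matrices instead.
    """
    fused = []
    for h, g, wh, wg in zip(hidden, global_feat, gate_w_h, gate_w_g):
        # The gate depends on both inputs, so the amount of global
        # information admitted varies per dimension and per decoding step.
        gate = sigmoid(wh * h + wg * g)
        fused.append(h + gate * g)
    return fused
```

With all gate weights at zero the gate is exactly 0.5 everywhere, so half of the global vector is added to the hidden state; strongly negative gate pre-activations close the gate and leave the hidden state nearly unchanged.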