no code implementations • 20 May 2023 • Yufeng He, Zefan Cai, Xu Gan, Baobao Chang
Our method transforms discrete tokens in a natural way and applies continuous diffusion to them to successfully fuse extracted image features for diffusion-based caption generation.
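
As a rough illustration of what "continuous diffusion on discrete tokens" can look like, the sketch below embeds token ids as real-valued vectors, applies a Gaussian forward noising step, and rounds the vectors back to their nearest tokens. It is not the authors' implementation. The toy vocabulary, the embedding table, the linear beta schedule, and all function names are assumptions made for illustration. The noising rule is the standard DDPM-style forward process, which the text above does not confirm as the paper's exact choice. The learned denoiser and the image-feature fusion are omitted entirely.

```python
import math
import random

# Hypothetical toy vocabulary and random embedding table (not from the paper).
VOCAB = ["a", "dog", "runs", "on", "grass"]
DIM = 4
_init_rng = random.Random(0)
EMBED = [[_init_rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in VOCAB]


def embed(token_ids):
    """Map discrete token ids to continuous vectors (the clean sample x_0)."""
    return [list(EMBED[i]) for i in token_ids]


def alpha_bar(t, num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_s) for s = 1..t under a linear beta schedule."""
    prod = 1.0
    for s in range(1, t + 1):
        beta = beta_start + (beta_end - beta_start) * (s - 1) / max(num_steps - 1, 1)
        prod *= 1.0 - beta
    return prod


def forward_noise(x0, t, num_steps, rng):
    """Sample x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps, with eps ~ N(0, I)."""
    ab = alpha_bar(t, num_steps)
    signal, noise = math.sqrt(ab), math.sqrt(1.0 - ab)
    return [[signal * v + noise * rng.gauss(0.0, 1.0) for v in row] for row in x0]


def round_to_tokens(x):
    """Map each vector to the id of its nearest embedding (squared Euclidean distance)."""
    ids = []
    for row in x:
        dists = [sum((a - b) ** 2 for a, b in zip(row, e)) for e in EMBED]
        ids.append(dists.index(min(dists)))
    return ids


if __name__ == "__main__":
    rng = random.Random(1)
    token_ids = [0, 1, 2, 3, 4]          # "a dog runs on grass"
    x0 = embed(token_ids)
    x_noisy = forward_noise(x0, t=500, num_steps=1000, rng=rng)
    print([VOCAB[i] for i in round_to_tokens(x0)])
    print([VOCAB[i] for i in round_to_tokens(x_noisy)])
```

In a full captioning model, a trained network would reverse the noising step by step, conditioned on the extracted image features, before the final rounding back to tokens. That part is left out here because the text above gives no detail about it.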