Search Results for author: Lanxin Li

Found 2 papers, 2 papers with code

ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation

2 code implementations • 31 Dec 2021 • Han Zhang, Weichong Yin, Yewei Fang, Lanxin Li, Boqiang Duan, Zhihua Wu, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

To explore the landscape of large-scale pre-training for bidirectional text-image generation, we train a 10-billion parameter ERNIE-ViLG model on a large-scale dataset of 145 million (Chinese) image-text pairs which achieves state-of-the-art performance for both text-to-image and image-to-text tasks, obtaining an FID of 7.9 on MS-COCO for text-to-image synthesis and best results on COCO-CN and AIC-ICC for image captioning.

Image Captioning, Quantization, +2 more
