Wukong

Introduced by Gu et al. in Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark

Wukong is a large-scale Chinese cross-modal dataset for benchmarking multi-modal pre-training methods, intended to facilitate Vision-Language Pre-training (VLP) research. It contains 100 million Chinese image-text pairs collected from the web. The base query list used for collection is filtered according to the frequency of Chinese words and phrases.

Similar Datasets

TextCaps

XTD10

Flickr30k-CNA

COCO-CN

License

Unknown

Modalities

Images
Texts

Languages

Chinese