The M5Product dataset is a large-scale multi-modal pre-training dataset for E-products, with both coarse-grained and fine-grained annotations.

• 6 million multi-modal samples, with 5k properties and 24 million values

• 5 modalities: image, text, table, video, and audio

• 6 million category annotations across 6k classes

• Wide range of data sources (provided by 1 million merchants)
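
To make the five-modality structure listed above concrete, here is a minimal sketch of how one sample could be represented in Python. The class name `ProductSample`, all field names, and the idea that a sample may lack some modalities are assumptions made for illustration; they are not the dataset's actual schema or file layout.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# The five modalities named in the dataset description.
MODALITIES = ("image", "text", "table", "video", "audio")


@dataclass
class ProductSample:
    """Hypothetical container for one multi-modal product sample."""

    category: str                            # coarse-grained class label
    image_path: Optional[str] = None         # path to the product image
    text: Optional[str] = None               # product title or description
    table: Optional[Dict[str, str]] = None   # property name -> property value
    video_path: Optional[str] = None         # path to the product video
    audio_path: Optional[str] = None         # path to the audio track

    def available_modalities(self) -> List[str]:
        """Return the names of the modalities present in this sample."""
        values = {
            "image": self.image_path,
            "text": self.text,
            "table": self.table,
            "video": self.video_path,
            "audio": self.audio_path,
        }
        return [name for name in MODALITIES if values[name] is not None]


# Usage: a sample that carries only an image and a text description.
sample = ProductSample(category="shoes", image_path="a.jpg", text="red shoes")
print(sample.available_modalities())
```

Keeping every modality optional lets a loader built on this sketch represent partially complete samples without a separate class per modality combination.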

License


  • Unknown
