A large scale Chinese multi-modal dialogue corpus (120.84K dialogues and 198.82 K images). MMCHAT contains image-grounded dialogues collected from real conversations on social media. We manually annotate 100K dialogues from MMCHAT with the dialogue quality and whether the dialogues are related to the given image. We provide the rule-filtered raw dialogues that are used to create MMChat (Rule Filtered Raw MMChat). It contains 4.257 M dialogue sessions and 4.874 M images We provide a version of MMChat that is filtered based on LCCC (LCCC Filtered MMChat). This version contain much cleaner dialogues (492.6 K dialogue sessions and 1.066 M images)
3 PAPERS • NO BENCHMARKS YET