Zero-Shot Cross-Lingual Visual Reasoning

2 papers with code • 1 benchmark • 2 datasets


Most implemented papers

Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training

zengyan-97/cclm 1 Jun 2022

To this end, the cross-view language modeling framework treats multi-modal data (i.e., image-caption pairs) and multi-lingual data (i.e., parallel sentence pairs) as two different views of the same object. It trains the model to align the two views by maximizing the mutual information between them, using conditional masked language modeling and contrastive learning.
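The contrastive part of this objective is commonly realized as a symmetric InfoNCE loss, where matched pairs (an image and its caption, or two parallel sentences) sit on the diagonal of a similarity matrix. A minimal sketch, assuming pre-computed embeddings for the two views (the function name and temperature value are illustrative, not taken from the paper's code):

```python
import numpy as np

def info_nce(view_a, view_b, temperature=0.07):
    """Symmetric InfoNCE between two batches of embeddings.

    view_a, view_b: (batch, dim) arrays where row i of each array
    is one view of the same underlying object (e.g. an image and
    its caption, or a sentence and its translation).
    """
    # L2-normalize so the dot product is cosine similarity
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # pairwise similarities

    def cross_entropy(l):
        # targets are the diagonal (the matching pair)
        m = l.max(axis=1, keepdims=True)
        lse = (m + np.log(np.exp(l - m).sum(axis=1, keepdims=True))).squeeze(1)
        return float(np.mean(lse - np.diag(l)))

    # symmetrize: view A retrieving view B, and vice versa
    return (cross_entropy(logits) + cross_entropy(logits.T)) / 2
```

Minimizing this loss is a lower-bound surrogate for maximizing mutual information between the two views, which is why the same loss can align both image-text and cross-lingual text pairs.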

Multilingual Multimodal Learning with Machine Translated Text

danoneata/td-mml 24 Oct 2022

We call this framework TD-MML: Translated Data for Multilingual Multimodal Learning, and it can be applied to any multimodal dataset and model.
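The TD-MML idea, expanding an English multimodal dataset by machine-translating its captions and filtering out low-quality translations before training, can be sketched as a small pipeline. The helper names (`translate`, `score`) and the threshold are hypothetical placeholders for whatever MT system and quality filter are plugged in; they are not the paper's API:

```python
def build_translated_dataset(pairs, target_langs, translate, score,
                             threshold=0.5):
    """Expand (image_id, english_caption) pairs into multiple languages.

    translate(caption, lang) -> translated caption (any MT system)
    score(source, translation) -> quality estimate in [0, 1]
    Translations below `threshold` are dropped, mirroring TD-MML's
    filtering of unreliable machine-translated text.
    """
    expanded = []
    for image_id, caption in pairs:
        expanded.append((image_id, caption, "en"))  # keep the original
        for lang in target_langs:
            translated = translate(caption, lang)
            if score(caption, translated) >= threshold:
                expanded.append((image_id, translated, lang))
    return expanded
```

Because the expansion only touches the caption side of the data, the same procedure applies to any image-text dataset and any multimodal model trained on it.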