Zero-Shot Cross-Lingual Visual Reasoning

2 papers with code • 1 benchmark • 2 datasets


Most implemented papers

Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training

zengyan-97/cclm 1 Jun 2022

To this end, the cross-view language modeling framework treats multi-modal data (i.e., image-caption pairs) and multi-lingual data (i.e., parallel sentence pairs) as two different views of the same object. It trains the model to align the two views by maximizing the mutual information between them, using conditional masked language modeling and contrastive learning.
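The contrastive part of this objective is commonly realized as a symmetric InfoNCE loss, where matched pairs (an image and its caption, or two parallel sentences) sit on the diagonal of a similarity matrix. A minimal sketch, assuming pre-computed embeddings for the two views (the function name and temperature value are illustrative, not taken from the paper's code):

```python
import numpy as np

def info_nce(view_a, view_b, temperature=0.07):
    """Symmetric InfoNCE between two batches of embeddings.

    view_a, view_b: (batch, dim) arrays where row i of each array
    is one view of the same underlying object (e.g. an image and
    its caption, or a sentence and its translation).
    """
    # L2-normalize so the dot product is cosine similarity
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # pairwise similarities

    def cross_entropy(l):
        # targets are the diagonal (the matching pair)
        m = l.max(axis=1, keepdims=True)
        lse = (m + np.log(np.exp(l - m).sum(axis=1, keepdims=True))).squeeze(1)
        return float(np.mean(lse - np.diag(l)))

    # symmetrize: view A retrieving view B, and vice versa
    return (cross_entropy(logits) + cross_entropy(logits.T)) / 2
```

Minimizing this loss is a lower-bound surrogate for maximizing mutual information between the two views, which is why the same loss can align both image-text and cross-lingual text pairs.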

Multilingual Multimodal Learning with Machine Translated Text

danoneata/td-mml 24 Oct 2022

We call this framework TD-MML: Translated Data for Multilingual Multimodal Learning, and it can be applied to any multimodal dataset and model.
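The TD-MML idea, expanding an English multimodal dataset by machine-translating its captions and filtering out low-quality translations before training, can be sketched as a small pipeline. The helper names (`translate`, `score`) and the threshold are hypothetical placeholders for whatever MT system and quality filter are plugged in; they are not the paper's API:

```python
def build_translated_dataset(pairs, target_langs, translate, score,
                             threshold=0.5):
    """Expand (image_id, english_caption) pairs into multiple languages.

    translate(caption, lang) -> translated caption (any MT system)
    score(source, translation) -> quality estimate in [0, 1]
    Translations below `threshold` are dropped, mirroring TD-MML's
    filtering of unreliable machine-translated text.
    """
    expanded = []
    for image_id, caption in pairs:
        expanded.append((image_id, caption, "en"))  # keep the original
        for lang in target_langs:
            translated = translate(caption, lang)
            if score(caption, translated) >= threshold:
                expanded.append((image_id, translated, lang))
    return expanded
```

Because the expansion only touches the caption side of the data, the same procedure applies to any image-text dataset and any multimodal model trained on it.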