2 dataset results for art AND Visual Reasoning AND Texts

…We use the game to collect 3.5K instances, finding that they are intuitive for humans (>90% Jaccard index) but challenging for state-of-the-art AI models, where the best model (ViLT) achieves a score of

4 PAPERS • 2 BENCHMARKS

SMART-101 (Simple Multimodal Algorithmic Reasoning Task Dataset)

…playing Go, generating art, ChatGPT, etc. Such dramatic progress raises the question: how generalizable are neural networks in solving problems that demand broad skills?

2 PAPERS • NO BENCHMARKS YET
