Image-text Classification

6 papers with code • 0 benchmarks • 1 dataset

Benchmarks

These leaderboards are used to track progress in Image-text Classification.

You can find evaluation results in the subtasks. You can also submit evaluation metrics for this task.

Datasets

FewSOL

Subtasks

Multilingual Image-Text Classification

Most implemented papers

Context-Aware Compilation of DNN Training Pipelines across Edge and Cloud

dixiyao/Context-Aware-Compilation-of-DNN-Training-Pipelines-across-Edge-and-Cloud • Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 2021

Experimental results show that our system not only adapts well to, but also draws on the varying contexts, delivering a practical and efficient solution to edge-cloud model training.

GLAMI-1M: A Multilingual Image-Text Fashion Dataset

glami/glami-1m • BMVC 2022

We introduce GLAMI-1M: the largest multilingual image-text classification dataset and benchmark.

DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion

qitianwu/difformer • 23 Jan 2023

Real-world data generation often involves complex inter-dependencies among instances, violating the IID-data hypothesis of standard learning paradigms and posing a challenge for uncovering the geometric structures for learning desired instance representations.

Towards Unifying Medical Vision-and-Language Pre-training via Soft Prompts

zhjohnchan/ptunifier • ICCV 2023

Medical vision-and-language pre-training (Med-VLP) has shown promising improvements on many downstream medical tasks owing to its applicability to extracting generic representations from medical images and texts.

UniS-MMC: Multimodal Classification via Unimodality-supervised Multimodal Contrastive Learning

vincent-zhq/unis-mmc • 16 May 2023

Multimodal learning aims to imitate human beings to acquire complementary information from multiple modalities for various downstream tasks.

GIST: Generating Image-Specific Text for Fine-grained Object Classification

emu1729/gist • 21 Jul 2023

We demonstrate the utility of GIST by fine-tuning vision-language models on the image-and-generated-text pairs to learn an aligned vision-language representation space for improved classification.

