Image Representations

Contrastive Language-Image Pre-training

Introduced by Radford et al. in Learning Transferable Visual Models From Natural Language Supervision

Contrastive Language-Image Pre-training (CLIP) is an efficient method for learning image representations from natural language supervision. It is a simplified version of ConVIRT trained from scratch: an image encoder and a text encoder are trained jointly to predict which images and texts in a batch actually pair up, pulling matched (image, text) embeddings together and pushing mismatched ones apart.
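
The sketch below illustrates the symmetric contrastive objective in PyTorch. It is a minimal, assumed implementation: the random embeddings stand in for encoder outputs (CLIP's actual encoders are a ResNet/ViT and a Transformer), and the fixed temperature value mirrors the paper's initialization but is a learnable parameter in the real model.

```python
# Minimal sketch of CLIP-style symmetric contrastive loss (illustrative, not the official code).
import torch
import torch.nn.functional as F


def clip_contrastive_loss(image_features, text_features, logit_scale):
    # L2-normalize embeddings so dot products are cosine similarities.
    image_features = F.normalize(image_features, dim=-1)
    text_features = F.normalize(text_features, dim=-1)

    # Pairwise similarity matrix between all images and all texts in the batch,
    # scaled by a temperature term.
    logits_per_image = logit_scale * image_features @ text_features.t()
    logits_per_text = logits_per_image.t()

    # The correct text for image i is at index i (and vice versa).
    labels = torch.arange(image_features.size(0), device=image_features.device)

    # Symmetric cross-entropy over image->text and text->image directions.
    loss_i = F.cross_entropy(logits_per_image, labels)
    loss_t = F.cross_entropy(logits_per_text, labels)
    return (loss_i + loss_t) / 2


# Toy usage with random tensors standing in for encoder outputs.
batch, dim = 8, 512
image_emb = torch.randn(batch, dim)
text_emb = torch.randn(batch, dim)
logit_scale = torch.tensor(1 / 0.07)  # fixed here; learnable in CLIP

loss = clip_contrastive_loss(image_emb, text_emb, logit_scale)
print(loss.item())
```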

Source: Learning Transferable Visual Models From Natural Language Supervision
