| TASK | DATASET | MODEL | METRIC NAME | METRIC VALUE | GLOBAL RANK |
|---|---|---|---|---|---|
| Zero-Shot Transfer Image Classification | CN-ImageNet | AltCLIP | Accuracy (Private) | 59.6 | # 2 |
| Zero-Shot Transfer Image Classification | CN-ImageNet-A | AltCLIP | Accuracy (Private) | 58.5 | # 1 |
| Zero-Shot Transfer Image Classification | CN-ImageNet-R | AltCLIP | Accuracy (Private) | 79.9 | # 1 |
| Zero-Shot Transfer Image Classification | CN-ImageNet-Sketch | AltCLIP | Accuracy (Private) | 46.5 | # 1 |
| Zero-Shot Transfer Image Classification | CN-ImageNet V2 | AltCLIP | Accuracy (Private) | 50.9 | # 1 |
| Zero-Shot Cross-Modal Retrieval | Flickr30k | AltCLIP | Image-to-text R@1 | 86 | # 15 |
| Zero-Shot Cross-Modal Retrieval | Flickr30k | AltCLIP | Image-to-text R@5 | 98 | # 16 |
| Zero-Shot Cross-Modal Retrieval | Flickr30k | AltCLIP | Image-to-text R@10 | 99.1 | # 14 |
| Zero-Shot Cross-Modal Retrieval | Flickr30k | AltCLIP | Text-to-image R@1 | 72.5 | # 15 |
| Zero-Shot Cross-Modal Retrieval | Flickr30k | AltCLIP | Text-to-image R@5 | 91.6 | # 14 |
| Zero-Shot Cross-Modal Retrieval | Flickr30k | AltCLIP | Text-to-image R@10 | 95.4 | # 12 |
| Zero-shot Image Retrieval | Flickr30k-CN | AltCLIP | R@1 | 69.8 | # 6 |
| Zero-shot Image Retrieval | Flickr30k-CN | AltCLIP | R@5 | 89.9 | # 7 |
| Zero-shot Image Retrieval | Flickr30k-CN | AltCLIP | R@10 | 94.7 | # 7 |
| Zero-shot Text Retrieval | Flickr30k-CN | AltCLIP | R@1 | 84.8 | # 2 |
| Zero-shot Text Retrieval | Flickr30k-CN | AltCLIP | R@5 | 97.4 | # 3 |
| Zero-shot Text Retrieval | Flickr30k-CN | AltCLIP | R@10 | 98.8 | # 3 |
| Zero-shot Image Retrieval | Flickr30k-CN | AltCLIP(ViT-H/14) | R@1 | 74.5 | # 4 |
| Zero-shot Image Retrieval | Flickr30k-CN | AltCLIP(ViT-H/14) | R@5 | 92.0 | # 4 |
| Zero-shot Image Retrieval | Flickr30k-CN | AltCLIP(ViT-H/14) | R@10 | 95.5 | # 4 |
| Zero-shot Text Retrieval | Flickr30k-CN | Alt-CLIP(ViT-H/14) | R@1 | 88.9 | # 1 |
| Zero-shot Text Retrieval | Flickr30k-CN | Alt-CLIP(ViT-H/14) | R@5 | 98.5 | # 1 |
| Zero-shot Text Retrieval | Flickr30k-CN | Alt-CLIP(ViT-H/14) | R@10 | 99.5 | # 1 |
| Zero-Shot Transfer Image Classification | ImageNet | AltCLIP | Accuracy (Private) | 74.5 | # 19 |
| Zero-Shot Transfer Image Classification | ImageNet-A | AltCLIP | Accuracy (Private) | 69.5 | # 12 |
| Zero-Shot Transfer Image Classification | ImageNet-R | AltCLIP | Accuracy | 87.2 | # 11 |
| Zero-Shot Transfer Image Classification | ImageNet-Sketch | AltCLIP | Accuracy (Private) | 58.7 | # 7 |
| Zero-Shot Transfer Image Classification | ImageNet V2 | AltCLIP | Accuracy (Private) | 68.1 | # 12 |
| Zero-shot Image Retrieval | XTD10 | AltCLIP(M9) | EN-Recall@10 | 95.4 | # 3 |
| Zero-shot Image Retrieval | XTD10 | AltCLIP(M9) | ES-Recall@10 | 94.1 | # 3 |
| Zero-shot Image Retrieval | XTD10 | AltCLIP(M9) | FR-Recall@10 | 92.9 | # 3 |
| Zero-shot Image Retrieval | XTD10 | AltCLIP(M9) | ZH-Recall@10 | 95.1 | # 3 |
| Zero-shot Image Retrieval | XTD10 | AltCLIP(M9) | KO-Recall@10 | 94.4 | # 2 |
| Zero-shot Image Retrieval | XTD10 | AltCLIP(M9) | RU-Recall@10 | 91.8 | # 3 |
| Zero-shot Image Retrieval | XTD10 | AltCLIP(M9) | JA-Recall@10 | 91.7 | # 3 |
| Zero-shot Image Retrieval | XTD10 | AltCLIP(M9) | IT-Recall@10 | 94.2 | # 3 |
| Zero-shot Image Retrieval | XTD10 | M-CLIP(ViT-B32) | EN-Recall@10 | 91.8 | # 4 |
| Zero-shot Image Retrieval | XTD10 | M-CLIP(ViT-B32) | ES-Recall@10 | 89.1 | # 4 |
| Zero-shot Image Retrieval | XTD10 | M-CLIP(ViT-B32) | FR-Recall@10 | 89.4 | # 4 |
| Zero-shot Image Retrieval | XTD10 | M-CLIP(ViT-B32) | ZH-Recall@10 | 89.3 | # 4 |
| Zero-shot Image Retrieval | XTD10 | M-CLIP(ViT-B32) | KO-Recall@10 | 82.1 | # 4 |
| Zero-shot Image Retrieval | XTD10 | M-CLIP(ViT-B32) | RU-Recall@10 | 86.1 | # 4 |
| Zero-shot Image Retrieval | XTD10 | M-CLIP(ViT-B32) | JA-Recall@10 | 81 | # 4 |