TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Object Detection	ELEVATER	GLIP-T	AP	62.6	# 1
Zero-Shot Image Classification	ICinW	CLIP (ViT B-32)	Average Score	56.64	# 1
Zero-Shot Image Classification	ODinW	GLIP (Tiny A)	Average Score	11.4	# 1
Zero-Shot Object Detection	ODinW	GLIP (Tiny A)	Average Score	11.4	# 3
Object Detection	ODinW Full-shot 35 Tasks	GLIP-T	AP	62.6	# 1

ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models

19 Apr 2022 · Chunyuan Li, Haotian Liu, Liunian Harold Li, Pengchuan Zhang, Jyoti Aneja, Jianwei Yang, Ping Jin, Houdong Hu, Zicheng Liu, Yong Jae Lee, Jianfeng Gao

Learning visual representations from natural language supervision has recently shown great promise in a number of pioneering works. In general, these language-augmented visual models demonstrate strong transferability to a variety of datasets and tasks. However, it remains challenging to evaluate the transferability of these models due to the lack of easy-to-use evaluation toolkits and public benchmarks. To tackle this, we build ELEVATER (Evaluation of Language-augmented Visual Task-level Transfer), the first benchmark and toolkit for evaluating (pre-trained) language-augmented visual models. ELEVATER is composed of three components. (i) Datasets. As downstream evaluation suites, it consists of 20 image classification datasets and 35 object detection datasets, each of which is augmented with external knowledge. (ii) Toolkit. An automatic hyper-parameter tuning toolkit is developed to facilitate model evaluation on downstream tasks. (iii) Metrics. A variety of evaluation metrics are used to measure sample-efficiency (zero-shot and few-shot) and parameter-efficiency (linear probing and full model fine-tuning). ELEVATER is a platform for Computer Vision in the Wild (CVinW), and is publicly released at https://computer-vision-in-the-wild.github.io/ELEVATER/

Code

computer-vision-in-the-wild/cvinw_r… official

↳ Quickstart in Spaces

1,018

Computer-Vision-in-the-Wild/Elevate… official

microsoft/GLIP

↳ Quickstart in Colab, Spaces

1,983

microsoft/esvit

403

microsoft/unicl

↳ Quickstart in Spaces

369

See all 8 implementations

Tasks

Fairness

Few-Shot Image Classification

Few-Shot Object Detection

Image Classification

object-detection

Object Detection

Zero-Shot Image Classification

Zero-Shot Object Detection

Datasets

Introduced in the Paper:

ELEVATER

Used in the Paper:

ImageNet

MS COCO

CUB-200-2011

LVIS

AwA

aPY

Results from the Paper

Ranked #1 on Object Detection on ELEVATER

