InternVideo-Ego4D: A Pack of Champion Solutions to Ego4D Challenges

17 Nov 2022 · Guo Chen, Sen Xing, Zhe Chen, Yi Wang, Kunchang Li, Yizhuo Li, Yi Liu, Jiahao Wang, Yin-Dong Zheng, Bingkun Huang, Zhiyu Zhao, Junting Pan, Yifei HUANG, Zun Wang, Jiashuo Yu, Yinan He, Hongjie Zhang, Tong Lu, Yali Wang, LiMin Wang, Yu Qiao

In this report, we present our champion solutions to five tracks of the Ego4D challenge. We apply InternVideo, our video foundation model, to five Ego4D tasks: Moment Queries, Natural Language Queries, Future Hand Prediction, State Change Object Detection, and Short-term Object Interaction Anticipation. InternVideo-Ego4D is an effective paradigm for adapting a strong foundation model to downstream egocentric video understanding tasks with simple head designs. On all five tasks, InternVideo-Ego4D surpasses both the baseline methods and the CVPR 2022 champions, demonstrating the strong representation ability of InternVideo as a video foundation model. Our code will be released at https://github.com/OpenGVLab/ego4d-eccv2022-solutions

Code

opengvlab/ego4d-eccv2022-solutions official

jonnys1226/ego4d_asl

Tasks

Future Hand Prediction

Moment Queries

Natural Language Queries

Object Detection

Short-term Object Interaction Anticipation

State Change Object Detection

Video Understanding

Datasets

Ego4D

Results from the Paper

Ranked #1 on State Change Object Detection on Ego4D

Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Moment Queries	Ego4D	InternVideo	Avg mAP (0.1-0.5)	23.59	#2
Moment Queries	Ego4D	InternVideo	Recall	41.13	#3
State Change Object Detection	Ego4D	InternVideo	AP	37.19	#1
State Change Object Detection	Ego4D	InternVideo	AP50	55.97	#1
State Change Object Detection	Ego4D	InternVideo	AP75	38.44	#1
Short-term Object Interaction Anticipation	Ego4D	InternVideo	Overall (Top5 mAP)	3.4	#2
Short-term Object Interaction Anticipation	Ego4D	InternVideo	Noun (Top5 mAP)	24.6	#1
Short-term Object Interaction Anticipation	Ego4D	InternVideo	Noun+Verb (Top5 mAP)	9.18	#2
Short-term Object Interaction Anticipation	Ego4D	InternVideo	Noun+TTC (Top5 mAP)	7.64	#1
Future Hand Prediction	Ego4D	InternVideo	M.Disp (Left)	43.25	#1
Future Hand Prediction	Ego4D	InternVideo	C.Disp (Left)	53.33	#1
Future Hand Prediction	Ego4D	InternVideo	M.Disp (Right)	46.25	#1
Future Hand Prediction	Ego4D	InternVideo	C.Disp (Right)	53.37	#1
Future Hand Prediction	Ego4D	InternVideo	Disp (Total)	196.8	#1
Natural Language Queries	Ego4D	InternVideo	R@1 IoU=0.3	16.45	#1
Natural Language Queries	Ego4D	InternVideo	R@5 IoU=0.3	22.95	#3
Natural Language Queries	Ego4D	InternVideo	R@1 IoU=0.5	10.06	#1
Natural Language Queries	Ego4D	InternVideo	R@5 IoU=0.5	16.10	#3
Natural Language Queries	Ego4D	InternVideo	R@1 Mean (0.3 and 0.5)	13.26	#1
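
The Natural Language Queries rows report R@k at a temporal IoU threshold. As a rough illustration of how such numbers are usually computed for temporal grounding, the sketch below counts a query as a hit when any of its top-k predicted windows overlaps the ground-truth window with IoU at or above the threshold. This is an assumption about the standard definition, not the official Ego4D evaluation code; the function names and the toy windows are hypothetical. The last lines check that the reported "R@1 Mean (0.3 and 0.5)" value is consistent, up to rounding, with averaging the two R@1 rows.

```python
from typing import Sequence, Tuple

Window = Tuple[float, float]  # (start_sec, end_sec); hypothetical representation


def temporal_iou(pred: Window, gt: Window) -> float:
    """Intersection over union of two 1-D time windows."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_k(
    ranked_preds: Sequence[Sequence[Window]],
    gts: Sequence[Window],
    k: int,
    iou_thr: float,
) -> float:
    """Percentage of queries with at least one top-k window at IoU >= iou_thr."""
    hits = sum(
        any(temporal_iou(p, gt) >= iou_thr for p in preds[:k])
        for preds, gt in zip(ranked_preds, gts)
    )
    return 100.0 * hits / len(gts)


# Toy data (made up for illustration): two queries, predictions sorted by score.
toy_preds = [
    [(0.0, 10.0), (20.0, 30.0)],  # query 1: top-1 overlaps the ground truth
    [(5.0, 6.0), (40.0, 50.0)],   # query 2: only the second window overlaps
]
toy_gts = [(2.0, 10.0), (42.0, 50.0)]

r1 = recall_at_k(toy_preds, toy_gts, k=1, iou_thr=0.3)
r5 = recall_at_k(toy_preds, toy_gts, k=5, iou_thr=0.3)

# Values copied from the table above: R@1 at IoU=0.3 and at IoU=0.5.
reported_mean = (16.45 + 10.06) / 2
```

On the toy data only the first query is matched at k=1, while both are matched once the second-ranked window is allowed, which is why R@5 can never be lower than R@1 at the same threshold.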

Methods

InternVideo
