Hear Me Out: Fusional Approaches for Audio Augmented Temporal Action Localization

27 Jun 2021 · Anurag Bagchi, Jazib Mahmood, Dolton Fernandes, Ravi Kiran Sarvadevabhatla

State-of-the-art architectures for untrimmed video Temporal Action Localization (TAL) have only considered RGB and Flow modalities, leaving the information-rich audio modality entirely unexploited. Audio fusion has been explored for the related but arguably easier problem of trimmed (clip-level) action recognition; TAL, however, poses a unique set of challenges. In this paper, we propose simple but effective fusion-based approaches for TAL. To the best of our knowledge, our work is the first to jointly consider audio and video modalities for supervised TAL. We experimentally show that our schemes consistently improve performance for state-of-the-art video-only TAL approaches. Specifically, they help achieve new state-of-the-art performance on large-scale benchmark datasets: ActivityNet-1.3 (54.34 mAP@0.5) and THUMOS14 (57.18 mAP@0.5). Our experiments include ablations involving multiple fusion schemes, modality combinations and TAL architectures. Our code, models and associated data are available at https://github.com/skelemoa/tal-hmo.
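To make the idea of audio-visual fusion concrete, here is a minimal sketch of snippet-level feature fusion by concatenation and linear projection. All names and dimensions (`video_dim`, `audio_dim`, the random projection) are illustrative assumptions, not the paper's actual method; in a real TAL pipeline the features would come from pretrained video/audio backbones and the projection would be learned end-to-end.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-snippet features for an untrimmed video
# (dims are illustrative; e.g., an I3D-like video backbone and a
# VGGish-like audio backbone).
T = 100                       # number of temporal snippets
video_dim, audio_dim = 1024, 128
video_feats = rng.standard_normal((T, video_dim))
audio_feats = rng.standard_normal((T, audio_dim))

def concat_fusion(video, audio, out_dim=512):
    """Concatenate temporally aligned video and audio features, then
    project to a common dimension with a (here randomly initialized)
    linear map standing in for a learned layer."""
    fused = np.concatenate([video, audio], axis=1)      # (T, video_dim + audio_dim)
    W = rng.standard_normal((fused.shape[1], out_dim)) / np.sqrt(fused.shape[1])
    return fused @ W                                    # (T, out_dim)

fused_feats = concat_fusion(video_feats, audio_feats)
print(fused_feats.shape)  # (100, 512)
```

The fused per-snippet features can then be fed to any existing video-only TAL head, which is what makes this style of fusion architecture-agnostic.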

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Temporal Action Localization | ActivityNet-1.3 | AVFusion | mAP IOU@0.5 | 54.34 | #10 |
| | | | mAP | 36.82 | #11 |
| | | | mAP IOU@0.75 | 37.66 | #6 |
| | | | mAP IOU@0.95 | 8.93 | #9 |
| Temporal Action Localization | THUMOS'14 | AVFusion | mAP IOU@0.5 | 57.18 | #1 |
| Temporal Action Localization | THUMOS'14 | AVFusion | mAP IOU@0.5 | 57.1 | #13 |
| | | | mAP IOU@0.3 | 70.1 | #12 |
| | | | mAP IOU@0.4 | 64.9 | #14 |
| | | | mAP IOU@0.6 | 45.4 | #17 |
| | | | mAP IOU@0.7 | 28.8 | #21 |
| | | | Avg mAP (0.3:0.7) | 53.3 | #20 |
