How Much Temporal Long-Term Context is Needed for Action Segmentation?

ICCV 2023 · Emad Bahrami, Gianpiero Francesca, Juergen Gall

Modeling long-term context in videos is crucial for many fine-grained tasks including temporal action segmentation. An interesting question that is still open is how much long-term temporal context is needed for optimal performance. While transformers can model the long-term context of a video, this becomes computationally prohibitive for long videos. Recent works on temporal action segmentation thus combine temporal convolutional networks with self-attention that is computed only within a local temporal window. While these approaches show good results, their performance is limited by their inability to capture the full context of a video. In this work, we try to answer how much long-term temporal context is required for temporal action segmentation by introducing a transformer-based model that leverages sparse attention to capture the full context of a video. We compare our model with the current state of the art on three datasets for temporal action segmentation, namely 50Salads, Breakfast, and Assembly101. Our experiments show that modeling the full context of a video is necessary to obtain the best performance for temporal action segmentation.
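
To make the contrast in the abstract concrete, the sketch below compares self-attention restricted to a local temporal window with a strided ("sparse") attention in which every frame attends to frames spread across the entire video. This is a minimal PyTorch sketch under simplifying assumptions (a single attention head, no query/key/value projections, hypothetical function names, and arbitrary window/stride sizes); it is not the authors' LTContext implementation.

```python
import torch
import torch.nn.functional as F


def local_window_attention(x, w):
    """Self-attention restricted to non-overlapping windows of w frames.

    x: (T, D) per-frame features of one video. Context never exceeds w frames.
    """
    T, D = x.shape
    pad = (-T) % w
    x_pad = F.pad(x, (0, 0, 0, pad))                  # pad time axis to a multiple of w
    windows = x_pad.view(-1, w, D)                    # (T_pad / w, w, D)
    attn = torch.softmax(windows @ windows.transpose(1, 2) / D ** 0.5, dim=-1)
    return (attn @ windows).view(-1, D)[:T]           # drop the padded frames


def strided_longterm_attention(x, stride):
    """Sparse attention over the full video: each frame attends to all frames
    that share its offset modulo `stride`, so the receptive field spans the
    whole sequence at strided positions."""
    T, D = x.shape
    pad = (-T) % stride
    x_pad = F.pad(x, (0, 0, 0, pad))
    groups = x_pad.view(-1, stride, D).transpose(0, 1)   # (stride, T_pad / stride, D)
    attn = torch.softmax(groups @ groups.transpose(1, 2) / D ** 0.5, dim=-1)
    return (attn @ groups).transpose(0, 1).reshape(-1, D)[:T]


if __name__ == "__main__":
    feats = torch.randn(3000, 64)                        # e.g. a 3000-frame video
    local = local_window_attention(feats, w=64)          # context limited to 64 frames
    full = strided_longterm_attention(feats, stride=64)  # sparse, but full-video context
    print(local.shape, full.shape)                       # torch.Size([3000, 64]) twice
```

The windowed variant bounds the attention cost by the window length but never sees frames outside its window, while the strided variant keeps a comparable per-frame cost yet covers the full video at sparse positions, which is the kind of full-context modeling the paper argues is needed.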


Results from the Paper


Task                 Dataset      Model      Metric   Value   Global Rank
Action Segmentation  50 Salads    LTContext  F1@10%   89.4    # 4
Action Segmentation  50 Salads    LTContext  Edit     83.2    # 7
Action Segmentation  50 Salads    LTContext  Acc      87.7    # 6
Action Segmentation  50 Salads    LTContext  F1@25%   87.7    # 6
Action Segmentation  50 Salads    LTContext  F1@50%   82.0    # 5
Action Segmentation  Assembly101  LTContext  MoF      41.2    # 1
Action Segmentation  Assembly101  LTContext  F1@10%   33.9    # 1
Action Segmentation  Assembly101  LTContext  F1@25%   30.0    # 1
Action Segmentation  Assembly101  LTContext  F1@50%   22.6    # 1
Action Segmentation  Assembly101  LTContext  Edit     30.4    # 5
Action Segmentation  Breakfast    LTContext  F1@10%   77.6    # 7
Action Segmentation  Breakfast    LTContext  F1@50%   60.1    # 7
Action Segmentation  Breakfast    LTContext  Acc      74.2    # 9
Action Segmentation  Breakfast    LTContext  Edit     77.0    # 7
Action Segmentation  Breakfast    LTContext  F1@25%   72.6    # 7
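
For reference, Acc and MoF both denote frame-wise accuracy (the fraction of correctly labeled frames), Edit is the segmental edit score, and F1@k is the segmental F1 score at an intersection-over-union threshold of k. The sketch below shows one common way to compute frame-wise accuracy and F1@k from per-frame label sequences; the helper names and the greedy matching of predicted to ground-truth segments are simplifying assumptions of this sketch, and the edit score is omitted for brevity.

```python
import numpy as np


def segments(labels):
    """Turn a per-frame label sequence into (label, start, end) segments."""
    segs, start = [], 0
    for t in range(1, len(labels) + 1):
        if t == len(labels) or labels[t] != labels[start]:
            segs.append((labels[start], start, t))
            start = t
    return segs


def frame_accuracy(pred, gt):
    """Acc / MoF: fraction of correctly labeled frames."""
    return (np.asarray(pred) == np.asarray(gt)).mean()


def f1_at_k(pred, gt, k):
    """Segmental F1 at IoU threshold k (e.g. 0.10, 0.25, 0.50)."""
    p_segs, g_segs = segments(list(pred)), segments(list(gt))
    used = [False] * len(g_segs)
    tp = 0
    for label, ps, pe in p_segs:
        best, best_j = 0.0, -1
        for j, (gl, gs, ge) in enumerate(g_segs):
            if gl != label or used[j]:
                continue
            inter = max(0, min(pe, ge) - max(ps, gs))
            union = max(pe, ge) - min(ps, gs)
            if inter / union > best:
                best, best_j = inter / union, j
        if best >= k:                     # prediction matches an unused GT segment
            tp += 1
            used[best_j] = True
    fp, fn = len(p_segs) - tp, len(g_segs) - tp
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0


if __name__ == "__main__":
    gt = ["pour", "pour", "cut", "cut", "cut", "mix"]
    pred = ["pour", "cut", "cut", "cut", "cut", "mix"]
    print(frame_accuracy(pred, gt))       # 0.833...
    print(f1_at_k(pred, gt, 0.50))        # 1.0 (all segments matched at IoU >= 0.5)
```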
