TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Unsupervised Video Object Segmentation	DAVIS 2016 val	D2Conv3D	G	86.0	# 6
Unsupervised Video Object Segmentation	DAVIS 2016 val	D2Conv3D	J	85.5	# 7
Unsupervised Video Object Segmentation	DAVIS 2016 val	D2Conv3D	F	86.5	# 8
Video Instance Segmentation	OVIS validation	D2Conv3D (ResNet-50)	mask AP	15.2	# 40
Video Instance Segmentation	OVIS validation	D2Conv3D (ResNet-50)	AP50	33.8	# 37
Video Instance Segmentation	OVIS validation	D2Conv3D (ResNet-50)	AP75	13.7	# 37
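
The table does not say how G relates to J and F. A minimal check, assuming the usual DAVIS convention that G is the unweighted mean of the region similarity J and the contour accuracy F (that convention is an assumption here, not something stated on this page):

```python
# Values copied from the results table above (DAVIS 2016 val, D2Conv3D).
j_mean = 85.5  # J: region similarity
f_mean = 86.5  # F: contour accuracy

# Assumption: G is the unweighted mean of J and F (DAVIS convention).
g_mean = (j_mean + f_mean) / 2

print(f"G = {g_mean:.1f}")  # prints "G = 86.0"
```

Under that assumption the reported G of 86.0 is consistent with the reported J and F values.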

D2Conv3D: Dynamic Dilated Convolutions for Object Segmentation in Videos

WACV 2021 · Christian Schmidt, Ali Athar, Sabarinath Mahadevan, Bastian Leibe

Despite receiving significant attention from the research community, the task of segmenting and tracking objects in monocular videos still has much room for improvement. At the same time, existing works have demonstrated the efficacy of dilated and deformable convolutions for various image-level segmentation tasks. This gives reason to believe that 3D extensions of such convolutions should also yield performance improvements for video-level segmentation tasks. However, this aspect has not yet been explored thoroughly in the existing literature. In this paper, we propose Dynamic Dilated Convolutions (D2Conv3D): a novel type of convolution which draws inspiration from dilated and deformable convolutions and extends them to the 3D (spatio-temporal) domain. We experimentally show that D2Conv3D improves the performance of multiple 3D CNN architectures across multiple video segmentation benchmarks when used as a drop-in replacement for standard convolutions. We further show that D2Conv3D outperforms trivial extensions of existing dilated and deformable convolutions to 3D. Lastly, we set a new state of the art on the DAVIS 2016 Unsupervised Video Object Segmentation benchmark. Code is made publicly available at https://github.com/Schmiddo/d2conv3d.
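
The abstract describes D2Conv3D only at a high level, so the following is an illustrative sketch rather than the authors' implementation (which lives in the linked repository). It shows where a centered 3D kernel samples its input for a given per-axis dilation, and how that sampling grid changes when the dilation is a predicted value instead of a fixed one. The function names `dilated_offsets_3d` and `effective_extent`, and the example dilation values, are hypothetical.

```python
from itertools import product


def dilated_offsets_3d(kernel_size=(3, 3, 3), dilation=(1, 1, 1)):
    """Return the (t, y, x) sampling offsets of a centered, dilated 3D kernel."""
    axes = []
    for k, d in zip(kernel_size, dilation):
        half = (k - 1) / 2
        # Kernel taps are spaced `d` apart and centered on the output location.
        axes.append([(i - half) * d for i in range(k)])
    return list(product(*axes))


def effective_extent(kernel_size, dilation):
    """Span covered along each axis: d * (k - 1) + 1."""
    return tuple(d * (k - 1) + 1 for k, d in zip(kernel_size, dilation))


# Standard 3x3x3 convolution: 27 taps on the integer grid.
static = dilated_offsets_3d(dilation=(1, 1, 1))

# Hypothetical dynamically predicted dilation for one output location.
# Fractional values give non-integer offsets, so the input would have to be
# interpolated at those positions, as deformable convolutions do.
dynamic = dilated_offsets_3d(dilation=(1, 2.5, 2.5))

print(len(static), static[0], dynamic[-1])
print(effective_extent((3, 3, 3), (1, 2, 2)))
```

The kernel weights and the number of taps stay the same in both cases; only the sampling positions move. That is what makes a drop-in replacement for a standard convolution plausible in this sketch.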

Code

schmiddo/d2conv3d

2023-MindSpore-4/Code5

Tasks

Multi-Object Tracking and Segmentation

Segmentation

Semantic Segmentation

Unsupervised Video Object Segmentation

Video Instance Segmentation

Video Object Segmentation

Video Segmentation

Video Semantic Segmentation

Datasets

KITTI

DAVIS

DAVIS 2016

YouTube-VIS 2019

OVIS

Results from the Paper

Ranked #6 on Unsupervised Video Object Segmentation on DAVIS 2016 val

Methods

3D CNN • Convolution
