TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Thermal Image Segmentation	KP day-night	HAPNet	mIoU	57.6	# 1
Thermal Image Segmentation	MFN Dataset	HAPNet	mIoU	61.5	# 1
Semantic Segmentation	NYU Depth v2	HAPNet	Mean IoU	55.0	# 16
Semantic Segmentation	NYU Depth v2	HAPNet	Mean Accuracy	68.8	# 1
Thermal Image Segmentation	PST900	HAPNet	mIoU	89.0	# 1
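
For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how mean IoU and mean (per-class) accuracy are commonly derived from a confusion matrix. It is not taken from the HAPNet code base: the function names and the toy labels are illustrative assumptions, and skipping classes that never occur is one convention among several, so individual benchmarks may differ. The sketch returns fractions in [0, 1], whereas the leaderboard reports percentages.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average of TP / (TP + FP + FN) over classes that occur at all."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # missed pixels of class c
        fp = sum(cm[r][c] for r in range(n)) - tp   # pixels wrongly labelled c
        denom = tp + fp + fn
        if denom > 0:  # assumption: skip classes absent from both label sets
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


def mean_accuracy(cm: List[List[int]]) -> float:
    """Average of per-class recall TP / (TP + FN) over classes in ground truth."""
    accs = []
    for c in range(len(cm)):
        total = sum(cm[c])
        if total > 0:  # assumption: skip classes absent from the ground truth
            accs.append(cm[c][c] / total)
    return sum(accs) / len(accs) if accs else 0.0


# Toy example with three classes and six flattened "pixels" (made-up labels).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(f"mIoU = {mean_iou(cm):.3f}, mean accuracy = {mean_accuracy(cm):.3f}")
```

Because mean accuracy ignores false positives while IoU penalizes them, a model can rank differently on the two metrics, as the NYU Depth v2 rows show.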

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/hapnet-toward-superior-rgb-thermal-scene/thermal-image-segmentation-on-kp-day-night)](https://paperswithcode.com/sota/thermal-image-segmentation-on-kp-day-night?p=hapnet-toward-superior-rgb-thermal-scene)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/hapnet-toward-superior-rgb-thermal-scene/thermal-image-segmentation-on-mfn-dataset)](https://paperswithcode.com/sota/thermal-image-segmentation-on-mfn-dataset?p=hapnet-toward-superior-rgb-thermal-scene)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/hapnet-toward-superior-rgb-thermal-scene/thermal-image-segmentation-on-pst900)](https://paperswithcode.com/sota/thermal-image-segmentation-on-pst900?p=hapnet-toward-superior-rgb-thermal-scene)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/hapnet-toward-superior-rgb-thermal-scene/semantic-segmentation-on-nyu-depth-v2)](https://paperswithcode.com/sota/semantic-segmentation-on-nyu-depth-v2?p=hapnet-toward-superior-rgb-thermal-scene)`

HAPNet: Toward Superior RGB-Thermal Scene Parsing via Hybrid, Asymmetric, and Progressive Heterogeneous Feature Fusion

4 Apr 2024 · Jiahang Li, Peng Yun, Qijun Chen, Rui Fan

Data-fusion networks have shown significant promise for RGB-thermal scene parsing. However, the majority of existing studies have relied on symmetric duplex encoders for heterogeneous feature extraction and fusion, paying inadequate attention to the inherent differences between RGB and thermal modalities. Recent progress in vision foundation models (VFMs) trained through self-supervision on vast amounts of unlabeled data has proven their ability to extract informative, general-purpose features. However, this potential has yet to be fully leveraged in the domain. In this study, we take one step toward this new research area by exploring a feasible strategy to fully exploit VFM features for RGB-thermal scene parsing. Specifically, we delve deeper into the unique characteristics of RGB and thermal modalities, thereby designing a hybrid, asymmetric encoder that incorporates both a VFM and a convolutional neural network. This design allows for more effective extraction of complementary heterogeneous features, which are subsequently fused in a dual-path, progressive manner. Moreover, we introduce an auxiliary task to further enrich the local semantics of the fused features, thereby improving the overall performance of RGB-thermal scene parsing. Our proposed HAPNet, equipped with all these components, demonstrates superior performance compared to all other state-of-the-art RGB-thermal scene parsing networks, achieving top ranks across three widely used public RGB-thermal scene parsing datasets. We believe this new paradigm has opened up new opportunities for future developments in data-fusion scene parsing approaches.

Code

LiJiahang617/HAPNet official

Tasks

Scene Parsing

Semantic Segmentation

Thermal Image Segmentation

Datasets

NYUv2 MFNet

PST900

Results from the Paper

Ranked #1 on Thermal Image Segmentation on KP day-night
