FOSNet: An End-to-End Trainable Deep Neural Network for Scene Recognition

17 Jul 2019 · Hongje Seong, Junhyuk Hyun, Euntai Kim

Scene recognition is an image recognition problem aimed at predicting the category of the place at which an image was taken. In this paper, a new scene recognition method using a convolutional neural network (CNN) is proposed. The method is based on fusing the object and the scene information in a given image, and the resulting CNN framework is named FOSNet (fusion of object and scene network). In addition, a new loss called the scene coherence loss (SCL) is developed to train FOSNet and improve scene recognition performance. The SCL exploits a unique trait of scenes: 'sceneness' spreads across the image, and the scene class does not change from one part of the image to another. FOSNet was evaluated on the three most popular scene recognition datasets, and state-of-the-art performance was obtained on two of them: 60.14% on Places 2 and 90.37% on MIT Indoor 67. The second-highest performance of 77.28% was obtained on SUN 397.
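The abstract describes two ideas: a two-stream fusion of object and scene features, and a coherence loss that assumes the scene class is constant over the whole image. The sketch below (PyTorch) is a minimal illustration of both ideas, not the authors' reference implementation: the concatenation-based fusion, the per-location 1x1 classifier, and the KL-divergence consistency term between neighbouring spatial predictions are all assumptions made for illustration, and the backbones and loss weight are placeholders.

```python
# Illustrative sketch of (1) object/scene feature fusion and (2) a scene-
# coherence-style loss. Fusion by concatenation and the pairwise KL term are
# assumptions, not the paper's exact FOSNet/SCL formulation.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models


class FOSSketch(nn.Module):
    def __init__(self, num_classes=365):
        super().__init__()
        # Object stream: ImageNet-pretrained backbone (assumption).
        obj = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
        # Scene stream: a second backbone, ideally Places365-pretrained
        # (those weights would be loaded separately; assumption).
        scn = models.resnet50(weights=None)
        self.obj_features = nn.Sequential(*list(obj.children())[:-2])
        self.scn_features = nn.Sequential(*list(scn.children())[:-2])
        # Per-location classifier on the fused map, so every spatial cell
        # produces its own class prediction (used by the coherence term).
        self.classifier = nn.Conv2d(2048 * 2, num_classes, kernel_size=1)

    def forward(self, x):
        fused = torch.cat([self.obj_features(x), self.scn_features(x)], dim=1)
        logits_map = self.classifier(fused)      # (B, C, H, W)
        logits = logits_map.mean(dim=(2, 3))     # image-level prediction
        return logits, logits_map


def scene_coherence_loss(logits_map):
    """Penalize disagreement between neighbouring spatial predictions,
    reflecting the assumption that the scene class does not change over
    the image (illustrative stand-in for the paper's SCL)."""
    log_p = F.log_softmax(logits_map, dim=1)
    p = log_p.exp()
    # KL divergence between horizontally and vertically adjacent cells.
    kl_h = F.kl_div(log_p[..., :, 1:], p[..., :, :-1], reduction="batchmean")
    kl_v = F.kl_div(log_p[..., 1:, :], p[..., :-1, :], reduction="batchmean")
    return kl_h + kl_v


if __name__ == "__main__":
    # Usage: combine standard cross-entropy with the coherence term
    # (the 0.1 weight is an arbitrary placeholder).
    model = FOSSketch(num_classes=365)
    images = torch.randn(2, 3, 224, 224)
    targets = torch.tensor([3, 42])
    logits, logits_map = model(images)
    loss = F.cross_entropy(logits, targets) + 0.1 * scene_coherence_loss(logits_map)
    loss.backward()
```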

Task               Dataset            Model   Metric          Value   Global Rank
Scene Recognition  MIT Indoor Scenes  FOSNet  Accuracy        90.3    #1
Scene Recognition  Places365          FOSNet  Top 1 Accuracy  60.14   #1
Scene Recognition  Places365          FOSNet  Top 5 Accuracy  88.86   #1
Scene Recognition  SUN397             FOSNet  Accuracy        77.28   #1
