Benchmarking and scaling of deep learning models for land cover image classification

The availability of the sheer volume of Copernicus Sentinel-2 imagery has created new opportunities for exploiting deep learning (DL) methods for land use land cover (LULC) image classification. However, an extensive set of benchmark experiments is currently lacking, i.e. DL models tested on the same dataset, with a common and consistent set of metrics, and in the same hardware. In this work, we use the BigEarthNet Sentinel-2 dataset to benchmark for the first time different state-of-the-art DL models for the multi-label, multi-class LULC image classification problem, contributing with an exhaustive zoo of 60 trained models. Our benchmark includes standard CNNs, as well as non-convolutional methods. We put to the test EfficientNets and Wide Residual Networks (WRN) architectures, and leverage classification accuracy, training time and inference rate. Furthermore, we propose to use the EfficientNet framework for the compound scaling of a lightweight WRN. Enhanced with an Efficient Channel Attention mechanism, our scaled lightweight model emerged as the new state-of-the-art. It achieves 4.5% higher averaged F-Score classification accuracy for all 19 LULC classes compared to a standard ResNet50 baseline model, with an order of magnitude less trainable parameters. We provide access to all trained models, along with our code for distributed training on multiple GPU nodes. This model zoo of pre-trained encoders can be used for transfer learning and rapid prototyping in different remote sensing tasks that use Sentinel-2 data, instead of exploiting backbone models trained with data from a different domain, e.g., from ImageNet. We validate their suitability for transfer learning in different datasets of diverse volumes. Our top-performing WRN achieves state-of-the-art performance (71.1% F-Score) on the SEN12MS dataset while being exposed to only a small fraction of the training dataset.

PDF Abstract

Results from the Paper


Task Dataset Model Metric Name Metric Value Global Rank Result Benchmark
Multi-Label Image Classification BigEarthNet MLPMixer FScore 75.2 # 4
official split Yes # 1
Multi-Label Image Classification BigEarthNet ResNet50 FScore 76.8 # 3
official split Yes # 1
Multi-Label Image Classification BigEarthNet ViTM/20 FScore 77.1 # 2
official split Yes # 1
Multi-Label Image Classification BigEarthNet WideResNet-B5-ECA FScore 79 # 1
official split Yes # 1
Multi-Label Image Classification BigEarthNet (official test set) ResNet50 F1 Score 76.8 # 6

Methods