Aerial Scene Classification
12 papers with code • 6 benchmarks • 1 dataset
Most implemented papers
Attention-based Deep Multiple Instance Learning
Multiple instance learning (MIL) is a variation of supervised learning where a single class label is assigned to a bag of instances.
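The attention-pooling idea behind this paper can be sketched in a few lines: each instance embedding in a bag receives a learned weight, and the weighted sum becomes the bag-level representation that the single label supervises. The snippet below is a minimal PyTorch illustration of that mechanism, not the authors' implementation; the embedding and hidden sizes are arbitrary placeholders.

```python
import torch
import torch.nn as nn

class AttentionMILPooling(nn.Module):
    """Attention-based MIL pooling: score each instance, softmax the scores,
    and sum the weighted instance embeddings into one bag embedding.
    Sketch only; embed_dim and hidden_dim are illustrative, not paper values."""
    def __init__(self, embed_dim=512, hidden_dim=128):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, instances):           # instances: (num_instances, embed_dim)
        scores = self.attention(instances)  # (num_instances, 1)
        weights = torch.softmax(scores, dim=0)
        bag_embedding = (weights * instances).sum(dim=0)  # (embed_dim,)
        return bag_embedding, weights

# A bag of 10 instance embeddings collapses to one bag-level vector.
bag = torch.randn(10, 512)
bag_vec, attn = AttentionMILPooling()(bag)
```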
An Empirical Study of Remote Sensing Pretraining
To this end, we train different networks from scratch on MillionAID, currently the largest RS scene recognition dataset, to obtain a series of RS-pretrained backbones, including both convolutional neural networks (CNNs) and vision transformers such as Swin and ViTAE, which have shown promising performance on computer vision tasks.
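As a rough illustration of what training such a backbone from scratch amounts to in practice, the sketch below builds a randomly initialized Swin Transformer with timm and runs one supervised step. It is a hedged stand-in, not the paper's training pipeline: the class count is a placeholder, and ViTAE backbones come from a separate codebase rather than timm.

```python
import timm
import torch

NUM_CLASSES = 51  # placeholder: set to the number of scene categories in your dataset

# Swin backbone with a fresh classification head, initialized from scratch
# (pretrained=False) rather than from ImageNet weights.
model = timm.create_model(
    "swin_tiny_patch4_window7_224",
    pretrained=False,
    num_classes=NUM_CLASSES,
)

# One supervised training step on a dummy batch of RGB scene images.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, NUM_CLASSES, (8,))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss = torch.nn.functional.cross_entropy(model(images), labels)
loss.backward()
optimizer.step()
```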
Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model
Large-scale vision foundation models have made significant progress in visual tasks on natural images, with vision transformers being the primary choice due to their good scalability and representation ability.
MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining
However, transferring the pretrained models to downstream tasks may suffer from task discrepancy, since pretraining is formulated as an image classification or object discrimination task.
AID: A Benchmark Dataset for Performance Evaluation of Aerial Scene Classification
The goal of AID is to advance the state of the art in scene classification of remote sensing images.
Deep-Learning-Based Aerial Image Classification for Emergency Response Applications Using Unmanned Aerial Vehicles
Unmanned Aerial Vehicles (UAVs) equipped with camera sensors can facilitate enhanced situational awareness for many emergency response and disaster management applications, since they are capable of operating in remote and difficult-to-access areas.
A multiple-instance densely-connected ConvNet for aerial scene classification
It regards aerial scene classification as a multiple-instance learning problem so that local semantics can be further investigated.
Local semantic enhanced convnet for aerial scene recognition
Our LSE-Net consists of a context-enhanced convolutional feature extractor, a local semantic perception module, and a classification layer.
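Read as a pipeline, that sentence describes three stages: extract convolutional features, re-weight them with local semantic cues, then classify the pooled descriptor. The sketch below mirrors only that structure with generic placeholder modules; it is an assumption-laden stand-in, not the published LSE-Net.

```python
import torch
import torch.nn as nn

class ThreeStageSceneClassifier(nn.Module):
    """Generic feature extractor -> local semantic re-weighting -> classifier
    pipeline. The internals of LSE-Net's modules are not reproduced here."""
    def __init__(self, num_classes=30):  # placeholder class count
        super().__init__()
        # Stage 1: convolutional feature extractor (placeholder CNN).
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Stage 2: local semantic perception, approximated by a 1x1 conv gate
        # that re-weights local responses before pooling.
        self.local_semantics = nn.Sequential(nn.Conv2d(128, 128, 1), nn.Sigmoid())
        # Stage 3: classification layer on the pooled descriptor.
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, x):
        feats = self.features(x)
        feats = feats * self.local_semantics(feats)  # emphasize local regions
        pooled = feats.mean(dim=(2, 3))              # global average pooling
        return self.classifier(pooled)

logits = ThreeStageSceneClassifier()(torch.randn(2, 3, 224, 224))
```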
All Grains, One Scheme (AGOS): Learning Multi-grain Instance Representation for Aerial Scene Classification
Finally, our SSF allows our framework to learn the same scene scheme from multi-grain instance representations and to fuse them, so that the entire framework is optimized as a whole.
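To make the multi-grain instance idea concrete, the sketch below pools one feature map at several grains, treats each cell as an instance, and fuses the per-grain bag vectors by simple averaging. This is a hypothetical illustration of multi-grain bag representations, not the AGOS SSF module.

```python
import torch
import torch.nn.functional as F

def multi_grain_bag_vectors(feature_map, grains=(1, 2, 4)):
    """Pool a conv feature map at several spatial grains, treat each cell as an
    instance, and return one bag-level vector per grain (mean of instances).
    Illustrative only; AGOS constructs its multi-grain instances differently."""
    vectors = []
    for g in grains:
        cells = F.adaptive_avg_pool2d(feature_map, g)   # (C, g, g)
        instances = cells.flatten(1).t()                # (g*g, C) instances
        vectors.append(instances.mean(dim=0))           # bag vector for this grain
    return torch.stack(vectors)                         # (num_grains, C)

# Fuse the per-grain bag vectors into a single scene representation.
fmap = torch.randn(256, 14, 14)
fused = multi_grain_bag_vectors(fmap).mean(dim=0)       # (256,)
```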