GeoPython 2023

Semi-supervised classification for aerial imagery
2023-03-08, 14:00–15:30, Auditorium

With current semi-supervised learning algorithms, we can classify aerial scenes using only 4, 20, or 40 labeled examples per class and still reach accuracies similar to those obtained by training with many labeled examples. In this workshop, we present the semi-supervised learning algorithms currently available and show how to use existing repositories for scene classification.


In recent years, deep learning (DL) algorithms have gained prominence in the geoscience and remote sensing (RS) community for big data analysis. Likewise, interest in scene classification is growing among researchers as DL is applied to RS. Three families of DL classifiers predominate in scene classification: autoencoders, Convolutional Neural Network (CNN) based models, and Generative Adversarial Network (GAN) based models. Yet CNNs require numerous annotated samples, whether they are used pre-trained, fine-tuned, or trained from scratch.

To tackle this issue, we use semi-supervised learning (SSL) algorithms, which are designed to use both labeled and unlabeled data. The main idea is that a predictor trained on a small yet well-chosen set of samples can perform as well as one trained on a larger set chosen at random. We used two aerial imagery datasets: the UCM dataset (Yang, Y., 2010), which contains 21 classes with 100 images of 256x256 pixels per class, and the AID dataset (Xia, G., 2016), which contains 30 classes with about 200 to 400 images of 600x600 pixels per class.
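As an illustration of how few labels such a setting involves, the following minimal sketch splits a UCM-like label list (21 classes, 100 images each) into a small labeled set and a large unlabeled set. The function name `split_labeled_unlabeled` and the random per-class sampling are assumptions made for this example, not the procedure of any particular repository; in practice the split would be drawn from the training portion only.

```python
import random
from collections import defaultdict


def split_labeled_unlabeled(labels, k_per_class, seed=0):
    """Pick k_per_class labeled indices per class; the rest are unlabeled.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    labeled = []
    for label in sorted(by_class):
        indices = by_class[label]
        if len(indices) < k_per_class:
            raise ValueError(f"class {label} has only {len(indices)} samples")
        # Sample without replacement within the class.
        labeled.extend(rng.sample(indices, k_per_class))

    labeled_set = set(labeled)
    unlabeled = [i for i in range(len(labels)) if i not in labeled_set]
    return sorted(labeled), unlabeled


# UCM-like layout: 21 classes with 100 images each (2100 images in total).
ucm_labels = [c for c in range(21) for _ in range(100)]
labeled_idx, unlabeled_idx = split_labeled_unlabeled(ucm_labels, k_per_class=4)
print(len(labeled_idx), len(unlabeled_idx))  # 84 2016
```

With 4 labels per class, only 84 of the 2100 images carry a label, which is the regime the SSL algorithms are meant to handle.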

For the experiments, we adopted the CVPR 2022 PyTorch implementation of Class-Aware Contrastive Semi-Supervised Learning (CCSSL) (Yang, F., et al., 2022). We adapted the available configuration files to the selected datasets, backbone, augmentations, and other hyperparameters. We ran the following experiments: a fully supervised model with 50% of the training data, and supervised and semi-supervised models (FixMatch, CoMatch, and FixMatch+CCSSL) with 4 / 25 / 40 labeled samples per class, for both UCM and AID.
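To give an idea of how a method such as FixMatch uses unlabeled images, here is a minimal, framework-free sketch of its pseudo-labeling step: the prediction on a weakly augmented view becomes a hard pseudo-label only when its confidence passes a threshold, and the prediction on a strongly augmented view is then penalized with cross-entropy against that pseudo-label. The function names, the toy logits, and the 0.95 threshold are assumptions for illustration; this is not the code of the repository used in the workshop, whose actual settings live in its configuration files.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fixmatch_unlabeled_loss(weak_logits, strong_logits, threshold=0.95):
    """Masked cross-entropy on unlabeled samples (illustrative sketch).

    Returns the loss averaged over the whole batch and the number of
    samples whose pseudo-label passed the confidence threshold.
    """
    if not weak_logits:
        return 0.0, 0

    total, kept = 0.0, 0
    for weak, strong in zip(weak_logits, strong_logits):
        probs = softmax(weak)
        confidence = max(probs)
        if confidence < threshold:
            continue  # low-confidence predictions contribute nothing
        pseudo_label = probs.index(confidence)
        # Cross-entropy of the strong-view prediction against the pseudo-label.
        total += -math.log(softmax(strong)[pseudo_label])
        kept += 1
    return total / len(weak_logits), kept


# Toy batch of two unlabeled images with three classes (made-up logits).
weak = [[8.0, 0.0, 0.0], [1.0, 0.9, 0.8]]    # confident, then uncertain
strong = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
loss, kept = fixmatch_unlabeled_loss(weak, strong)
print(kept)  # 1
```

Only the first image yields a pseudo-label here; the second is ignored until the model becomes confident about it. During training, this term would be added to the usual supervised loss on the labeled samples.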

During the workshop, we’ll go through using git to get the repository, an explanation of how to use the code, setting up the Anaconda environment, and training the algorithm. We’ll conclude the workshop by implementing the best model for scene classification with semi-supervised learning.