Roy, Kaushik and Homayouni, Saeid ORCID: https://orcid.org/0000-0002-0214-5356
(2025).
High-Resolution Urban Land Cover Mapping from Satellite Imagery Using Deep Learning
Institut national de la recherche scientifique, Québec.
Abstract
Accurate mapping of land cover and land use at very high resolution (VHR) is crucial for studying urban development and human-environment interactions. Deep learning techniques, particularly semantic segmentation models, have emerged as powerful tools for this task. However, their widespread application is hindered by the large volume of labeled VHR imagery required for training. Existing studies have mostly used low- to medium-resolution imagery with fewer spectral bands, resulting in limited downstream applicability. To our knowledge, this is the first attempt at studying urban areas in Canada at such spatial resolution using self-supervised deep learning techniques. The objective is to classify VHR multispectral imagery into eight urban land cover categories. The main challenges are preparing analysis-ready data, class imbalance, and a limited amount of labeled data. To address these challenges, we introduce a deep learning framework designed to improve spectral-spatial consistency, to leverage the wealth of available unlabeled data for more effective learning, and to make pre-trained representations easy to apply to downstream tasks. The workflow first performs super-resolution through deep learning pansharpening, then extracts latent features without labels, and finally applies knowledge distillation using a small amount of labeled data. The proposed workflow is applied to WorldView-3 imagery, over 80,000 patches of size 256x256 at 1 m spatial resolution. The methodology was applied to two U-Net variants: a simple U-Net and an attention-gated U-Net with a ResNet-50 encoder. The results show that the simple U-Net, unlike the more complex model, could not adequately capture the complexity of the data, but that self-supervised pre-training improved the overall accuracy (OA) of the prediction in both cases. For the simple U-Net, the OA improved from 69% to 74%, and for the complex U-Net, from 80% to 88%. In conclusion, we demonstrate the effectiveness of multi-view self-supervised semantic segmentation on multispectral VHR images and create a land cover product for future research.
| Document type: | Report |
|---|---|
| Free keywords: | very high resolution; VHR; urban development; human-environment interactions; multispectral imagery; deep learning; semantic segmentation models |
| Centre: | Centre Eau Terre Environnement |
| Date deposited: | 07 Jan 2026 16:41 |
| Last modified: | 07 Jan 2026 16:41 |
| URI: | https://espace.inrs.ca/id/eprint/16731 |