Peatland Degradation Classification
Description
This project consists of a bespoke peatland imagery dataset covering the moorland areas of the Yorkshire Dales. Peatlands are vital and increasingly rare ecosystems in the UK; they support significant biodiversity while providing essential ecosystem services, including carbon storage and water retention to prevent flooding.
This project aims to map peatland quality (and, by extension, the biodiversity and ecosystem services it supports) across the Yorkshire Dales using satellite imagery gathered via Google Earth Engine (GEE). I experimented with various CNN architectures for image classification and found that ResNet-18 delivered the best performance. Larger models may offer further improvements, but ResNet-18 proved the most efficient to run locally.
Methodology Outline
Grid Generation: Created a vector tile grid for the Yorkshire Dales in QGIS.
Data Filtering: Generated a multipolygon covering the extent of Yorkshire Dales peatland using a peaty soils vector dataset. Vector tiles were only retained where peaty soil covered at least 80% of the grid cell.
Imagery Acquisition: Generated a summer composite image of the Yorkshire Dales in GEE by averaging multiple images from Summer 2023, each with less than 10% cloud cover.
Spectral Analysis: Calculated spectral indices for the summer imagery, including NDMI, TCW, and NDVI.
Scoring: Calculated average NDMI, TCW, and NDVI for every grid cell. Each index was then scored using degradation thresholds from established literature.
Labelling: Combined the index scores into an overall score per grid cell, which determined that cell's peatland degradation label.
Dataset Preparation: Extracted tile images with labels in the filenames, then split the data into training, validation, and testing subsets.
Preprocessing: Built pipelines for transforming and normalising imagery for model input.
Model Iteration: Experimented with a variety of CNN architectures to find the optimal classifier.
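The filtering, spectral analysis, scoring, and labelling steps above can be sketched roughly as follows. This is an illustrative outline, not the project's actual code: the function names, the score thresholds, and the three label names are all hypothetical, and the real thresholds come from the literature. NDVI and NDMI use the standard normalised-difference formulas on near-infrared, red, and shortwave-infrared reflectance. TCW is left out because its coefficients are sensor-specific.

```python
from statistics import mean


def normalised_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi(nir, red):
    """Normalised Difference Vegetation Index."""
    return normalised_difference(nir, red)


def ndmi(nir, swir1):
    """Normalised Difference Moisture Index."""
    return normalised_difference(nir, swir1)


def keep_cell(peat_area, cell_area, min_fraction=0.8):
    """Retain a grid cell only if peaty soil covers at least min_fraction of it."""
    return cell_area > 0 and peat_area / cell_area >= min_fraction


def index_score(value, thresholds=(0.2, 0.4)):
    """Score an index as the number of thresholds it meets (0, 1, or 2).

    The threshold values are HYPOTHETICAL placeholders.
    """
    return sum(value >= t for t in thresholds)


def label_cell(pixels):
    """Label one grid cell from its (nir, red, swir1) pixel reflectances.

    The label names and the score-to-label mapping are HYPOTHETICAL.
    """
    mean_ndvi = mean(ndvi(n, r) for n, r, _ in pixels)
    mean_ndmi = mean(ndmi(n, s) for n, _, s in pixels)
    total = index_score(mean_ndvi) + index_score(mean_ndmi)
    if total >= 4:
        return "healthy"
    if total >= 2:
        return "moderate"
    return "degraded"
```

In practice the per-cell means would be computed in GEE over the composite image rather than pixel by pixel in Python; the sketch only shows the arithmetic.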
Features
Dataset: ~3,000 labelled Sentinel-2 images covering the Yorkshire Dales National Park.
Automated Workflow: Functionality for creating the image dataset and pipelines for model input.
Architectures: PyTorch-based CNNs and ResNet-18 models.
Evaluation: Full training, validation, and testing functionality.
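As a rough illustration of the dataset workflow, the sketch below reads a label back out of a tile filename and makes a per-label (stratified) train/validation/test split. The naming scheme `<tile_id>_<label>.png`, the 70/15/15 fractions, and both function names are assumptions made for this example; the project's real filenames and split ratios may differ.

```python
import random
from collections import defaultdict
from pathlib import Path


def label_from_filename(path):
    """Extract the label from a HYPOTHETICAL '<tile_id>_<label>.png' filename."""
    return Path(path).stem.rsplit("_", 1)[1]


def stratified_split(paths, fractions=(0.7, 0.15, 0.15), seed=42):
    """Split paths into train/val/test while keeping label proportions similar.

    fractions holds the (train, val, test) shares; the values are assumptions.
    """
    by_label = defaultdict(list)
    for p in paths:
        by_label[label_from_filename(p)].append(p)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    splits = {"train": [], "val": [], "test": []}
    for label in sorted(by_label):
        items = sorted(by_label[label])
        rng.shuffle(items)
        n_train = round(len(items) * fractions[0])
        n_val = round(len(items) * fractions[1])
        splits["train"] += items[:n_train]
        splits["val"] += items[n_train:n_train + n_val]
        splits["test"] += items[n_train + n_val:]  # remainder goes to test
    return splits
```

Splitting within each label keeps rare degradation classes represented in all three subsets, which matters when the classes are imbalanced.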
Motivations
Data Scarcity: No public dataset currently exists for this specific purpose in this region.
Personal Connection: As an avid cyclist and hiker in the Dales, I have witnessed peatland burning and wildfires firsthand. I wanted to explore the environmental impact on a region I know well.
Skill Development: I aimed to learn the end-to-end process of scraping and building an image classification dataset.
Deep Learning: I wanted to gain proficiency in PyTorch. Unlike higher-level APIs such as Keras, PyTorch requires a deeper understanding of model architectures and training loops, which provided a more robust learning experience.
What Problem Does It Solve?
Monitoring: Identifying patterns and the extent of peatland degradation in the Yorkshire Dales.
Feasibility: Testing the efficacy of CNNs in classifying peatland health using Sentinel-2 imagery.
Lessons Learned
GEE Integration: Efficiently extracting high-resolution images from Google Earth Engine.
Data Challenges: Navigating the difficulties of labelling image datasets. I iterated on the labelling scheme several times to arrive at classes the model could learn reliably, in particular fixing misclassifications caused by my initial choice of variables.
PyTorch Proficiency: Building custom training and validation loops from scratch.
Optimisation: Applying transfer learning and fine-tuning techniques to improve model accuracy.