🌍 Land Use/Land Cover (LULC) Classification using Random Forest

Python | Rasterio | Scikit-Learn | GeoPandas

This project performs a supervised Land Use/Land Cover (LULC) classification using a Random Forest machine learning model applied to multispectral Landsat imagery. Digital Elevation Model (DEM) and slope layers are incorporated as auxiliary predictor variables to improve classification accuracy.

The workflow extracts pixel values at the locations of point-based training samples, builds a classification model, evaluates its performance, and generates a classified LULC raster. This methodology is well suited for applications in:

Remote sensing
Environmental and ecological mapping
Hydrological studies
Urban and regional planning
Agricultural monitoring

✅ Features

Load multispectral imagery (Landsat)
Integrate DEM and slope layers as additional predictors
Extract training data from a point-based shapefile
Train a Random Forest classifier
Perform train/test accuracy assessment
Predict LULC for the full study area
Export a classified GeoTIFF raster
Visualize the final LULC map

📦 Prerequisites

✔ Required Python Libraries

pip install geopandas rasterio numpy pandas scikit-learn matplotlib

📁 Input Data Requirements

All input datasets must follow these conditions:

1. Coordinate Reference System (CRS)

The following must use the same CRS:

Landsat composite raster
DEM raster
Slope raster
Training point shapefile

If the CRSs differ, point coordinates map to the wrong pixels (or fall outside the raster), so the extracted training values are incorrect.
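A quick pre-flight check can catch a CRS mismatch before any pixels are extracted. The sketch below is only an illustration: `find_crs_mismatches` is a hypothetical helper, and the EPSG codes are made-up example values, not the ones used by this project.

```python
def find_crs_mismatches(epsg_by_layer):
    """Return the layers whose EPSG code differs from the first layer's code."""
    items = list(epsg_by_layer.items())
    if not items:
        return {}
    _, reference = items[0]
    return {name: code for name, code in items[1:] if code != reference}


# Hypothetical way to collect the codes with the libraries listed above:
#   import rasterio, geopandas as gpd
#   with rasterio.open(landsat_path) as src:
#       landsat_epsg = src.crs.to_epsg()      # may be None for custom CRSs
#   points_epsg = gpd.read_file(train_shp_path).crs.to_epsg()

# Illustrative values only: the points layer is still in geographic coordinates.
layers = {"landsat": 32643, "dem": 32643, "slope": 32643, "points": 4326}
mismatches = find_crs_mismatches(layers)
print(mismatches)
```

Any layer reported here would need to be reprojected to the Landsat CRS before running the classification.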

2. Spatial Resolution Requirements

DEM and slope layers must be:

Resampled to match the Landsat pixel resolution (~30 m)
Aligned so that all rasters share the same grid structure

3. Spatial Extent

DEM and slope must be:

Clipped to the Landsat extent
Identical to the Landsat raster in row count, column count, and geotransform

This ensures all layers stack correctly.
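The resolution and extent requirements above amount to every raster sharing one grid: the same width, height, and geotransform. A minimal sketch of that comparison follows. `grids_match` and the grid values are hypothetical; with Rasterio, a grid tuple could be built as `(src.width, src.height, tuple(src.transform)[:6])`.

```python
import math


def grids_match(reference, other, tol=1e-6):
    """Compare two grids given as (width, height, (a, b, c, d, e, f)).

    The six numbers are the affine geotransform coefficients in Rasterio order:
    pixel width, row rotation, x origin, column rotation, pixel height, y origin.
    """
    ref_w, ref_h, ref_t = reference
    oth_w, oth_h, oth_t = other
    if (ref_w, ref_h) != (oth_w, oth_h):
        return False
    return all(math.isclose(r, o, abs_tol=tol) for r, o in zip(ref_t, oth_t))


# Illustrative grids: a 30 m Landsat grid and a DEM shifted by half a pixel.
landsat_grid = (4, 3, (30.0, 0.0, 1000.0, 0.0, -30.0, 2000.0))
dem_aligned = (4, 3, (30.0, 0.0, 1000.0, 0.0, -30.0, 2000.0))
dem_shifted = (4, 3, (30.0, 0.0, 1015.0, 0.0, -30.0, 2000.0))

print(grids_match(landsat_grid, dem_aligned))
print(grids_match(landsat_grid, dem_shifted))
```

A layer that fails this check would need to be resampled and clipped to the Landsat grid before stacking.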

🎯 Training Sample Requirements (Point-Based Sampling)

The classification workflow uses point training samples, where each point represents a known land-cover class.

✔ Training Data Format

Shapefile (.shp)
Geometry: POINT
Attribute field:
- Class → integer code representing the land-cover class
CRS identical to all rasters

Example Attribute Table

Point_ID	Class	Geometry
1	1 (Water)	POINT(x, y)
2	3 (Vegetation)	POINT(x, y)
3	5 (Barren)	POINT(x, y)

Each point is used to extract:

Band1, Band2, Band3, Band4, Band5, Band6, Band7, DEM, Slope

Resulting Training Table Format

Class	Band1	Band2	Band3	Band4	Band5	Band6	Band7	DEM	Slope
5 (Barren)	123	98	76	45	23	12	5	255	14
3 (Vegetation)	45	63	88	123	145	110	95	300	9
1 (Water)	10	20	25	30	15	10	8	201	2

This feature matrix is built automatically by the script.
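To make the extraction step concrete, here is a minimal NumPy sketch of how point coordinates can be turned into rows of a feature matrix. It is not the script's actual code: `sample_stack_at_points` is a hypothetical helper, it assumes a north-up (unrotated) geotransform, and the tiny array stands in for the stacked Landsat, DEM, and slope layers. Rasterio's own `src.sample()` or `src.index()` could serve the same purpose.

```python
import numpy as np


def sample_stack_at_points(stack, transform, xs, ys):
    """Extract one feature row per point from a (n_features, rows, cols) stack.

    transform is (a, b, c, d, e, f) in Rasterio order; rotation terms b and d
    must be zero in this sketch. Points outside the raster are skipped.
    """
    a, b, c, d, e, f = transform
    if b != 0 or d != 0:
        raise ValueError("rotated grids are not handled in this sketch")
    cols = np.floor((np.asarray(xs, dtype=float) - c) / a).astype(int)
    rows = np.floor((np.asarray(ys, dtype=float) - f) / e).astype(int)
    n_rows, n_cols = stack.shape[1:]
    inside = (rows >= 0) & (rows < n_rows) & (cols >= 0) & (cols < n_cols)
    features = stack[:, rows[inside], cols[inside]].T  # (n_points, n_features)
    return features, inside


# Illustrative 2-layer, 3x4 stack on a 30 m grid with origin (1000, 2000).
stack = np.arange(2 * 3 * 4).reshape(2, 3, 4)
transform = (30.0, 0.0, 1000.0, 0.0, -30.0, 2000.0)
xs = [1045.0, 900.0]   # second point lies west of the raster
ys = [1985.0, 1985.0]

features, inside = sample_stack_at_points(stack, transform, xs, ys)
print(features)
print(inside)
```

The `inside` mask is returned so that the matching `Class` labels can be filtered the same way, keeping features and labels aligned.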

🧭 Why Use Point Samples?

Avoids mixed-pixel errors along polygon boundaries
Ideal for pixel-based machine learning workflows
Each point maps to exactly one raster pixel, so every label has one unambiguous feature vector
Faster and more memory-efficient than extracting every pixel inside training polygons
Works cleanly with Rasterio and NumPy

📁 Recommended Project Structure

project/
├── data/
│   ├── landsat_composite.tif
│   ├── dem_resampled.tif
│   ├── slope_resampled.tif
│   ├── training_samples.shp
│   └── ...
├── src/
│   └── classify_lulc.py
└── README.md

▶️ Running the Classification Script

1. Update file paths:

landsat_path = '/path/to/landsat.tif'
train_shp_path = '/path/to/training_points.shp'
dem_path = '/path/to/dem_resampled.tif'
slope_path = '/path/to/slope_resampled.tif'

2. Execute the script from the src/ directory:

python classify_lulc.py

3. Outputs generated:

LULC_2016.tif → Classified raster
Accuracy metrics (precision, recall, F1-score)
Matplotlib visualization of classification
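The accuracy metrics listed above are the kind of report scikit-learn produces from a held-out test split. The sketch below shows one way to get them. It runs on synthetic data rather than real training samples, and the split ratio, `n_estimators`, and random seeds are illustrative assumptions, not the script's settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the training table: 3 classes, 100 samples each,
# 9 features (7 bands + DEM + slope). Values are random, not real reflectance.
rng = np.random.default_rng(0)
labels = np.repeat([1, 3, 5], 100)
class_means = np.repeat([10.0, 60.0, 120.0], 100)[:, None]
X = rng.normal(loc=class_means, scale=5.0, size=(300, 9))

# Hold out 30% of the samples, keeping class proportions equal in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.3, stratify=labels, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Per-class precision, recall, and F1-score on the held-out samples.
report = classification_report(y_test, y_pred)
print(report)
```

Stratifying the split matters when some land-cover classes have few training points, since an unstratified split can leave a class out of the test set entirely.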

📊 Output Example

GeoTIFF classified map saved to disk
Displayed map using a color-coded scheme
Accuracy evaluation for model validation
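Producing the classified map means applying the trained model to every pixel, which requires reshaping the raster stack into a table and back. A minimal sketch of that reshaping follows. `predict_raster` is a hypothetical helper, the nodata value of 0 is an assumption, and the threshold model is a stand-in for a fitted classifier with a `predict` method.

```python
import numpy as np


def predict_raster(model, stack, nodata_value=0):
    """Classify a (n_features, rows, cols) stack into a (rows, cols) label map.

    Pixels with any non-finite feature value are skipped and set to nodata_value.
    """
    n_features, rows, cols = stack.shape
    pixels = stack.reshape(n_features, rows * cols).T  # (n_pixels, n_features)
    valid = np.all(np.isfinite(pixels), axis=1)
    labels = np.full(rows * cols, nodata_value, dtype=np.int32)
    if valid.any():
        labels[valid] = model.predict(pixels[valid])
    return labels.reshape(rows, cols)


class ThresholdModel:
    """Toy stand-in for a trained classifier: class 2 above 0.5, else class 1."""

    def predict(self, X):
        return np.where(X[:, 0] > 0.5, 2, 1)


# Illustrative 1-feature, 2x2 raster with one missing pixel.
stack = np.array([[[0.0, 1.0], [np.nan, 1.0]]])
lulc = predict_raster(ThresholdModel(), stack)
print(lulc)
```

The resulting array has the same rows and columns as the input rasters, so it could be written to a GeoTIFF with Rasterio by reusing the Landsat raster's profile with a single band.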

🧠 Why Random Forest?

Random Forest is a strong choice for remote sensing classification because it:

Handles nonlinear relationships between spectral bands and land-cover classes
Works well with high-dimensional data
Is robust to noisy pixels and noisy training labels
Makes no assumptions about the distribution of the input data
Often achieves high accuracy on heterogeneous land-cover data

💡 Potential Extensions

Add vegetation/water indices (NDVI, NDWI, SAVI)
Add terrain-based predictors (TWI, TRI, HAND)
Add GLCM texture metrics
Tune model using GridSearchCV
Generate probability/confidence maps
Add lineament density as a predictor
Perform temporal LULC change analysis
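As an illustration of the first extension, spectral indices such as NDVI and NDWI are normalized differences of two bands and can be appended to the predictor stack as extra layers. The sketch below assumes the bands are already loaded as NumPy arrays. `normalized_difference` is a hypothetical helper, the reflectance values are invented, and which band index holds red, green, or near-infrared depends on the Landsat sensor and on how the composite was built.

```python
import numpy as np


def normalized_difference(band_a, band_b):
    """Compute (a - b) / (a + b), returning NaN where the denominator is zero."""
    a = np.asarray(band_a, dtype="float64")
    b = np.asarray(band_b, dtype="float64")
    total = a + b
    out = np.full(total.shape, np.nan)
    np.divide(a - b, total, out=out, where=total != 0)
    return out


# Invented reflectance values for three pixels.
nir = np.array([0.5, 0.2, 0.0])
red = np.array([0.1, 0.2, 0.0])
green = np.array([0.1, 0.3, 0.0])

ndvi = normalized_difference(nir, red)     # NDVI = (NIR - Red) / (NIR + Red)
ndwi = normalized_difference(green, nir)   # NDWI = (Green - NIR) / (Green + NIR)
print(ndvi)
print(ndwi)
```

Converting to floating point first avoids integer overflow and truncation when the bands are stored as unsigned integers.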

jeevanmp99/LULC-Random-forest