saif8091/yield-multi

Multistage Table Beets yield prediction from unmanned aerial systems


🛠️ Installation

Prerequisites

  • Python 3.11
  • Conda

Steps

  • Clone/download the repository
  • Set up and activate the environment
  • Download the dataset into the project directory

# Clone the repository
git clone git@github.com:saif8091/yield-multi.git
cd yield-multi

# Setting up environment
conda env create -f environment.yml
conda activate yield_multi

Note: Download the data and place it in a directory named data in the project root. See the Project Structure section for the detailed directory layout.
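Before running the pipeline, it may help to confirm that the data directory is in place. The following is a minimal sketch, assuming the layout shown in the Project Structure listing; the helper name `missing_data_entries` and the `EXPECTED` list are illustrative and not part of the repository.

```python
from pathlib import Path

# Entries expected under data/ according to the Project Structure listing.
EXPECTED = [
    "2021_data.xlsx",
    "2022_data.xlsx",
    "all_weather_data.csv",
    "hyper",
    "multi",
    "structure/chm",
    "structure/chm_lidar",
]


def missing_data_entries(project_root="."):
    """Return the expected data/ entries that are absent (hypothetical helper)."""
    data_dir = Path(project_root) / "data"
    return [name for name in EXPECTED if not (data_dir / name).exists()]


if __name__ == "__main__":
    missing = missing_data_entries()
    if missing:
        print("Missing from data/:", ", ".join(missing))
    else:
        print("data/ contains all expected entries.")
```

Run it from the project root; an empty result means the top-level files and folders are where the scripts presumably expect them.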

🔄 Preprocessing and Vegetation extraction

Run the following code:

python make.py

This script zips the images into dictionaries, preprocesses them (hyperspectral imagery, HSI, only), and then extracts the vegetation from the cropped plot images.

HSI Preprocessing

  • Spectral downsampling by averaging every 3 adjacent bands
  • Savitzky-Golay filter
  • Cut off extremities
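The three steps above can be sketched on a single pixel spectrum in plain Python. This is an illustration only, not the repository's implementation: the 5-point quadratic smoothing window, the number of trimmed bands (`n_edge`), the assumption that the extremities are the first and last bands of the spectrum, and all function names are assumptions.

```python
def average_adjacent_bands(spectrum, factor=3):
    """Downsample a spectrum by averaging every `factor` adjacent bands.

    Trailing bands that do not fill a complete group are dropped.
    """
    n_groups = len(spectrum) // factor
    return [
        sum(spectrum[i * factor:(i + 1) * factor]) / factor
        for i in range(n_groups)
    ]


# Standard coefficients of the 5-point quadratic Savitzky-Golay smoother.
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol_smooth(spectrum):
    """Smooth with a 5-point quadratic Savitzky-Golay filter.

    The two bands at each end are left unsmoothed for simplicity.
    """
    out = list(spectrum)
    for i in range(2, len(spectrum) - 2):
        window = spectrum[i - 2:i + 3]
        out[i] = sum(c * v for c, v in zip(SG5, window))
    return out


def trim_extremities(spectrum, n_edge=2):
    """Drop `n_edge` bands from each end of the spectrum (assumed count)."""
    return spectrum[n_edge:len(spectrum) - n_edge]


def preprocess_spectrum(spectrum, factor=3, n_edge=2):
    """Apply the three steps in the order listed above."""
    binned = average_adjacent_bands(spectrum, factor)
    smoothed = savgol_smooth(binned)
    return trim_extremities(smoothed, n_edge)
```

For whole image cubes, an array-based filter such as `scipy.signal.savgol_filter` applied along the spectral axis would be the more usual choice.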

📊 Feature generation

python gen_feat.py

This script generates several types of features for each plot and compiles them into a CSV file, data/preprocessed/features_21_22.csv.

Feature Abbreviations and Definitions

  • gdd: Growing Degree Days
  • evap: Accumulated Evapotranspiration
  • vol: Volume (obtained from structure from motion, SfM)
  • vol_lidar: Volume (obtained from LiDAR)
  • y: Beet root yield ($kg/m^2$)
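As a rough illustration of two of these features, the sketch below accumulates growing degree days from daily temperatures and approximates volume from a canopy height model (CHM). The README does not state how the repository computes either one, so the averaging formula, the base temperature of 10 °C, and both function names are assumptions.

```python
def growing_degree_days(daily_min_max, base_temp=10.0):
    """Accumulate growing degree days from (t_min, t_max) pairs in deg C.

    Uses the common averaging method: max(0, (t_min + t_max) / 2 - base).
    The base temperature is an assumed value, not taken from the repository.
    """
    total = 0.0
    for t_min, t_max in daily_min_max:
        total += max(0.0, (t_min + t_max) / 2.0 - base_temp)
    return total


def canopy_volume(chm_heights, pixel_area):
    """Approximate volume as the sum of canopy heights times pixel area.

    chm_heights: 2D list of heights in metres; negative values are treated
    as zero. pixel_area: ground area of one pixel in square metres.
    """
    return sum(max(0.0, h) for row in chm_heights for h in row) * pixel_area


# Three days of (min, max) temperatures: contributes 5 + 0 + 10 degree days.
gdd = growing_degree_days([(10, 20), (5, 9), (15, 25)])
```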

Naming Conventions

Mean Reflectance Spectra

  • Format: ref_x_mean_n
    • x: h for hyperspectral, m for multispectral
    • n: Index number (0-79 for HSI, corresponding to wavelengths from 400.80 to 928.35 nm)
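To relate an index n to a wavelength, the sketch below assumes that the 80 HSI bands are evenly spaced between 400.80 and 928.35 nm. The README gives only the endpoints, so the even spacing is an assumption and the helper names are hypothetical.

```python
N_BANDS = 80
WL_MIN, WL_MAX = 400.80, 928.35  # nm, endpoints given in the README
STEP = (WL_MAX - WL_MIN) / (N_BANDS - 1)  # assumes evenly spaced bands


def band_wavelength(n):
    """Approximate centre wavelength (nm) of HSI band index n."""
    if not 0 <= n < N_BANDS:
        raise ValueError(f"band index must be in 0..{N_BANDS - 1}, got {n}")
    return WL_MIN + n * STEP


def nearest_band(wavelength_nm):
    """Index of the band closest to a wavelength, clamped to the valid range."""
    n = round((wavelength_nm - WL_MIN) / STEP)
    return min(max(n, 0), N_BANDS - 1)


def hsi_column(n):
    """Column name of the mean hyperspectral reflectance at band n."""
    return f"ref_h_mean_{n}"
```

For example, `hsi_column(nearest_band(670))` gives the column that, under the even-spacing assumption, lies closest to 670 nm.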

Spectral Decomposition

  • Format: xxx_h_n
    • xxx: Decomposition method (pca, fa, ica, plsr)
    • h: Indicates hyperspectral data
    • n: Band number
    • Note: A 3-component decomposition is used because three components explain 95% of the variance.

Vegetation Indices Spectra

  • Format: vi_xxx_h_mean
    • xxx: Vegetation index
    • h: Indicates hyperspectral data; m is used instead for multispectral data
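The naming conventions above can be turned into a small parser for grouping columns of the feature CSV. This is a sketch based only on the formats listed here; the function name, the returned dictionary layout, and the example index name `ndvi` are hypothetical.

```python
import re

# One pattern per naming convention described above.
PATTERNS = [
    ("reflectance",
     re.compile(r"^ref_(?P<sensor>[hm])_mean_(?P<index>\d+)$")),
    ("decomposition",
     re.compile(r"^(?P<method>pca|fa|ica|plsr)_(?P<sensor>h)_(?P<index>\d+)$")),
    ("vegetation_index",
     re.compile(r"^vi_(?P<name>.+)_(?P<sensor>[hm])_mean$")),
]


def parse_feature(column):
    """Classify a feature column name and extract its parts."""
    for kind, pattern in PATTERNS:
        match = pattern.match(column)
        if match:
            info = {"kind": kind, **match.groupdict()}
            if "index" in info:
                info["index"] = int(info["index"])
            return info
    # Anything else (gdd, evap, vol, vol_lidar, y, ...) is left as-is.
    return {"kind": "other", "name": column}
```

For instance, `parse_feature("pca_h_0")` identifies the first PCA component of the hyperspectral data, while `parse_feature("gdd")` falls through to the `other` group.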

Notes

  • Both spectral decomposition and vegetation indices are calculated from the mean plot reflectance spectra.
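In other words, the plot spectrum is averaged first and the index is computed once from that mean. A minimal sketch of this order of operations is shown below, using NDVI purely as an example; the README does not list which vegetation indices are computed, and the function names and band positions are assumptions.

```python
def mean_spectrum(pixel_spectra):
    """Average per-pixel spectra (equal-length lists) band by band."""
    n_pixels = len(pixel_spectra)
    return [sum(band) / n_pixels for band in zip(*pixel_spectra)]


def ndvi(spectrum, red_band, nir_band):
    """Normalised difference vegetation index from one spectrum."""
    red, nir = spectrum[red_band], spectrum[nir_band]
    return (nir - red) / (nir + red)


# Two vegetation pixels with two bands each: [red, NIR] (made-up values).
pixels = [[0.1, 0.5], [0.3, 0.7]]
plot_mean = mean_spectrum(pixels)  # averaged first ...
plot_ndvi = ndvi(plot_mean, red_band=0, nir_band=1)  # ... then the index
```

Averaging per-pixel index values instead would generally give a different number, because the index is a nonlinear function of reflectance.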

Feature filtering

python -m feature_filter.filter

Run this code to find the relevant spectral features. The filtered features can be found in feature_filter/filtered_features/.
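The project structure mentions filtering by correlation, by mutual information, and by a combination of both. The sketch below illustrates only the general idea of a correlation filter: rank features by the absolute Pearson correlation with yield and keep those above a threshold. The threshold, the function names, and the selection rule are assumptions and may differ from what feat_filter_cfs.py does.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no linear information
    return cov / sqrt(var_x * var_y)


def filter_by_correlation(features, target, threshold=0.5):
    """Return feature names with |r| >= threshold, strongest first.

    features: dict mapping feature name to a list of per-plot values.
    target: list of per-plot yields in the same order.
    """
    scores = {name: abs(pearson(vals, target)) for name, vals in features.items()}
    kept = [name for name, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda name: scores[name], reverse=True)
```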

🧪 Machine Learning Model Tests

Model schematic:

[Figure: model schematic]

Gaussian Process Regression (GPR)

python -m model_tests.gpr_score_22

Trains GPR models and outputs their scores for different feature combinations on the 2022 dataset. The scores can be found in model_files/model_scores/.

Random Forest (RF)

python -m model_tests.rf_score_22

Trains and evaluates Random Forest models with hyperparameter tuning.

XGBoost

python -m model_tests.xgb_score_22

Trains and evaluates XGBoost models with hyperparameter tuning.

Support Vector Regression (SVR)

python -m model_tests.svr_score_22

Trains and evaluates Support Vector Regression models with hyperparameter tuning.

Partial Least Squares Regression (PLSR)

python -m model_tests.plsr_score_22

Trains and evaluates PLSR models with hyperparameter tuning.
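The five scoring modules above can also be run one after another from the project root. The helper below is a convenience sketch and not part of the repository; it only builds and runs the same `python -m model_tests.<model>_score_22` commands listed above, and the function names are hypothetical.

```python
import subprocess
import sys

# Module prefixes of the scoring scripts listed above.
MODELS = ["gpr", "rf", "xgb", "svr", "plsr"]


def scoring_commands(models=MODELS, python=sys.executable):
    """Build one `python -m model_tests.<model>_score_22` command per model."""
    return [[python, "-m", f"model_tests.{model}_score_22"] for model in models]


def run_all(models=MODELS):
    """Run the scoring modules in order, stopping at the first failure."""
    for cmd in scoring_commands(models):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)


# Call run_all() from the project root with the yield_multi environment active.
```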

Model Transferability Testing

To test transferability to the 2021 test set, run the following code:

python -m model_tests.gpr_testing_21

To visualise the results, open the model_tests/visualising_performance.ipynb notebook.

📈 Model Performance and Interpretation

Open the model_performance.ipynb notebook.

📁 Project Structure

yield-multi/
├── data/                                # Main data directory
│   ├── 2021_data.xlsx                   # Field measurements from 2021 growing season
│   ├── 2022_data.xlsx                   # Field measurements from 2022 growing season
│   ├── all_weather_data.csv             # Meteorological data from weather station
│   ├── hyper/                           # Hyperspectral imagery
│   │   ├── 2021/                        # 2021 growing season data
│   │   │   └── YYYYMMDD/                # Date-organized folders
│   │   │       └── x_YYYYMMDD.tif       # Plot images (x = plot number)
│   │   └── 2022/                        # 2022 growing season data
│   │       └── YYYYMMDD/                # Date-organized folders
│   │           ├── disease_grid_yield_2022/  # LBRN12Disease location
│   │           ├── lovebeets_grid_2022/      # LBRN12EAST location
│   │           └── UV_efficacy_2022/         # LBRN12WEST location
│   │               └── x_YYYYMMDD.tif   # Plot images
│   ├── multi/                           # Multispectral imagery (same structure as hyper/)
│   ├── structure/                       # Canopy height models
│   │   ├── chm/                         # CHMs from structure from motion (same structure as hyper/)
│   │   └── chm_lidar/                   # CHMs from LiDAR (same structure as hyper/)
│   ├── preprocessed/                    # Processed data files
│   │   ├── decomposer/                  # Directory for storing decomposition models
│   │   ├── features_21_22.csv           # Compiled features from plots
│   │   └── various .pkl files           # Pickled data objects
│   └── ReadMe.md                        # Dataset documentation
├── data_load/
│   ├── all_data_load.py                 # Script for loading all types of data
│   ├── gt_data_load.py                  # Script for loading ground truth data
│   └── wt_data_load.py                  # Script for loading weather data
├── feature_filter/
│   ├── filtered_features/               # Directory containing filtered features
│   ├── feat_filter_cfs.py               # Feature filtering through correlation
│   ├── feat_filter_mfs.py               # Feature filtering through mutual information
│   ├── feat_filter_micorfs.py           # Feature filtering using a combination of correlation and mutual information
│   ├── feature_selection.py             # Feature selection function
│   └── filter.py                        # Implementation of feature filtering
├── feature_formation/
│   ├── feat_split_ratio.py              # Contains the ratio for splitting features
│   ├── feat_split.py                    # Script for splitting the dataset into train and test
│   └── hsi_decomposer.py                # Functions for hyperspectral image decomposition
├── figures/                             # Figures and visualizations
│   └── model_schematic.jpg              # Schematic diagram of the model
├── model_files/                         # Model files directory
│   ├── gpr_model_func.py                # GPR model functions
│   ├── rf_model_func.py                 # Random Forest model functions
│   ├── xgb_model_func.py                # XGBoost model functions
│   ├── svr_model_func.py                # SVR model functions
│   ├── plsr_model_func.py               # PLSR model functions
│   ├── mlp_model_func.py                # MLP model functions
│   ├── load_feats.py                    # Feature loading utilities
│   └── model_scores/                    # Directory containing model score outputs
├── model_tests/                         # Model testing scripts
│   ├── gpr_score_22.py                  # GPR model scoring for 2022 data
│   ├── rf_score_22.py                   # Random Forest model scoring for 2022 data
│   ├── xgb_score_22.py                  # XGBoost model scoring for 2022 data
│   ├── svr_score_22.py                  # SVR model scoring for 2022 data
│   ├── plsr_score_22.py                 # PLSR model scoring for 2022 data
│   ├── mlp_score_22.py                  # MLP model scoring for 2022 data
│   ├── gpr_testing_21.py                # GPR model testing for transferability to 2021
│   ├── visualising_performance.ipynb    # Notebook for visualizing model performance
│   └── model_schematic.jpg              # Visual representation of the model
├── preprocess/                          # Preprocessing scripts
│   └── zip_im.py                        # Script for zipping images into dictionaries
├── src/                                 # Source code directory
│   ├── utils.py                         # Utility functions for data processing
│   └── misc.py                          # Miscellaneous helper functions
├── environment.yml                      # Conda environment definition file
├── make.py                              # Script for preprocessing and vegetation extraction
├── gen_feat.py                          # Script for generating features
├── model_performance.ipynb              # Notebook for model performance analysis
└── README.md                            # Project documentation
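Plot images follow the x_YYYYMMDD.tif pattern shown in the tree, so the plot number and acquisition date can be recovered from the file name. The sketch below assumes that x is an integer plot number; the function name and the example path are hypothetical.

```python
import re
from datetime import date, datetime
from pathlib import PurePosixPath

# x_YYYYMMDD.tif, where x is assumed to be an integer plot number.
PLOT_FILE = re.compile(r"^(?P<plot>\d+)_(?P<date>\d{8})\.tif$")


def parse_plot_image(path):
    """Return (plot_number, acquisition_date) for a plot image path."""
    name = PurePosixPath(path).name
    match = PLOT_FILE.match(name)
    if match is None:
        raise ValueError(f"not an x_YYYYMMDD.tif file name: {name}")
    acquired = datetime.strptime(match["date"], "%Y%m%d").date()
    return int(match["plot"]), acquired


# Example with a made-up plot number and flight date.
plot, acquired = parse_plot_image("data/hyper/2021/20210715/12_20210715.tif")
```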
