The "Plant Classifier" project aims to classify agricultural plants into different categories based on their dimensional and shape factors. The dataset contains various features that describe the physical characteristics of the plants, and the goal is to predict the class or category a given plant belongs to.
- Data Description
- Setup and Installation
- Data Preprocessing
- Model Training and Evaluation
- Hyperparameter Tuning
- Results
- Future Work
- Contact
The dataset contains the following features:
- Area
- Perimeter
- MajorAxisLength
- DFactor1 to DFactor9
- ShapeFactor1 to ShapeFactor4
- Class (Target Variable)
The target variable, Class, contains categories like "BA", "BO", "CA", "DE", "HO", "SE", and "SI".
- Clone the repository to your local machine.
- Install the required libraries using `pip install -r requirements.txt`.
- Run the Jupyter Notebook to execute the project.
The data underwent several preprocessing steps:
- Handling missing values
- Feature scaling using `StandardScaler`
- Encoding categorical variables using `LabelEncoder`
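The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy values are invented, only three of the feature columns are shown, and filling missing values with the column median is an assumption, since the imputation strategy is not stated here.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Toy stand-in for the real dataset (values are invented for illustration).
df = pd.DataFrame({
    "Area": [28395.0, 28734.0, np.nan, 30008.0, 30140.0, 30279.0],
    "Perimeter": [610.3, 638.0, 624.1, 645.9, 620.1, np.nan],
    "ShapeFactor1": [0.0073, 0.0070, 0.0072, 0.0070, 0.0067, 0.0069],
    "Class": ["SE", "SE", "BA", "BA", "SI", "SI"],
})

X = df.drop(columns="Class")
y = df["Class"]

# Assumption: missing values are filled with the column median.
X_imputed = SimpleImputer(strategy="median").fit_transform(X)

# Standardize each feature to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X_imputed)

# Map string labels to integers; encoder.classes_ keeps the original names.
encoder = LabelEncoder()
y_encoded = encoder.fit_transform(y)

print(X_scaled.shape)
print(dict(zip(encoder.classes_, encoder.transform(encoder.classes_))))
```

When a train/test split is used, fitting the imputer and scaler on the training portion only and then applying them to the test portion avoids leaking test statistics into training.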
Several classification algorithms were applied to the preprocessed data:
- Random Forest Classifier
- Decision Tree Classifier
- Support Vector Machine Classifier
- k-Nearest Neighbors Classifier
- Gradient Boosting Classifier
Each model's performance was evaluated using accuracy, precision, recall, F1 score, and ROC AUC.
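A comparison of this kind might look like the sketch below. It runs on synthetic data from `make_classification` rather than the project's dataset, and the split ratio, default hyperparameters, macro averaging, and one-vs-rest ROC AUC are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic 7-class data standing in for the plant features (assumption).
X, y = make_classification(n_samples=700, n_features=16, n_informative=8,
                           n_classes=7, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

models = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    # probability=True enables predict_proba, which ROC AUC needs.
    "SVM": SVC(probability=True, random_state=0),
    "k-NN": KNeighborsClassifier(),
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=50,
                                                    random_state=0),
}

results = {}
for name, model in models.items():
    # Scaling lives inside the pipeline so it is fit on training data only.
    clf = make_pipeline(StandardScaler(), model)
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    y_proba = clf.predict_proba(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, average="macro",
                                     zero_division=0),
        "recall": recall_score(y_test, y_pred, average="macro",
                               zero_division=0),
        "f1": f1_score(y_test, y_pred, average="macro", zero_division=0),
        "roc_auc": roc_auc_score(y_test, y_proba, multi_class="ovr"),
    }

for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

Macro averaging weights every class equally, which is one reasonable choice for a multi-class target; weighted averaging is an alternative if the classes are imbalanced.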
Hyperparameter tuning was performed for the ExtraTreesClassifier using GridSearchCV. The best parameters were selected based on cross-validation results.
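A grid search of this kind could be set up as in the sketch below. The parameter grid, the 3-fold cross-validation, and the accuracy scoring are hypothetical choices for illustration, and the data is synthetic; the grid actually searched in the notebook is not listed here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic 7-class data standing in for the plant features (assumption).
X, y = make_classification(n_samples=350, n_features=16, n_informative=8,
                           n_classes=7, random_state=0)

# Hypothetical grid; the values are illustrative, not the project's.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [None, 10],
    "min_samples_split": [2, 5],
}

search = GridSearchCV(
    ExtraTreesClassifier(random_state=0),
    param_grid,
    cv=3,                 # stratified 3-fold CV for classifiers
    scoring="accuracy",
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best CV accuracy:", round(search.best_score_, 3))
```

After fitting, `search.best_estimator_` holds a model refit on all the data passed to `fit` using the best parameter combination.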
The Support Vector Machine (SVM) classifier achieved the highest accuracy among all the models. The confusion matrix and classification report provided detailed insights into the model's performance for each class.
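For reference, a confusion matrix and classification report can be produced along these lines. The sketch uses synthetic data and default SVM settings, so its numbers say nothing about the project's actual results; only the class names are taken from the dataset description.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

class_names = ["BA", "BO", "CA", "DE", "HO", "SE", "SI"]

# Synthetic 7-class data standing in for the plant features (assumption).
X, y = make_classification(n_samples=700, n_features=16, n_informative=8,
                           n_classes=len(class_names), random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

svm = make_pipeline(StandardScaler(), SVC(random_state=0))
svm.fit(X_train, y_train)
y_pred = svm.predict(X_test)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
report = classification_report(y_test, y_pred, target_names=class_names,
                               zero_division=0)

print(cm)
print(report)
```

The diagonal of the matrix counts correct predictions per class, and off-diagonal cells show which classes are confused with each other.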
- Explore other classification algorithms and ensemble methods.
- Implement feature engineering to improve model performance.
- Deploy the model as a web application for real-time plant classification.
For any queries or feedback, please reach out to:
- Email: rk.pani2002@gmail.com