This project uses the UCI Wine dataset to classify wine into one of three cultivars using machine learning techniques. The entire workflow includes data preprocessing, exploratory data analysis (EDA), model training, hyperparameter tuning, and evaluation.
- Source: UCI Machine Learning Repository
- Instances: 178
- Features: 13 chemical attributes of wines
- Target Variable: Wine Class (1, 2, or 3)
- Pairplots were generated for selected features.
- Summary statistics confirmed no missing values or anomalies.
- Class proportions were preserved in the train and test sets by using stratified splits.
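The checks above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes the copy of the Wine data bundled with scikit-learn (`load_wine`) as a stand-in for `wine.data.csv`, and that copy labels the classes 0, 1, 2 rather than 1, 2, 3. The variable names are hypothetical.

```python
from sklearn.datasets import load_wine

# Assumption: scikit-learn's bundled Wine data stands in for wine.data.csv.
# Its target column uses labels 0, 1, 2 instead of 1, 2, 3.
frame = load_wine(as_frame=True).frame

# Summary statistics for the 13 chemical attributes.
summary = frame.drop(columns="target").describe()
print(summary.loc[["mean", "std", "min", "max"]])

# Total number of missing values across all columns.
missing_total = int(frame.isna().sum().sum())
print("missing values:", missing_total)

# Number of samples per class.
class_counts = frame["target"].value_counts().sort_index().to_dict()
print("class counts:", class_counts)
```

The class counts show that the three cultivars are not equally represented, which is why a stratified split is useful here.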
- StandardScaler was used to standardize the features (zero mean, unit variance).
- 80/20 train-test split was used with stratification on class.
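A minimal sketch of these two preprocessing steps, under stated assumptions: the bundled `load_wine` data stands in for `wine.data.csv`, and `random_state=42` is an arbitrary seed chosen for reproducibility, not a value taken from the notebook. The scaler is fitted on the training portion only, so no test-set statistics leak into training.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled Wine data stands in for wine.data.csv.
X, y = load_wine(return_X_y=True)

# 80/20 split, stratified on the class label.
# random_state=42 is an arbitrary, hypothetical seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fit the scaler on the training data only, then apply it to both sets.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)
```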
- Logistic Regression: GridSearchCV tuned the regularization strength C. Best C = 0.01
- SVM (Linear Kernel): Also tuned using GridSearchCV. Best C = 0.01
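The tuning of `C` for both models might look something like the sketch below. The grid of candidate values, the 5-fold cross-validation, the `max_iter` setting, and the seed are all assumptions made for illustration; the notebook may use different ones, so the selected `C` is not guaranteed to match the values reported above. Wrapping the scaler and the model in a pipeline is a design choice made here so that scaling is refitted inside each cross-validation fold.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: bundled Wine data and an arbitrary seed, as a stand-in setup.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Hypothetical grid of regularization strengths to search over.
c_grid = [0.001, 0.01, 0.1, 1, 10, 100]

# (label, estimator, pipeline step name assigned by make_pipeline)
candidates = [
    ("logreg", LogisticRegression(max_iter=1000), "logisticregression"),
    ("svm_linear", SVC(kernel="linear"), "svc"),
]

searches = {}
for label, estimator, step in candidates:
    pipe = make_pipeline(StandardScaler(), estimator)
    search = GridSearchCV(pipe, {f"{step}__C": c_grid}, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    searches[label] = search
    print(label, search.best_params_, round(search.best_score_, 3))
```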
- Test Accuracy: 100% for both models.
- Precision, Recall, and F1-score were all perfect (1.00) across all classes.
- PCA visualization showed clear class separability.
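As a rough sketch of how such results could be produced, the example below fits a logistic regression with C = 0.01, prints per-class precision, recall, and F1, and projects the standardized features onto two principal components. The data source (`load_wine`), the seed, and `max_iter` are assumptions, so the exact scores may differ from those reported above. Plotting is left out; `X_2d` could be passed to any scatter-plot routine, colored by `y`.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: bundled Wine data and an arbitrary seed, as a stand-in setup.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Logistic regression with the reported best C; max_iter is an assumption.
model = make_pipeline(StandardScaler(), LogisticRegression(C=0.01, max_iter=1000))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Accuracy plus per-class precision, recall, and F1-score on the test set.
test_accuracy = accuracy_score(y_test, y_pred)
print("test accuracy:", test_accuracy)
print(classification_report(y_test, y_pred))

# Project the standardized features onto the first two principal components.
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)
print("explained variance ratio:", pca.explained_variance_ratio_)
```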
Both Logistic Regression and the linear SVM achieved perfect classification metrics on the held-out test set. This is likely because the classes are well separated in the original feature space, which is consistent with the clear separation seen in the PCA visualization.
- wine_classification.ipynb: Full code notebook
- wine.data.csv: Dataset used
- README.md: This summary
- requirements.txt: Python package requirements