Potato Belt is an end-to-end Computer Vision project designed to automate the sorting process of potatoes in industrial settings. Using Deep Learning (MobileNetV2), the system classifies potatoes into three categories with up to 99.29% test accuracy:
- Healthy: Fit for consumption.
- Rotten: Fungal defect, must be discarded.
- Green: Contains toxic solanine, strictly prohibited for consumption.
The project includes a Real-time Simulation (Pygame) connected to a Flask API serving the trained model, simulating a conveyor belt sorting mechanism.
- Deep Learning Model: Transfer Learning using MobileNetV2 pre-trained on ImageNet.
- High Accuracy: Achieved 99.29% accuracy, significantly outperforming traditional ML methods.
- Data Augmentation: Handled severe class imbalance (specifically "Green" class) using rotation and flipping.
- Real-Time Simulation: A conveyor belt simulation built with Pygame to visualize the sorting process.
- API Integration: A Python Flask backend processes images and returns predictions to the simulation.
- Language: Python 3.x
- Deep Learning: TensorFlow, Keras (MobileNetV2)
- Computer Vision: OpenCV, Pillow (PIL)
- Backend: Flask (REST API)
- Simulation: Pygame
- Data Analysis: Pandas, NumPy, Matplotlib, Scikit-learn
git clone [https://github.com/algobana/patates_project.git](https://github.com/algobana/patates_project.git)
cd patates_projectpip install -r requirements.txtNote: A trained model file must be provided before running the backend service.
The Flask server loads the trained model and listens for image requests.
cd backend
python app.pyServer will start at http://localhost:5000
Open a new terminal window (keep the backend running) and start the Pygame simulation.
cd simulation
python sim_to_api.pyThe dataset was aggregated from Roboflow Universe and custom-collected samples.
Data augmentation was applied only on the training set to prevent data leakage.
- Total Images: ~3,800 (Balanced via augmentation).
- Classes: Green (~1,600), Healthy (~1,000), Rotten (~1,200).
- Preprocessing:
- Resized to 160x160 pixels.
- Normalized pixel intensities to [-1, 1] (MobileNetV2 standard).
- One-Hot Encoding for target labels.
The dataset includes open-source samples from Roboflow Universe and custom-collected images, used in compliance with their respective licenses.
Three algorithms were benchmarked. The CNN model (MobileNetV2) demonstrated superior performance.
| Algorithm | Accuracy | Precision | Recall | F1-Score | ROC-AUC |
|---|---|---|---|---|---|
| Decision Tree | 0.7665 | 0.7705 | 0.7665 | 0.7674 | 0.77 |
| Naive Bayes | 0.6417 | 0.6607 | 0.6417 | 0.6442 | 0.72 |
| CNN (MobileNetV2) | 0.9929 | 0.9929 | 0.9929 | 0.9929 | 1.00 |
The model shows near-perfect classification with negligible error.
Berk AŞCI
- Date: December 2025
This project is open-source and available under the MIT License.


