Second assignment for the UB course "Introduction to Machine Learning", implementing classification with lazy learning and SVM
Eva Veli, Andras Kasa and Niklas Long Schiefelbein
- PyCharm IDE (Professional or Community Edition)
- Python 3.9 installed on your system
- Open the project `work2` in PyCharm
- Open the terminal in PyCharm (View > Tool Windows > Terminal)
- Optional: Verify that the current location is `work2` by running `pwd`
- Optional: Navigate to `work2` with `cd`
- Create a virtual environment:
    # Windows
    py -3.9 -m venv venv
    # macOS/Linux
    python3.9 -m venv venv
- Activate the virtual environment:
    # Windows
    venv\Scripts\activate
    # macOS/Linux
    source venv/bin/activate
  The terminal prompt should now be prefixed with (venv).
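If the (venv) prefix is hard to spot, the active interpreter can also be checked from Python. The following is a minimal sketch, not part of the project; the helper names `in_virtualenv` and `is_python_39` are hypothetical. It relies on the standard-library fact that `sys.prefix` and `sys.base_prefix` differ inside an environment created with `venv`.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def is_python_39(version_info=sys.version_info) -> bool:
    """Return True when the given version tuple is Python 3.9.x."""
    return tuple(version_info[:2]) == (3, 9)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("running Python 3.9:", is_python_39())
```

Saving this under any file name and running it with `python` in the same terminal should report whether the environment is active.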
With the virtual environment activated:
pip install -r requirements.txt
From here you can jump directly to Run app.py.
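To confirm that the installation succeeded, the installed distributions can be compared against the requirements file. This is a rough sketch under stated assumptions: the helper `missing_requirements` is hypothetical, it only handles simple `name==version` style lines, and it skips comments and pip option lines. It uses `importlib.metadata` from the standard library.

```python
import re
from importlib import metadata
from typing import Iterable, List


def missing_requirements(lines: Iterable[str]) -> List[str]:
    """Return the names of listed packages that are not installed."""
    missing = []
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and pip options such as "-r other.txt".
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        # Keep only the distribution name, dropping version pins,
        # extras, and environment markers.
        name = re.split(r"[<>=!~;\[ ]", line, maxsplit=1)[0]
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

For example, `missing_requirements(open("requirements.txt"))` should return an empty list once every listed package is installed in the active environment.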
With the virtual environment activated:
deactivate
The (venv) prefix in the terminal prompt should now be gone.
For this, follow optional steps 3 and 4 of the Manual Virtual Environment Setup.
# Windows
venv\Scripts\activate
# macOS/Linux
source venv/bin/activate
The terminal prompt should now be prefixed with (venv).
python app.py
The first execution takes longer than usual because the whole project is compiled initially. Once compiled, the program prompts for input: the user must choose whether to run the analysis on the hepatitis or the pen-based dataset. Pressing Enter without typing anything selects the hepatitis dataset by default.
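The default-on-Enter behaviour described above can be sketched as follows. This is an illustration only and not the code in app.py: the function name `resolve_dataset`, the constants, and the exact strings the program accepts are assumptions.

```python
# Dataset names as mentioned in this README; the exact spelling that
# app.py accepts is an assumption.
VALID_DATASETS = ("hepatitis", "pen-based")
DEFAULT_DATASET = "hepatitis"


def resolve_dataset(raw: str) -> str:
    """Map raw user input to a dataset name, defaulting on empty input."""
    choice = raw.strip().lower()
    if not choice:
        # The user just pressed Enter: fall back to the default.
        return DEFAULT_DATASET
    if choice not in VALID_DATASETS:
        raise ValueError(
            f"unknown dataset {raw!r}; expected one of {VALID_DATASETS}"
        )
    return choice
```

In an interactive script this could be called as `resolve_dataset(input("Dataset [hepatitis]: "))`.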
The entire project pipeline then executes, including data preprocessing, the KNN and SVM analyses, the instance reduction techniques, and final report generation. Progress is displayed in the console, but because of the many calculations and the use of multithreading, the output can be hard to follow in real time. It is recommended to use the final reports for evaluation instead. The program is complete once the Nemenyi test report has been generated.
For deeper insights, please refer to the project report.
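For readers unfamiliar with the Nemenyi test mentioned above, the core idea can be sketched in a few lines. This is a generic illustration, not the project's reporting code: the function names and the example scores are made up, and the critical value `q_alpha` must be looked up in a table for the chosen significance level and number of models. Two models are considered significantly different when their average ranks differ by more than the critical difference $q_\alpha \sqrt{k(k+1)/(6N)}$, with $k$ models and $N$ folds or datasets.

```python
import math
from typing import Dict, List


def average_ranks(scores: Dict[str, List[float]]) -> Dict[str, float]:
    """Average rank per model (rank 1 = best), higher score is better.

    scores maps each model name to one score per fold or dataset.
    Tied models share the mean of the ranks they occupy.
    """
    names = list(scores)
    n_blocks = len(scores[names[0]])
    totals = {name: 0.0 for name in names}
    for i in range(n_blocks):
        # Order the models best-first within this fold.
        ordered = sorted(names, key=lambda n: scores[n][i], reverse=True)
        pos = 0
        while pos < len(ordered):
            # Extend 'end' over the run of models tied with ordered[pos].
            end = pos
            while (end + 1 < len(ordered)
                   and scores[ordered[end + 1]][i] == scores[ordered[pos]][i]):
                end += 1
            # Mean of the 1-based ranks pos+1 .. end+1.
            shared_rank = (pos + 1 + end + 1) / 2
            for name in ordered[pos:end + 1]:
                totals[name] += shared_rank
            pos = end + 1
    return {name: totals[name] / n_blocks for name in names}


def critical_difference(q_alpha: float, k: int, n_blocks: int) -> float:
    """Nemenyi critical difference for k models over n_blocks folds."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n_blocks))


# Made-up accuracies for three hypothetical models over three folds.
example = {
    "model_a": [0.9, 0.8, 0.7],
    "model_b": [0.8, 0.8, 0.6],
    "model_c": [0.7, 0.6, 0.5],
}
ranks = average_ranks(example)
# 2.343 is the commonly tabulated critical value for k=3 at alpha=0.05.
cd = critical_difference(q_alpha=2.343, k=3, n_blocks=3)
```

With so few folds the critical difference is large, so real comparisons need many more folds or datasets before differences become significant.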
work2/
├── classifiers/ # SVM and KNN classifiers
├── csv-results/ # Performance metrics and results
├── datasetsCBR/ # Dataset files
├── metrics/ # Performance metric calculations
├── preprocessing/ # Data preprocessing scripts
├── reduction_techniques/ # Instance reduction algorithms
├── reporting/ # Reporting and analysis scripts
├── reports/ # Generated reports
├── venv/ # Virtual environment
├── app.py # Main application script
├── README.md # This file
├── requirements.txt # Dependencies
└── utils.py # Utility functions