Neural Network mucking about.
This is an attempt to build a framework for creating neural networks (NNs) for various purposes.
Create a virtual environment and activate it:
python3 -m venv venv
source venv/bin/activate
Install the requirements into the virtual environment.
pip install -r requirements.txt
Note: if new packages are added, requirements.txt can be updated with:
pip freeze > requirements.txt
Note: see the sample_data.csv entry in [Output files](Output files:) for a Python one-liner to create a random sample file.
- Train with 64 hidden units (default)
python train_model.py --data data/sample_data.csv
- Train with 128 hidden units
python train_model.py --data data/sample_data.csv --hidden-size 128
- Train with custom hyperparameters
python train_model.py \
--data data/sample_data.csv \
--hidden-size 128 \
--num-layers 3 \
--batch-size 64 \
--learning-rate 0.0005
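The flags used in the commands above could be wired up roughly as follows. This is only a sketch and not the actual contents of train_model.py: the flag names and the default of 64 hidden units come from the examples above, while `build_parser` and the other default values are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the training flags shown in the commands above."""
    parser = argparse.ArgumentParser(description="Train a neural network on a CSV file.")
    parser.add_argument("--data", required=True, help="Path to the training CSV")
    # 64 hidden units is the documented default.
    parser.add_argument("--hidden-size", type=int, default=64)
    # The three defaults below are placeholders (assumptions), not the project's values.
    parser.add_argument("--num-layers", type=int, default=2)
    parser.add_argument("--batch-size", type=int, default=32)
    parser.add_argument("--learning-rate", type=float, default=1e-3)
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--data", "data/sample_data.csv", "--hidden-size", "128"]
)
print(args.data, args.hidden_size, args.num_layers)
```

argparse converts dashes in flag names to underscores, so `--hidden-size` is read back as `args.hidden_size`.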
Note: see the test_samples.csv entry in [Output files](Output files:) for a Python one-liner to create a test sample file from the training set.
- Predict a few samples from the training set:
python predict.py --data data/test_samples.csv
- Predict a single random input sample:
python test_predict_row.py
- `config.py` externalizes many of the hyperparameters for a neural network and the various files that get generated.
- `data/class_labels.txt` is a list of the labels for the output classifications. The NN classifies into numeric indices, which are translated to the labels specified in this file, e.g. index 0 = first label in the list, index 1 = second label, etc.
- `data/best_model.pth` and `data/best_model_config.json` are both input and output files. They are created when the model is trained, and the model is used when predictions are made. The JSON file documents the hyperparameters used to create the model during training.
- `data/confusion_matrix*` are various confusion matrix outputs. These give an idea of how well the model performs.
- `data/sample_data.csv` and `data/test_samples.csv` are dummy random sample data created to test the framework.
  - the sample data is created with `python -c "from dataset import generate_random_csv; generate_random_csv('data/sample_data.csv', 1500)"`
  - the test sample is created with `python -c "import pandas as pd; df = pd.read_csv('data/sample_data.csv'); df.tail(5).to_csv('data/test_samples.csv', index=False)"`
- `data/classification_report.txt` records a summary of how well the model performs. It includes the precision, recall, F1-score, and support.
- `data/per_class_metrics.txt` records the accuracy of each classification.
- `data/top_misclassifications.txt` records a deeper look at the top 10 misclassifications.
- `data/best_model.onnx` is the ONNX export of the best model.
  - can be created with `python -c "from utils import export_to_onnx; export_to_onnx('data/best_model')"`
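To make the report terms above concrete, here is a minimal sketch of how precision, recall, F1-score, and support can be derived from a confusion matrix, with class indices translated to labels by list position. It is not the project's own reporting code: `per_class_metrics`, the example labels, and the convention that rows are true classes and columns are predicted classes are all assumptions made for illustration.

```python
from typing import Dict, List


def per_class_metrics(
    confusion: List[List[int]], labels: List[str]
) -> Dict[str, Dict[str, float]]:
    """Compute precision, recall, F1 and support for each class.

    Assumed convention: confusion[i][j] counts samples whose true class
    index is i and whose predicted class index is j.
    """
    metrics = {}
    for idx, label in enumerate(labels):
        true_pos = confusion[idx][idx]
        support = sum(confusion[idx])                    # samples truly in this class
        predicted = sum(row[idx] for row in confusion)   # samples predicted as this class
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": support,
        }
    return metrics


# Hypothetical labels, in the order they would be listed in data/class_labels.txt:
# index 0 maps to "cat", index 1 to "dog", index 2 to "bird".
labels = ["cat", "dog", "bird"]
confusion = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
for label, values in per_class_metrics(confusion, labels).items():
    print(label, values)
```

Classes with no predictions or no true samples get a score of 0.0 here rather than raising a division error, which is one of several reasonable choices.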
C++ code to use the model created and exported to ONNX format.
These are just playground files, tinkering with NNs in various ways.