Description
The Random Forest confusion matrix graph produced by ML_classification.py currently displays an additional set of negative ("0") and positive ("1") labels on the predicted axis (see the attached image).
base2_model_cofunction.csv_mod.txt_RF_CM.pdf
Also, in the RF_results output file, the Mean Balanced Confusion Matrix shows extra columns for another negative and positive class.
Mean Balanced Confusion Matrix:
Class 0 1 0 1
0.0 4542.62 2905.38
1.0 4119.38 3328.62
The expected output should have only two categories on each axis, giving a total of 4 regions in the graph: True Negatives, False Positives, True Positives, and False Negatives.
A subset of the data can be used to run the pipeline and display the graph format.
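As an illustration of the expected 2x2 layout, here is a minimal sketch in plain Python. It rests on an unconfirmed assumption: that the duplicate "0"/"1" columns come from true and predicted labels being stored with different types (for example floats such as 0.0 alongside strings such as "0"). The function names `normalize_label` and `confusion_counts` and the sample labels are hypothetical and are not taken from ML_classification.py.

```python
from collections import Counter


def normalize_label(label):
    """Map 0, 0.0, "0" and "0.0" to 0, and 1, 1.0, "1" and "1.0" to 1."""
    return int(float(label))


def confusion_counts(y_true, y_pred, classes=(0, 1)):
    """Return a matrix of counts: rows are true classes, columns are predicted classes."""
    pairs = Counter(
        (normalize_label(t), normalize_label(p)) for t, p in zip(y_true, y_pred)
    )
    return [[pairs[(t, p)] for p in classes] for t in classes]


# Hypothetical labels with mixed types (assumption: floats vs. strings)
y_true = [0.0, 1.0, 1.0, 0.0, 1.0]
y_pred = ["0", "1", "0", "1", "1"]

# Without normalization, four distinct labels exist instead of two
raw_labels = set(y_true) | set(y_pred)

# With normalization, the matrix has exactly two rows and two columns:
# [[TN, FP], [FN, TP]]
cm = confusion_counts(y_true, y_pred)
print(len(raw_labels), cm)
```

If the type mismatch is indeed the cause, casting both label columns to a single type before building the matrix should restore the four expected regions. If it is not, the sketch still shows the target shape of the output.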
subset_matrixdata_CM_GitIssue.csv
The incorrect graph format can be recreated using the provided subset data and the commands below:
- ML_preprocess.py:
python /ML-Pipeline/ML_preprocess.py -df subset_matrixdata_CM_GitIssue.csv -sep ',' -onehot f
- test_set.py:
python /ML-Pipeline/test_set.py -df subset_matrixdata_CM_GitIssue.csv_mod.txt -sep ',' -type c -p 0.1 -save <test_set_file.txt>
- ML_classification.py:
python /ML-Pipeline/ML_classification.py -df subset_matrixdata_CM_GitIssue.csv_mod.txt -sep ',' -test <test_set_file.txt> -cl_train 1,0 -alg RF -cm t -plots t -n_jobs 8