PreprocessTrainingData.py and runtraining.sh
Two main scripts to generate a trained model
PreprocessTrainingData.py /images /labels /augmented
Generates augmented data which is used for training
runtraining.sh --numiterations 50000 /augmented /trainednet
Performs the actual training (~1h per 1000 iterations of each model on a modern GPU).
PreprocessTrainingData.py
- a folder with sequential training images (.png or .tif files; they can be 8-bit or 16-bit)
- a folder with corresponding labels (8-bit images with 0 for background and 1 for object pixels)
- the output folder where the augmented data is written
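Since the labels must contain only 0 (background) and 1 (object), it can pay off to sanity-check them before a long preprocessing run. A minimal sketch in pure Python (`check_labels` is a hypothetical helper, not part of CDeep3M2; it takes the flattened pixel values of a label image):

```python
def check_labels(pixels):
    """Return True if every pixel value is 0 (background) or 1 (object)."""
    return all(v in (0, 1) for v in pixels)

# A valid label image (flattened pixel values) and an invalid one:
print(check_labels([0, 0, 1, 1, 0]))  # only 0s and 1s -> True
print(check_labels([0, 255, 1]))      # 255 is not a valid label -> False
```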
runtraining.sh
- optional: number of iterations for training (e.g. --numiterations 50000)
- the folder with the augmented data (from PreprocessTrainingData.py)
- output folder where the trained model is written
Augmentations: We group the augmentations into three types
- Primary augmentations (flipping, rotations): do not change the image content and are always performed.
- Secondary augmentations: alter the noise, contrast, and brightness of images
- Tertiary augmentations: alter image and object sizes
Secondary and tertiary augmentations can be controlled in their strength (secondary: -1 to 10 and tertiary: 0 to 10).
Example: PreprocessTrainingData.py /images /labels -1 3 /augmented
This sets the secondary augmentation to -1, which applies a standard denoising built into CDeep3M2, and the tertiary augmentation to 3, which performs moderate resizing operations.
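The documented ranges (secondary: -1 to 10, tertiary: 0 to 10) can be checked before launching a long preprocessing run. A minimal sketch (`valid_strengths` is a made-up helper name, not part of CDeep3M2):

```python
def valid_strengths(secondary, tertiary):
    """Check augmentation strengths against the documented ranges:
    secondary in [-1, 10], tertiary in [0, 10]."""
    return -1 <= secondary <= 10 and 0 <= tertiary <= 10

print(valid_strengths(-1, 3))  # the example settings above -> True
print(valid_strengths(-2, 3))  # secondary out of range -> False
```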
To use multiple training datasets, pass the folders sequentially into PreprocessTrainingData.py, and the last argument is the output path for the augmented dataset.
PreprocessTrainingData.py /images1 /labels1 /images2 /labels2 /augmented
You can define how secondary and tertiary augmentations are performed for each training dataset individually.
E.g.:
PreprocessTrainingData.py /images1 /labels1 -1 0 /images2 /labels2 2 5 /augmented
This applies augmentation settings (secondary: -1, tertiary: 0) to image set 1 and (secondary: 2, tertiary: 5) to image set 2.
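The interleaved command line above (image folder, label folder, optional strength pair, repeated, with the output folder last) could be parsed roughly as follows. This is a hypothetical sketch, not the actual parser in PreprocessTrainingData.py; the per-dataset defaults of (0, 0) are an assumption:

```python
def parse_args(tokens):
    """Split [imgs, lbls, [sec, ter], ..., outdir] into datasets + output folder.
    Strength pairs are optional per dataset; (0, 0) is an assumed default."""
    def is_int(s):
        try:
            int(s)
            return True
        except ValueError:
            return False

    *rest, outdir = tokens
    datasets, i = [], 0
    while i < len(rest):
        images, labels = rest[i], rest[i + 1]
        i += 2
        sec, ter = 0, 0  # assumed defaults when no strength pair is given
        if i + 1 < len(rest) and is_int(rest[i]) and is_int(rest[i + 1]):
            sec, ter = int(rest[i]), int(rest[i + 1])
            i += 2
        datasets.append((images, labels, sec, ter))
    return datasets, outdir

argv = ["/images1", "/labels1", "-1", "0", "/images2", "/labels2", "2", "5", "/augmented"]
print(parse_args(argv))
```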
After the preprocessing, runtraining.sh is used as usual, directly on the augmented folder.
runtraining.sh --numiterations 50000 /augmented /trainednet
ls /trainednet
1fm 3fm 5fm VERSION parallel.jobs readme.txt train_file.txt valid_file.txt
ls /trainednet/1fm
deploy.prototxt label_class_selection.prototxt log solver.prototxt train_val.prototxt trainedmodel
trainedmodel: 1fm_classifer_iter_60000.caffemodel 1fm_classifer_iter_60000.solverstate
Snapshots (usually every 2000 iterations) of the trained model; these are used to perform predictions and to continue training from the corresponding solverstate.
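To resume training, the newest solverstate in the trainedmodel folder has to be located. One way to pick it, as a sketch assuming the `*_iter_<N>.solverstate` naming shown above:

```python
import re

def latest_solverstate(filenames):
    """Return the solverstate filename with the highest iteration count, or None."""
    best, best_iter = None, -1
    for name in filenames:
        m = re.search(r"_iter_(\d+)\.solverstate$", name)
        if m and int(m.group(1)) > best_iter:
            best, best_iter = name, int(m.group(1))
    return best

files = ["1fm_classifer_iter_58000.solverstate",
         "1fm_classifer_iter_60000.caffemodel",
         "1fm_classifer_iter_60000.solverstate"]
print(latest_solverstate(files))  # -> 1fm_classifer_iter_60000.solverstate
```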
log: accuracy.pdf loss.pdf out.log out.log.test out.log.train
Log files tracking the loss and accuracy over the training iterations. To generate accuracy.pdf and loss.pdf, run PlotValidation.py with the log folder as the input argument.
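The loss values can also be extracted from the logs by hand. A sketch assuming Caffe-style log lines of the form `Iteration <N> ..., loss = <value>` (the exact line layout in out.log may differ):

```python
import re

def extract_loss(lines):
    """Collect (iteration, loss) pairs from Caffe-style log lines."""
    pairs = []
    for line in lines:
        m = re.search(r"Iteration (\d+).*?loss = ([0-9.eE+-]+)", line)
        if m:
            pairs.append((int(m.group(1)), float(m.group(2))))
    return pairs

log = [
    "I0101 solver.cpp:218] Iteration 100 (2.5 iter/s), loss = 0.693",
    "I0101 solver.cpp:218] Iteration 200 (2.4 iter/s), loss = 0.512",
]
print(extract_loss(log))  # -> [(100, 0.693), (200, 0.512)]
```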