Dear creators of Spotiflow,
First of all, congratulations on creating this fantastic tool. I have used it many times out of the box and it worked flawlessly.
However, I am currently trying to fine-tune (or train) the model to help me detect synapses (BRP-labeled), and the results are really poor (the model is not learning anything, see below). At this point I have tried so many different things and parameter settings that it is really hard to keep track of everything. Do you have tips on what to try to make the model actually learn? My data is of course very imbalanced (as all spot detection problems would be, I guess). Training images are of varying sizes (typically something like 100x200x200). The data was originally anisotropic ([0.17 0.17 0.5]) but has been resampled to isotropic ([0.17 0.17 0.17]). Could this be the issue?
Training log:
INFO:spotiflow.model.spotiflow:Training config is: SpotiflowTrainingConfig(
batch_size=32
crop_size=64
crop_size_depth=32
early_stopping_patience=0
finetuned_from=synth_3d
flow_loss_f=l1
heatmap_loss_f=bce
loss_levels=None
lr=0.0003
lr_reduce_patience=10
num_epochs=100
num_train_samples=None
optimize_threshold=True
optimizer=adamw
pos_weight=10
smart_crop=True
)
Normalizing images: 100%|██████████| 270/270 [00:16<00:00, 16.03it/s]WARNING:spotiflow.data.spots:Some images are smaller than the crop size ((32, 32, 32)). Will center pad with zeros.
Normalizing images: 100%|██████████| 68/68 [00:04<00:00, 16.81it/s]WARNING:spotiflow.data.spots:Some images are smaller than the crop size ((32, 32, 32)). Will center pad with zeros.
WARNING:spotiflow.model.spotiflow:Deterministic training is currently not supported in 3D mode. Disabling.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
You are using a CUDA device ('NVIDIA GeForce RTX 3080') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:spotiflow.model.trainer:Creating logdir models/3d_custom_individual_neurons and saving training config...
| Name | Type | Params | Mode
------------------------------------------------------
0 | model | Spotiflow | 35.5 M | train
1 | _flow_loss_func | L1Loss | 0 | train
------------------------------------------------------
35.5 M Trainable params
0 Non-trainable params
35.5 M Total params
141.960 Total estimated model params size (MB)
325 Modules in train mode
0 Modules in eval mode
Epoch 0: 100%|██████████| 9/9 [00:02<00:00, 3.65it/s, v_num=0] INFO:spotiflow.model.trainer:Saved best model with val_loss=0.463.
Epoch 1: 100%|██████████| 9/9 [00:04<00:00, 1.96it/s, v_num=0, val_loss=0.463, val_f1=0.110, val_acc=0.101, heatmap_loss=0.304, flow_loss=0.194, train_loss=0.498]INFO:spotiflow.model.trainer:Saved best model with val_loss=0.423.
Epoch 3: 100%|██████████| 9/9 [00:05<00:00, 1.74it/s, v_num=0, val_loss=0.443, val_f1=0.0525, val_acc=0.050, heatmap_loss=0.289, flow_loss=0.180, train_loss=0.469]INFO:spotiflow.model.trainer:Saved best model with val_loss=0.396.
Epoch 9: 100%|██████████| 9/9 [00:03<00:00, 2.62it/s, v_num=0, val_loss=0.398, val_f1=0.170, val_acc=0.168, heatmap_loss=0.256, flow_loss=0.160, train_loss=0.416] INFO:spotiflow.model.trainer:Saved best model with val_loss=0.365.
Epoch 37: 100%|██████████| 9/9 [00:03<00:00, 2.54it/s, v_num=0, val_loss=0.407, val_f1=0.132, val_acc=0.132, heatmap_loss=0.262, flow_loss=0.159, train_loss=0.421] INFO:spotiflow.model.trainer:Saved best model with val_loss=0.343.
Epoch 63: 100%|██████████| 9/9 [00:03<00:00, 2.49it/s, v_num=0, val_loss=0.443, val_f1=0.103, val_acc=0.103, heatmap_loss=0.263, flow_loss=0.162, train_loss=0.425] INFO:spotiflow.model.trainer:Saved best model with val_loss=0.326.
Epoch 99: 100%|██████████| 9/9 [00:09<00:00, 0.99it/s, v_num=0, val_loss=0.395, val_f1=0.191, val_acc=0.191, heatmap_loss=0.259, flow_loss=0.157, train_loss=0.416] `Trainer.fit` stopped: `max_epochs=100` reached.
INFO:spotiflow.model.spotiflow:Will use device: cuda:0
optimizing threshold: 100%|██████████| 11/11 [00:00<00:00, 106.54it/s]
optimizing threshold: 100%|██████████| 11/11 [00:00<00:00, 118.41it/s]INFO:spotiflow.model.spotiflow:Best thresholds: (np.float64(0.42),)
INFO:spotiflow.model.spotiflow:Best F1-score: (np.float64(0.0),)
INFO:spotiflow.model.trainer:Saved last model with optimized thresholds.
INFO:spotiflow.model.spotiflow:Will use device: cuda:0
optimizing threshold: 100%|██████████| 11/11 [00:00<00:00, 138.77it/s]
optimizing threshold: 100%|██████████| 11/11 [00:00<00:00, 148.77it/s]INFO:spotiflow.model.spotiflow:Best thresholds: (np.float64(0.452),)
INFO:spotiflow.model.spotiflow:Best F1-score: (np.float64(0.0),)
INFO:spotiflow.model.trainer:Saved best model with optimized thresholds.
Epoch 99: 100%|██████████| 9/9 [00:16<00:00, 0.56it/s, v_num=0, val_loss=0.395, val_f1=0.191, val_acc=0.191, heatmap_loss=0.259, flow_loss=0.157, train_loss=0.416]
INFO:spotiflow.model.spotiflow:Training finished.
I am looking for any hints on how to improve my training. I can also gladly provide some training examples to give a better impression of the data.
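For context on the resampling question above, here is a minimal sketch of what such an anisotropic-to-isotropic step could look like. It is an illustration under assumptions, not the exact pipeline used for this data: the function name `resample_to_isotropic` is hypothetical, the use of `scipy.ndimage.zoom` with linear interpolation is an assumption, and so is the (z, y, x) axis order with the 0.5 spacing along z. The point it shows is that the spot annotations have to be rescaled together with the image, otherwise the labels no longer sit on the spots.

```python
import numpy as np
from scipy.ndimage import zoom


def resample_to_isotropic(volume, spots_zyx, voxel_size_zyx):
    """Resample a (z, y, x) volume to isotropic voxels and rescale spot coordinates.

    Hypothetical helper for illustration only. The target voxel size is the
    smallest input spacing, so the coarse axis is upsampled.
    """
    voxel_size = np.asarray(voxel_size_zyx, dtype=float)
    factors = voxel_size / voxel_size.min()  # e.g. (0.5, 0.17, 0.17) -> (~2.94, 1, 1)

    # Linear interpolation; the output shape is round(input_shape * factors).
    resampled = zoom(volume, factors, order=1)

    # With the default grid_mode=False, scipy aligns the first and last voxel
    # centres, so the effective coordinate scale is (out - 1) / (in - 1).
    in_shape = np.asarray(volume.shape, dtype=float)
    out_shape = np.asarray(resampled.shape, dtype=float)
    scale = (out_shape - 1.0) / (in_shape - 1.0)

    # Apply the same per-axis scale to the annotations (N x 3 array, z, y, x).
    new_spots = np.asarray(spots_zyx, dtype=float) * scale
    return resampled, new_spots
```

A quick sanity check after such a step is to overlay the rescaled coordinates on the resampled volume and confirm that each annotation still lies on a bright spot.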
Thank you very much and kind regards,
Blaž