Hello, respected researcher.
I've been studying your research work.
Your paper gives all of the training settings except the number of epochs.
Could you tell me how many epochs you trained for?
I have one more question.
How do I fine-tune your model?
To fine-tune the model, I ran the command you specified:
python3 -m torch.distributed.launch \
    --nproc_per_node=8 \
    scripts/train_net.py \
    --config-file "experiments/hcstvg2.yaml" \
    INPUT.RESOLUTION 420 \
    OUTPUT_DIR output/hcstvg2 \
    TENSORBOARD_DIR output/hcstvg2 \
    MODEL.WEIGHT [Pretrained Model Weights]
However, MODEL.WEIGHT does not seem to take effect for fine-tuning.
After some iterations, training paused to evaluate the model on the validation split.
The validation scores were very low:
gt_viou: 0.2623
tiou: 0.2723
viou: 0.0727
viou@0.3: 0.0300
gt_viou@0.3: 0.3900
viou@0.5: 0.0015
gt_viou@0.5: 0.0860
Therefore, I suspect the code did not use the provided weights and instead trained the model entirely from scratch (from randomly initialized weights).
Thank you very much for your time.
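One way to test this suspicion would be to compare the parameter names stored in the checkpoint with the names the model expects. The sketch below is only an illustration under assumptions: `compare_keys` and `strip_prefix` are hypothetical helpers, not part of this repository, and they operate on plain dictionaries. In practice they would be given `model.state_dict()` and the dictionary loaded from the checkpoint file; how that file nests its weights is something I have not verified.

```python
def strip_prefix(key, prefix="module."):
    """Drop the prefix that DistributedDataParallel adds to parameter names."""
    return key[len(prefix):] if key.startswith(prefix) else key


def compare_keys(model_state, ckpt_state):
    """Return (missing, unexpected) parameter names, both sorted.

    missing:    names the model expects but the checkpoint lacks
    unexpected: names in the checkpoint that the model does not have
    """
    model_keys = {strip_prefix(k) for k in model_state}
    ckpt_keys = {strip_prefix(k) for k in ckpt_state}
    return sorted(model_keys - ckpt_keys), sorted(ckpt_keys - model_keys)


# Toy dictionaries standing in for real state dicts (the values are irrelevant here).
model_state = {"encoder.weight": 0, "decoder.weight": 0}
ckpt_state = {"module.encoder.weight": 0, "module.head.weight": 0}

missing, unexpected = compare_keys(model_state, ckpt_state)
print("missing:", missing)
print("unexpected:", unexpected)
```

If both lists came back empty for the real model and checkpoint, the names at least would match, and the low scores would need another explanation.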
The train log is as follows:
2025-11-07 13:04:39,566 Video Grounding INFO: Loading checkpoint from model_zoo/hcstvg2.pth
2025-11-07 13:04:42,963 Video Grounding INFO: Start training
2025-11-07 13:05:32,766 Video Grounding INFO: eta: 10 days, 9:15:35 iter: 10550 / 911610 loss: 10.0899 (10.3950) loss_bbox: 2.3663 (2.5455) loss_giou: 3.2242 (3.1579) loss_sted: 2.6314 (3.1345) loss_conf: 0.7778 (0.7998) loss_actioness: 0.6677 (0.7573) time: 1.1195 (0.9959) data: 0.0115 (0.0545) lr: 0.000300 lr_vis_encoder: 0.000020 lr_text_encoder: 0.000050 lr_temp_decoder: 0.000100 max mem: 52950
2025-11-07 13:06:20,774 Video Grounding INFO: eta: 10 days, 4:47:58 iter: 10600 / 911610 loss: 9.5444 (10.4186) loss_bbox: 2.3800 (2.5877) loss_giou: 3.1670 (3.1760) loss_sted: 2.5954 (3.1175) loss_conf: 0.7750 (0.7916) loss_actioness: 0.7128 (0.7458) time: 1.1276 (0.9781) data: 0.0121 (0.0331) lr: 0.000300 lr_vis_encoder: 0.000020 lr_text_encoder: 0.000050 lr_temp_decoder: 0.000100 max mem: 57450
2025-11-07 13:07:09,509 Video Grounding INFO: eta: 10 days, 4:30:05 iter: 10650 / 911610 loss: 9.9303 (10.1932) loss_bbox: 1.9032 (2.4751) loss_giou: 2.6664 (3.0884) loss_sted: 3.2202 (3.0999) loss_conf: 0.7313 (0.7858) loss_actioness: 0.7098 (0.7439) time: 0.8841 (0.9770) data: 0.0107 (0.0258) lr: 0.000300 lr_vis_encoder: 0.000020 lr_text_encoder: 0.000050 lr_temp_decoder: 0.000100 max mem: 58469
2025-11-07 13:07:59,701 Video Grounding INFO: eta: 10 days, 6:10:10 iter: 10700 / 911610 loss: 9.5725 (10.1094) loss_bbox: 2.0883 (2.4387) loss_giou: 3.0338 (3.0596) loss_sted: 2.5947 (3.0912) loss_conf: 0.7666 (0.7813) loss_actioness: 0.6727 (0.7386) time: 1.1450 (0.9837) data: 0.0115 (0.0222) lr: 0.000300 lr_vis_encoder: 0.000020 lr_text_encoder: 0.000050 lr_temp_decoder: 0.000100 max mem: 58592
2025-11-07 13:08:53,557 Video Grounding INFO: eta: 10 days, 10:49:55 iter: 10750 / 911610 loss: 10.0017 (10.0877) loss_bbox: 2.2094 (2.4483) loss_giou: 3.2749 (3.0776) loss_sted: 2.5815 (3.0434) loss_conf: 0.7737 (0.7837) loss_actioness: 0.7155 (0.7347) time: 1.1630 (1.0024) data: 0.0125 (0.0202) lr: 0.000300 lr_vis_encoder: 0.000020 lr_text_encoder: 0.000050 lr_temp_decoder: 0.000100 max mem: 59675
2025-11-07 13:09:42,115 Video Grounding INFO: eta: 10 days, 9:30:57 iter: 10800 / 911610 loss: 9.6380 (10.0558) loss_bbox: 2.1679 (2.4418) loss_giou: 2.9789 (3.0543) loss_sted: 2.5872 (3.0480) loss_conf: 0.7976 (0.7824) loss_actioness: 0.6423 (0.7293) time: 1.1258 (0.9972) data: 0.0113 (0.0188) lr: 0.000300 lr_vis_encoder: 0.000020 lr_text_encoder: 0.000050 lr_temp_decoder: 0.000100 max mem: 59675
2025-11-07 13:10:33,293 Video Grounding INFO: eta: 10 days, 10:26:45 iter: 10850 / 911610 loss: 10.3820 (10.0376) loss_bbox: 2.4913 (2.4244) loss_giou: 3.2700 (3.0657) loss_sted: 2.7737 (3.0351) loss_conf: 0.8007 (0.7836) loss_actioness: 0.7125 (0.7288) time: 1.0112 (1.0009) data: 0.0115 (0.0178) lr: 0.000300 lr_vis_encoder: 0.000020 lr_text_encoder: 0.000050 lr_temp_decoder: 0.000100 max mem: 59815
2025-11-07 13:11:24,155 Video Grounding INFO: eta: 10 days, 10:56:29 iter: 10900 / 911610 loss: 9.5994 (10.0081) loss_bbox: 2.0103 (2.4021) loss_giou: 3.1463 (3.0632) loss_sted: 2.5920 (3.0301) loss_conf: 0.7915 (0.7831) loss_actioness: 0.7242 (0.7296) time: 1.1069 (1.0030) data: 0.0122 (0.0171) lr: 0.000300 lr_vis_encoder: 0.000020 lr_text_encoder: 0.000050 lr_temp_decoder: 0.000100 max mem: 60400
2025-11-07 13:12:12,421 Video Grounding INFO: eta: 10 days, 9:52:52 iter: 10950 / 911610 loss: 9.9817 (10.0088) loss_bbox: 2.4300 (2.3946) loss_giou: 2.9662 (3.0603) loss_sted: 2.6600 (3.0400) loss_conf: 0.7738 (0.7827) loss_actioness: 0.7340 (0.7311) time: 0.9307 (0.9988) data: 0.0105 (0.0164) lr: 0.000300 lr_vis_encoder: 0.000020 lr_text_encoder: 0.000050 lr_temp_decoder: 0.000100 max mem: 60400
2025-11-07 13:13:00,940 Video Grounding INFO: eta: 10 days, 9:09:23 iter: 11000 / 911610 loss: 10.2663 (10.0345) loss_bbox: 2.1377 (2.3990) loss_giou: 3.1403 (3.0751) loss_sted: 2.6661 (3.0445) loss_conf: 0.7841 (0.7843) loss_actioness: 0.7162 (0.7316) time: 1.0702 (0.9960) data: 0.0120 (0.0159) lr: 0.000300 lr_vis_encoder: 0.000020 lr_text_encoder: 0.000050 lr_temp_decoder: 0.000100 max mem: 60400
2025-11-07 13:13:00,947 Video Grounding INFO: Saving checkpoint to output/hcstvg2/model_011000.pth
2025-11-07 13:13:03,562 Video Grounding INFO: Start validating
2025-11-07 13:13:06,617 Video Grounding INFO: Start evaluation on the val split of HC-STVG dataset
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2000/2000 [49:46<00:00, 1.49s/it]
2025-11-07 14:02:52,705 Video Grounding INFO: Complete the inference on val split of HC-STVG
2025-11-07 14:02:52,713 Video Grounding INFO: ####### Start Calculating the metrics ########
2025-11-07 14:02:55,244 Video Grounding INFO:
gt_viou: 0.2623
tiou: 0.2723
viou: 0.0727
viou@0.3: 0.0300
gt_viou@0.3: 0.3900
viou@0.5: 0.0015
gt_viou@0.5: 0.0860
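To look at the log more systematically, the iteration counter and loss can be pulled out of each line. This is a minimal sketch that assumes the line format shown above; `parse_log_line` is a hypothetical helper of mine, not part of the repository. One thing it makes visible is that the first logged iteration is 10550 rather than a value near 0, which might mean the iteration counter was restored from the checkpoint, although I have not confirmed this in the code.

```python
import re

# Matches the "iter: <current> / <total>" and "loss: <value> (<average>)" fields.
LOG_PATTERN = re.compile(
    r"iter: (?P<iter>\d+) / (?P<total>\d+)\s+"
    r"loss: (?P<loss>[\d.]+) \((?P<avg>[\d.]+)\)"
)


def parse_log_line(line):
    """Return iteration and loss fields from one training log line, or None."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None  # not a training-progress line
    return {
        "iter": int(match.group("iter")),
        "total": int(match.group("total")),
        "loss": float(match.group("loss")),
        "avg_loss": float(match.group("avg")),
    }


# First progress line of the log above, truncated after the loss_bbox field.
sample = (
    "2025-11-07 13:05:32,766 Video Grounding INFO: eta: 10 days, 9:15:35 "
    "iter: 10550 / 911610 loss: 10.0899 (10.3950) loss_bbox: 2.3663 (2.5455)"
)
record = parse_log_line(sample)
print(record)
```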