confusion on field of view and model inference time #8
Open
Description
Hi RolandGao, nice work! I see you ran a lot of experiments on the backbone settings, but I still have some questions after reading your published paper.
- First, you calculate that a field of view (FOV) of 4095 is needed to see the bottom-right pixel when training on Cityscapes (1024x2048), and you verified that the backbone should be exp48 [ (1,1) + (1,2) + 4 * (1,4) + 7 * (1,14) ], with an FOV of 3807. However, I see the same backbone used when training on CamVid (720x960). Why not use a shallower backbone there? I am training on my own dataset with an image resolution of 512x512. Do I need to modify the backbone architecture? Can you give some advice?
- Second, I tested the inference time of RegSeg. Even though the model has low GFLOPs, its speed is no better than other real-time architectures, which I attribute to the split and dilated convolutions. In our application, speed is what we care about, so is there any strategy to improve it?
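
To make the first question concrete, here is a rough sketch of the field-of-view arithmetic I have in mind. It is not the paper's script: `required_fov` and `receptive_field` are hypothetical helper names, the first assumes the 4095 figure comes from 2 * 2048 - 1, and the second uses the standard receptive-field recurrence for a plain chain of convolutions, so it may not reproduce the 3807 reported for exp48.

```python
from typing import Iterable, Tuple


def required_fov(height: int, width: int) -> int:
    """FOV needed for the top-left pixel to see the bottom-right one.

    Assumption: the paper's 4095 for 1024x2048 is 2 * max(H, W) - 1.
    """
    return 2 * max(height, width) - 1


def receptive_field(layers: Iterable[Tuple[int, int, int]]) -> int:
    """Receptive field of a sequential chain of convolutions.

    Each layer is (kernel_size, stride, dilation). Skip connections,
    pooling shortcuts, and the decoder are ignored in this sketch.
    """
    rf = 1    # receptive field of a single input pixel
    jump = 1  # distance in input pixels between adjacent features
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf


if __name__ == "__main__":
    # Image sizes mentioned in this issue: Cityscapes, CamVid, my dataset.
    for h, w in [(1024, 2048), (720, 960), (512, 512)]:
        print(f"{h}x{w}: required FOV = {required_fov(h, w)}")

    # Toy chain: a stride-2 3x3 conv followed by a 3x3 conv with dilation 2.
    print(receptive_field([(3, 2, 1), (3, 1, 2)]))
```

Under that assumption, 512x512 would need an FOV of about 1023 and CamVid about 1919, which is why I wonder whether fewer of the (1,14) blocks would be enough at lower resolutions.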