Description
Unfortunately I am not a Python pro, so I don't dare to analyze the whole DepthCrafter codebase, but I feel it should be less memory-hungry with longer videos.
As far as I understand, the current workflow of DC is (roughly sketched in code below the list):
- load all frames to memory
- pass the frames array to "DC pipeline"
- run inference (with a moving window of 110 frames)
- save all results
It works fine unless the number of frames is too big.
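
To make it concrete, here is how I picture that flow; the function names (`read_video_frames`, `run_depthcrafter`) and the `imageio` loader are only my assumptions, not the actual DepthCrafter code:

```python
import numpy as np
import imageio.v3 as iio


def read_video_frames(path: str) -> np.ndarray:
    # Hypothetical loader: decodes EVERY frame into one big in-memory array.
    return np.stack(list(iio.imiter(path)))               # shape (N, H, W, 3), all in RAM


def run_depthcrafter(frames: np.ndarray) -> np.ndarray:
    # Placeholder for the real pipeline call; internally it slides a ~110-frame window.
    return np.zeros(frames.shape[:3], dtype=np.float32)   # one depth map per frame


frames = read_video_frames("input.mp4")   # 800 frames can already mean several GB of RAM
depths = run_depthcrafter(frames)         # inference (tqdm shows up in the real pipeline)
np.save("depths.npy", depths)             # all results are written only at the very end
```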
E.g. it takes 5 minutes to munch 100 frames, but several hours to munch 800 frames, while theoretically the dependency should be almost linear (apart from the overlapping parts), so it should take about 40 minutes (8 x 5 minutes).
The inference itself runs at about the same speed, but there is a variable delay at the beginning of the pipeline (before the tqdm bar pops up). I suppose it is caused by some memory swapping between RAM and VRAM, but I am not sure about it.
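
If it helps, I could instrument the run roughly like below to see where the time and RAM actually go; `psutil` and the two functions from the sketch above are assumptions on my part:

```python
import os
import time

import psutil  # assumption: psutil is installed, just for reading the process RSS

proc = psutil.Process(os.getpid())


def report(stage: str, t0: float) -> None:
    # Print wall-clock time for the stage and the current resident memory of the process.
    print(f"{stage}: {time.time() - t0:.1f}s, RSS {proc.memory_info().rss / 1e9:.2f} GB")


t0 = time.time()
frames = read_video_frames("input.mp4")   # hypothetical loader from the sketch above
report("load", t0)

t0 = time.time()
depths = run_depthcrafter(frames)         # the variable delay happens before tqdm appears here
report("inference", t0)
```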
I suppose it could be avoided by not loading all the frames at once, but maybe there is a catch that makes it impossible to do it like this...?
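
If there is no such catch, the naive alternative I have in mind would be to stream the frames from disk in overlapping chunks and keep only one chunk in memory at a time, roughly like below. The chunk/overlap sizes and the way I stitch results (just dropping the recomputed overlap of each new chunk instead of blending) are simplifications on my part:

```python
import numpy as np
import imageio.v3 as iio

CHUNK = 110     # matches the ~110-frame sliding window, as far as I understand it
OVERLAP = 25    # hypothetical overlap kept between consecutive chunks


def run_depthcrafter(frames: np.ndarray) -> np.ndarray:
    # Same placeholder for the real pipeline call as in the sketch above.
    return np.zeros(frames.shape[:3], dtype=np.float32)


def iter_chunks(path: str):
    # Stream frames from disk and yield overlapping chunks instead of
    # decoding the whole video into RAM up front.
    buf = []
    yielded = False
    for frame in iio.imiter(path):
        buf.append(frame)
        if len(buf) == CHUNK:
            yield np.stack(buf)
            yielded = True
            buf = buf[-OVERLAP:]          # keep the tail for temporal consistency
    if buf and (not yielded or len(buf) > OVERLAP):
        yield np.stack(buf)               # leftover frames that were not processed yet


results = []
for i, chunk in enumerate(iter_chunks("input.mp4")):
    depths = run_depthcrafter(chunk)
    results.append(depths if i == 0 else depths[OVERLAP:])  # drop the recomputed overlap
np.save("depths.npy", np.concatenate(results))
```

I am aware the real pipeline probably needs the overlap for temporal consistency between windows, so this is only meant as a starting point for the discussion.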