About the selection of input video pairs #15

I have an idea: for the same prompt, could I generate videos with both wan1.3B and wan14B, and then use the videos generated by 14B to DPO-optimize the 1.3B model? I'm not sure whether this is feasible. Or must DPO use videos generated by the model currently being trained?
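
To make the idea concrete, here is a rough sketch of how the pairs might be assembled. It is only an illustration, not a tested pipeline: `PreferencePair`, `build_pairs`, and the file paths are hypothetical names, and it assumes the 14B video is always treated as the preferred sample, which may not hold for every prompt.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PreferencePair:
    prompt: str
    chosen: str    # path to the 14B video (assumed to be the preferred sample)
    rejected: str  # path to the 1.3B video (assumed to be the dispreferred sample)


def build_pairs(videos_14b: Dict[str, str], videos_1_3b: Dict[str, str]) -> List[PreferencePair]:
    """Pair videos by prompt, keeping only prompts that both models have output for."""
    shared_prompts = sorted(videos_14b.keys() & videos_1_3b.keys())
    return [PreferencePair(p, videos_14b[p], videos_1_3b[p]) for p in shared_prompts]


# Hypothetical inputs: prompt -> path of the generated video.
pairs = build_pairs(
    {"a cat surfing": "out_14b/cat.mp4", "a dog running": "out_14b/dog.mp4"},
    {"a cat surfing": "out_1_3b/cat.mp4"},
)
```

In this sketch only the rejected videos come from the model being trained; the chosen videos come from a different model, which is exactly the part I am unsure about.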