Randomly chosen image from Figure 1

In Figure 1 of the paper, the input to the feature extractor G_f appears to be a "Randomly chosen image".
However, in the released code, the input appears to be just x for both the G_f that produces the content feature (the green module in Figure 1) and the G_f that produces the style feature (the grey module in Figure 1).
Could you clarify why there is a discrepancy between the paper and the code?