Reproduction results about FLUX #32

Great work! While reproducing the results, I found that FLUX did not reach the DrawBench200 performance reported in the paper (**N=6, O=1, ImageReward=0.9953, CLIP=19.637**). Since the complete evaluation code has not been released, I cannot tell whether the discrepancy comes from my own implementation.

I used the following command for generation:

    CUDA_VISIBLE_DEVICES=0 python src/sample.py \
    --prompt_file DrawBench200.txt \
    --width 1024 \
    --height 1024 \
    --model_name flux-dev \
    --add_sampling_metadata \
    --output_dir ./results \
    --num_steps 50

I evaluated the generated images with [ImageReward](https://github.com/zai-org/ImageReward/tree/main) (the `ImageReward-v1.0` checkpoint) and CLIPScore (`CLIPScore` from `torchmetrics.multimodal.clip_score`, with the `openai/clip-vit-large-patch14` model).
The results I obtained were **ImageReward = 0.9410, CLIP = 17.1656**, which fall short of the reported values by about 0.054 and 2.47 respectively.
I am not sure whether something is wrong in my implementation or whether I have overlooked some other factor.
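
For concreteness, an evaluation along these lines can be sketched as below. This is an illustrative sketch under stated assumptions, not the exact script behind the numbers above: the helper names are made up for this example, and the `img_{i}.jpg` naming of the generated files is hypothetical and should be adjusted to whatever `src/sample.py` actually writes. Note that torchmetrics' `CLIPScore` reports 100 times the image-text cosine similarity (clamped at zero), so a different CLIP checkpoint or scaling convention would shift the number.

```python
from pathlib import Path
from statistics import mean


def load_prompts(prompt_file):
    """Return the non-empty lines of the prompt file, stripped of whitespace."""
    text = Path(prompt_file).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def pair_with_images(prompts, image_dir, pattern="img_{i}.jpg"):
    """Pair the i-th prompt with the i-th image.

    The file naming pattern is a hypothetical placeholder.
    """
    return [(p, Path(image_dir) / pattern.format(i=i)) for i, p in enumerate(prompts)]


def mean_image_reward(pairs):
    """Average ImageReward-v1.0 score over (prompt, image_path) pairs."""
    import ImageReward as RM  # heavy dependency, imported only when needed

    model = RM.load("ImageReward-v1.0")
    return mean(float(model.score(prompt, str(path))) for prompt, path in pairs)


def mean_clip_score(pairs, model_name="openai/clip-vit-large-patch14"):
    """Average torchmetrics CLIPScore over (prompt, image_path) pairs."""
    import numpy as np
    import torch
    from PIL import Image
    from torchmetrics.multimodal.clip_score import CLIPScore

    metric = CLIPScore(model_name_or_path=model_name)
    for prompt, path in pairs:
        rgb = np.array(Image.open(path).convert("RGB"))  # H x W x 3, uint8
        image = torch.from_numpy(rgb).permute(2, 0, 1)   # 3 x H x W, uint8
        metric.update(image, prompt)
    # compute() averages the per-sample scores accumulated by update().
    return float(metric.compute())
```

With the command above, usage would be something like `pairs = pair_with_images(load_prompts("DrawBench200.txt"), "./results")`, followed by `mean_image_reward(pairs)` and `mean_clip_score(pairs)`.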

Looking forward to your response. Thank you again for your outstanding work!
