Reproduction results about FLUX #32

@leaves162

Great work! While reproducing the results, I found that FLUX does not reach the performance reported in the paper on DrawBench200 (N=6, O=1, ImageReward=0.9953, CLIP=19.637). Since the complete evaluation code is not provided, I'm not sure whether the issue lies in my own implementation.

I used the following command for generation:

CUDA_VISIBLE_DEVICES=0 python src/sample.py \
    --prompt_file DrawBench200.txt \
    --width 1024 \
    --height 1024 \
    --model_name flux-dev \
    --add_sampling_metadata \
    --output_dir ./results \
    --num_steps 50

I evaluated the generated images with ImageReward (ImageReward-v1.0) and CLIPScore (torchmetrics.multimodal.clip_score with openai/clip-vit-large-patch14).
The results I obtained were ImageReward = 0.9410 and CLIP = 17.1656, which are about 0.054 and 2.47 below the reported values, respectively.
I'm not sure whether something is wrong with my implementation or whether I have overlooked other factors, such as sampling or evaluation settings.
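
For reference, an evaluation along these lines could be sketched as follows. This is an illustrative sketch, not the exact script behind the numbers above. The helper names (`read_prompts`, `pair_with_images`, `image_reward_mean`, `clip_score_mean`) and the `img_{}.jpg` file naming are assumptions, so adjust the pattern to whatever src/sample.py actually writes. It assumes one prompt per line in DrawBench200.txt and one image per prompt, and it needs the image-reward, torchmetrics, transformers, and Pillow packages.

```python
from pathlib import Path
from statistics import mean


def read_prompts(prompt_file):
    """Return the non-empty prompts, one per line, in file order."""
    lines = Path(prompt_file).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def pair_with_images(prompts, image_dir, pattern="img_{}.jpg"):
    """Pair the i-th prompt with image_dir / pattern.format(i).

    ASSUMPTION: the naming pattern is hypothetical; change it to match
    the files that the sampling script really produces.
    """
    pairs = [(p, Path(image_dir) / pattern.format(i)) for i, p in enumerate(prompts)]
    missing = [str(path) for _, path in pairs if not path.exists()]
    if missing:
        raise FileNotFoundError(f"{len(missing)} images missing, e.g. {missing[0]}")
    return pairs


def image_reward_mean(pairs, version="ImageReward-v1.0"):
    """Average ImageReward over (prompt, image_path) pairs."""
    import ImageReward as RM  # heavy import, kept local to this function

    model = RM.load(version)
    return mean(float(model.score(prompt, str(path))) for prompt, path in pairs)


def clip_score_mean(pairs, model_name="openai/clip-vit-large-patch14"):
    """Average torchmetrics CLIPScore over (prompt, image_path) pairs."""
    import numpy as np
    import torch
    from PIL import Image
    from torchmetrics.multimodal.clip_score import CLIPScore

    metric = CLIPScore(model_name_or_path=model_name)
    with torch.no_grad():
        for prompt, path in pairs:
            rgb = np.array(Image.open(path).convert("RGB"))  # H x W x 3, uint8
            image = torch.from_numpy(rgb).permute(2, 0, 1)   # 3 x H x W
            metric.update(image, prompt)
    # compute() returns the mean score over every sample passed to update()
    return float(metric.compute())
```

A possible invocation, assuming the images were written to ./results:

    pairs = pair_with_images(read_prompts("DrawBench200.txt"), "./results")
    print(image_reward_mean(pairs), clip_score_mean(pairs))

Note that torchmetrics reports CLIPScore as 100 times the cosine similarity, so a different scaling convention or CLIP checkpoint would change the absolute values.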

Looking forward to your response. Thank you again for your outstanding work!
