Hello, I was able to evaluate both LLaVA 1.5 13B and MoF 13B on the MMVP benchmark, but I can't reproduce the MoF results: my numbers come out much lower, at around 21%. For LLaVA I could reproduce the results, and with a minor fix it reaches 39.3%. I am wondering what the issue might be.
I am running with the HuggingFace weights using the following command:
python scripts/evaluate_mllm.py --directory MMVP/ --model-path MMVP/MoF_Models