
LM Eval Tasks not found #143

@AahilA

Hi, it seems that every task from the lm_eval benchmarks triggers a "Task not found" warning. I have attached the warning log below; the command follows the format from the README.md. (I think we can ignore the fact that I have no access to the underlying Mistral model: I have tried this with other models I can run, and I get the same warning.) I have been able to run other benchmarks such as GPQADiamond or MATH500, but anything from lm_eval seems to trigger this warning.

evalchemy                          0.1.0           /home/aahila/evalchemy
lm_eval                            0.4.7
vllm                               0.10.2
(myenv) aahila$ python -m eval.eval \
    --model hf \
    --tasks mmlu \
    --model_args "pretrained=mistralai/Mistral-7B-Instruct-v0.3" \
    --batch_size 2 \
    --output_path logs
INFO 09-18 21:50:10 [__init__.py:216] Automatically detected platform cuda.
WARNING: HF_HUB_CACHE environment variable is not set, using default cache directory ~/.cache/huggingface/hub for database utils
2025-09-18:21:50:12,384 WARNING  [task.py:155] OPENAI_API_KEY not set. Tasks requiring OpenAI will be skipped.
2025-09-18:21:50:20,720 INFO     [eval.py:381] Selected Tasks: ['mmlu']
2025-09-18:21:50:20,720 WARNING  [task.py:309] Task not found: mmlu
2025-09-18:21:50:20,974 INFO     [huggingface.py:133] Using device 'cuda'
2025-09-18:21:50:21,106 ERROR    [eval.py:404] Failed to initialize model: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3.
401 Client Error. (Request ID: Root=1-68cc7e9d-20e392fc208f834f5dd24e67;8fa548b6-be1c-486e-af1f-129a2fc3b1c1)

Cannot access gated repo for url https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3/resolve/main/config.json.
Access to model mistralai/Mistral-7B-Instruct-v0.3 is restricted. You must have access to it and be authenticated to access it. Please log in.
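
To help narrow this down, a quick check of whether the installed lm_eval itself knows the task name, independent of evalchemy, might look like the sketch below. It assumes lm_eval 0.4.x exposes `TaskManager` with an `all_tasks` property; `lm_eval_has_task` is a hypothetical helper name made up for this check, not part of either package.

```python
import importlib.util


def lm_eval_has_task(name):
    """Return True/False if lm_eval can be imported, or None if it is not installed.

    Assumption: lm_eval 0.4.x provides lm_eval.tasks.TaskManager, whose
    all_tasks property lists the registered task and group names.
    """
    if importlib.util.find_spec("lm_eval") is None:
        return None  # lm_eval is not installed in this environment
    from lm_eval.tasks import TaskManager

    return name in TaskManager().all_tasks


if __name__ == "__main__":
    print("mmlu known to lm_eval:", lm_eval_has_task("mmlu"))
```

If this reports `True` while evalchemy still logs "Task not found: mmlu", the lookup on the evalchemy side would be the likelier place to look; if it reports `False`, the lm_eval install itself may be missing its task definitions.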
