
Question when testing #8

@Confetti-lxy


Great work! However, when I tried to reproduce the results using the script `bash eval_scripts/run_LLaDA_gsm8k_base.sh`, which runs the following command:

```shell
accelerate launch --config_file accelerate_config.yaml evaluation_script.py -m lm_eval \
    --model LLaDA --tasks gsm8k --batch_size 2 \
    --model_args "pretrained=models/LLaDA-8B-Base,prompt_interval_steps=-1,gen_interval_steps=-1,transfer_ratio=0.0,cache_order=0,is_feature_cache=False,is_cfg_cache=False" \
    --gen_kwargs "block_length=256,gen_length=256,steps=256,cfg_scale=0" \
    --num_fewshot 4 \
    --output_path ./logs/gsm8k_log \
    --log_samples
```

the following error occurs:
```
[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/user/new/luoxingyu/dLLM-cache/evaluation_script.py", line 10, in <module>
[rank0]:     cli_evaluate()
[rank0]:   File "/home/user/data/luoxingyu/miniconda3/envs/dllm_cache/lib/python3.12/site-packages/lm_eval/main.py", line 474, in cli_evaluate
[rank0]:     results = evaluator.simple_evaluate(
[rank0]:               ^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/user/data/luoxingyu/miniconda3/envs/dllm_cache/lib/python3.12/site-packages/lm_eval/utils.py", line 458, in _wrapper
[rank0]:     return fn(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/user/data/luoxingyu/miniconda3/envs/dllm_cache/lib/python3.12/site-packages/lm_eval/evaluator.py", line 357, in simple_evaluate
[rank0]:     results = evaluate(
[rank0]:               ^^^^^^^^^
[rank0]:   File "/home/user/data/luoxingyu/miniconda3/envs/dllm_cache/lib/python3.12/site-packages/lm_eval/utils.py", line 458, in _wrapper
[rank0]:     return fn(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/user/data/luoxingyu/miniconda3/envs/dllm_cache/lib/python3.12/site-packages/lm_eval/evaluator.py", line 588, in evaluate
[rank0]:     resps = getattr(lm, reqtype)(cloned_reqs)
[rank0]:             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/user/new/luoxingyu/dLLM-cache/eval_model/LLaDA.py", line 782, in generate_until
[rank0]:     out = generate(
[rank0]:           ^^^^^^^^^
[rank0]:   File "/home/user/new/luoxingyu/dLLM-cache/utils/generate_function.py", line 121, in generate
[rank0]:     res = model(x, attention_mask=attention_mask)
[rank0]:           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/user/data/luoxingyu/miniconda3/envs/dllm_cache/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/user/data/luoxingyu/miniconda3/envs/dllm_cache/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/user/.cache/huggingface/modules/transformers_modules/LLaDA_hyphen_8B_hyphen_Base/modeling_llada.py", line 1432, in forward
[rank0]:     outputs = self.model.forward(
[rank0]:               ^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/user/.cache/huggingface/modules/transformers_modules/LLaDA_hyphen_8B_hyphen_Base/modeling_llada.py", line 1329, in forward
[rank0]:     x, cache = block(x, attention_bias=attention_bias, layer_past=layer_past, use_cache=use_cache)
[rank0]:                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/user/data/luoxingyu/miniconda3/envs/dllm_cache/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/user/data/luoxingyu/miniconda3/envs/dllm_cache/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/user/.cache/huggingface/modules/transformers_modules/LLaDA_hyphen_8B_hyphen_Base/modeling_llada.py", line 911, in forward
[rank0]:     att, cache = self.attention(q, k, v, attention_bias, layer_past=layer_past, use_cache=use_cache)
[rank0]:                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/user/.cache/huggingface/modules/transformers_modules/LLaDA_hyphen_8B_hyphen_Base/modeling_llada.py", line 711, in attention
[rank0]:     att = self._scaled_dot_product_attention(
[rank0]:           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/user/.cache/huggingface/modules/transformers_modules/LLaDA_hyphen_8B_hyphen_Base/modeling_llada.py", line 653, in _scaled_dot_product_attention
[rank0]:     return F.scaled_dot_product_attention(
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: RuntimeError: The size of tensor a (1212) must match the size of tensor b (956) at non-singleton dimension 3
```

However, the tests for the other two modified variants complete normally. What might be the cause of this issue?
Thanks!
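For context, the error can be reproduced in isolation: `F.scaled_dot_product_attention` requires the last dimension of `attn_mask` to match (or broadcast against) the key length, so an attention bias built for one sequence length fails once the input grows. The sketch below is a hypothetical minimal repro (the sizes 1212 and 956 are taken from the traceback; the head count and head dimension are made up), not the actual dLLM-cache code path:

```python
import torch
import torch.nn.functional as F

seq_len = 1212   # current input length (e.g. 4-shot prompt + gen_length)
bias_len = 956   # length the cached attention bias was built for

# One head, small head dim: shapes are (batch, heads, seq, head_dim).
q = torch.randn(1, 1, seq_len, 16)
k = torch.randn(1, 1, seq_len, 16)
v = torch.randn(1, 1, seq_len, 16)

# Stale bias: its last dimension no longer matches the key length.
stale_bias = torch.zeros(1, 1, seq_len, bias_len)

try:
    F.scaled_dot_product_attention(q, k, v, attn_mask=stale_bias)
except RuntimeError as e:
    # Mirrors the error in the traceback: mismatch at dimension 3.
    print("RuntimeError:", e)
```

This suggests checking whether the attention bias (or a cached mask) is being reused across inputs of different lengths in the failing configuration, while the other two variants happen to keep the two lengths in sync.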
