Question when testing #8
Description
Great work! However, when I tried to reproduce the results with the script
bash eval_scripts/run_LLaDA_gsm8k_base.sh,
which runs the following command:
accelerate launch --config_file accelerate_config.yaml evaluation_script.py -m lm_eval \
  --model LLaDA --tasks gsm8k --batch_size 2 \
  --model_args "pretrained=models/LLaDA-8B-Base,prompt_interval_steps=-1,gen_interval_steps=-1,transfer_ratio=0.0,cache_order=0,is_feature_cache=False,is_cfg_cache=False" \
  --gen_kwargs "block_length=256,gen_length=256,steps=256,cfg_scale=0" \
  --num_fewshot 4 \
  --output_path ./logs/gsm8k_log \
  --log_samples
the following error occurs:
[rank0]: Traceback (most recent call last):
[rank0]: File "/home/user/new/luoxingyu/dLLM-cache/evaluation_script.py", line 10, in <module>
[rank0]: cli_evaluate()
[rank0]: File "/home/user/data/luoxingyu/miniconda3/envs/dllm_cache/lib/python3.12/site-packages/lm_eval/__main__.py", line 474, in cli_evaluate
[rank0]: results = evaluator.simple_evaluate(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/data/luoxingyu/miniconda3/envs/dllm_cache/lib/python3.12/site-packages/lm_eval/utils.py", line 458, in _wrapper
[rank0]: return fn(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/data/luoxingyu/miniconda3/envs/dllm_cache/lib/python3.12/site-packages/lm_eval/evaluator.py", line 357, in simple_evaluate
[rank0]: results = evaluate(
[rank0]: ^^^^^^^^^
[rank0]: File "/home/user/data/luoxingyu/miniconda3/envs/dllm_cache/lib/python3.12/site-packages/lm_eval/utils.py", line 458, in _wrapper
[rank0]: return fn(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/data/luoxingyu/miniconda3/envs/dllm_cache/lib/python3.12/site-packages/lm_eval/evaluator.py", line 588, in evaluate
[rank0]: resps = getattr(lm, reqtype)(cloned_reqs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/new/luoxingyu/dLLM-cache/eval_model/LLaDA.py", line 782, in generate_until
[rank0]: out = generate(
[rank0]: ^^^^^^^^^
[rank0]: File "/home/user/new/luoxingyu/dLLM-cache/utils/generate_function.py", line 121, in generate
[rank0]: res = model(x, attention_mask=attention_mask)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/data/luoxingyu/miniconda3/envs/dllm_cache/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/data/luoxingyu/miniconda3/envs/dllm_cache/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/.cache/huggingface/modules/transformers_modules/LLaDA_hyphen_8B_hyphen_Base/modeling_llada.py", line 1432, in forward
[rank0]: outputs = self.model.forward(
[rank0]: ^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/.cache/huggingface/modules/transformers_modules/LLaDA_hyphen_8B_hyphen_Base/modeling_llada.py", line 1329, in forward
[rank0]: x, cache = block(x, attention_bias=attention_bias, layer_past=layer_past, use_cache=use_cache)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/data/luoxingyu/miniconda3/envs/dllm_cache/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/data/luoxingyu/miniconda3/envs/dllm_cache/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/.cache/huggingface/modules/transformers_modules/LLaDA_hyphen_8B_hyphen_Base/modeling_llada.py", line 911, in forward
[rank0]: att, cache = self.attention(q, k, v, attention_bias, layer_past=layer_past, use_cache=use_cache)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/.cache/huggingface/modules/transformers_modules/LLaDA_hyphen_8B_hyphen_Base/modeling_llada.py", line 711, in attention
[rank0]: att = self._scaled_dot_product_attention(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/user/.cache/huggingface/modules/transformers_modules/LLaDA_hyphen_8B_hyphen_Base/modeling_llada.py", line 653, in _scaled_dot_product_attention
[rank0]: return F.scaled_dot_product_attention(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: RuntimeError: The size of tensor a (1212) must match the size of tensor b (956) at non-singleton dimension 3
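For reference, the failure appears to be a broadcasting mismatch between the attention scores and the attention bias passed to F.scaled_dot_product_attention. A minimal sketch reproducing the same class of error (the 1212 and 956 lengths are taken from the traceback above; batch size, head count, and head dim are arbitrary placeholders):

```python
import torch
import torch.nn.functional as F

# Query/key/value for a sequence of length 1212 (the size in the traceback).
q = torch.randn(1, 1, 1212, 8)
k = torch.randn(1, 1, 1212, 8)
v = torch.randn(1, 1, 1212, 8)

# Attention bias apparently built for a sequence of length 956 instead.
attn_mask = torch.zeros(1, 1, 1212, 956)

raised = False
try:
    # The attention scores have shape (1, 1, 1212, 1212); adding a
    # (1, 1, 1212, 956) bias cannot broadcast at dimension 3.
    F.scaled_dot_product_attention(q, k, v, attn_mask=attn_mask)
except RuntimeError as err:
    raised = True
    print(err)

print("raised:", raised)
```

This suggests the cached attention bias (or the padded prompt length) disagrees with the actual input length for this configuration.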
However, the evaluations for the other two modified versions complete normally. What might be causing this issue?
Thanks!