qwen3-32b model fails to start - RuntimeError: Execute fail #24

@byixi2023

Problem description

Starting the qwen3-32b model with xw start qwen3-32b --engine mindie:docker fails with an execution error during the warm-up phase, so the service cannot start.

Steps to reproduce

  1. Install xw-cli
  2. Run the command: xw start qwen3-32b --engine mindie:docker
  3. The service fails during initialization

Error log

2026-02-13T05:15:39.398500690Z     acl_model_out = model_operation.execute(acl_inputs, acl_param)
2026-02-13T05:15:39.398503764Z                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-13T05:15:39.398506742Z RuntimeError: Execute fail, enable log: export ASCEND_GLOBAL_LOG_LEVEL=3, export ASCEND_SLOG_PRINT_TO_STDOUT=1 to find the first error. For more details, see the MindIE official document.
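
The RuntimeError message asks for two environment variables to be set so that the first underlying error becomes visible. A minimal sketch of how those settings could be rendered, either as shell export statements or as docker -e arguments, follows. The variable names and values are quoted from the message above; the helper names docker_env_flags and export_lines are hypothetical, and whether xw-cli forwards environment variables into the MindIE container is an assumption this issue does not confirm.

```python
import shlex

# Names and values quoted from the RuntimeError message in the log above.
ASCEND_DEBUG_ENV = {
    "ASCEND_GLOBAL_LOG_LEVEL": "3",
    "ASCEND_SLOG_PRINT_TO_STDOUT": "1",
}


def docker_env_flags(env):
    """Turn a mapping into a flat list of `-e KEY=VALUE` docker arguments."""
    flags = []
    for key, value in env.items():
        flags += ["-e", f"{key}={value}"]
    return flags


def export_lines(env):
    """Render the same mapping as shell `export` statements."""
    return [f"export {key}={shlex.quote(value)}" for key, value in env.items()]


# Print both forms so they can be pasted where they apply.
print(" ".join(docker_env_flags(ASCEND_DEBUG_ENV)))
print("\n".join(export_lines(ASCEND_DEBUG_ENV)))
```

The export form applies when the variables are set inside the container before the service starts; the -e form applies only if the container is launched by hand with docker run.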

Full log excerpt

2026-02-13T05:14:36.092278337Z ConfigManager: Load Config from /usr/local/Ascend/mindie/2.3.0/mindie-service/conf/config.json.
2026-02-13T05:14:36.093671379Z The configName speculationGamma is not found, use default value.
2026-02-13T05:14:41.245377454Z DynamicConfigHandler exception: [json.exception.out_of_range.403] key 'EnableDynamicAdjustTimeoutConfig' not found
2026-02-13T05:15:39.398479168Z   File "/usr/local/Ascend/atb-models/atb_llm/models/base/flash_causal_lm.py", line 530, in forward
2026-02-13T05:15:39.398481946Z     logits = self.execute_ascend_operator(acl_inputs, acl_param, is_prefill)
2026-02-13T05:15:39.398494446Z              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-13T05:15:39.398497352Z   File "/usr/local/Ascend/atb-models/atb_llm/models/qwen2/flash_causal_qwen2.py", line 629, in execute_ascend_operator
2026-02-13T05:15:39.398500690Z     acl_model_out = model_operation.execute(acl_inputs, acl_param)
2026-02-13T05:15:39.398503764Z                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-13T05:15:39.398506742Z RuntimeError: Execute fail, enable log: export ASCEND_GLOBAL_LOG_LEVEL=3, export ASCEND_SLOG_PRINT_TO_STDOUT=1 to find the first error. For more details, see the MindIE official document.
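
To make an excerpt like the one above easier to scan, the timestamp prefix can be stripped and the traceback frames and exception lines pulled out. The following is a rough sketch under the assumption that every line starts with an RFC 3339 timestamp followed by one space, as in this excerpt; the function names strip_timestamp and summarize are hypothetical, and the sample lines are copied from the log above.

```python
import re

# Timestamp prefix as it appears in the excerpt, e.g. 2026-02-13T05:15:39.398479168Z
TS_PREFIX = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z ")
# A Python traceback frame line: File "<path>", line <n>, in <function>
FRAME = re.compile(r'^\s*File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)')
# A final exception line such as "RuntimeError: ..."
EXCEPTION = re.compile(r"^(?P<type>[A-Za-z_][\w.]*(?:Error|Exception)): (?P<msg>.*)")


def strip_timestamp(line):
    """Remove the leading timestamp, if present."""
    return TS_PREFIX.sub("", line, count=1)


def summarize(lines):
    """Return (frames, errors) found in timestamped log lines."""
    frames, errors = [], []
    for raw in lines:
        text = strip_timestamp(raw.rstrip("\n"))
        match = FRAME.match(text)
        if match:
            frames.append((match["path"], int(match["line"]), match["func"]))
            continue
        match = EXCEPTION.match(text)
        if match:
            errors.append((match["type"], match["msg"]))
    return frames, errors


SAMPLE = [
    "2026-02-13T05:14:41.245377454Z DynamicConfigHandler exception: [json.exception.out_of_range.403] key 'EnableDynamicAdjustTimeoutConfig' not found",
    '2026-02-13T05:15:39.398479168Z   File "/usr/local/Ascend/atb-models/atb_llm/models/base/flash_causal_lm.py", line 530, in forward',
    '2026-02-13T05:15:39.398497352Z   File "/usr/local/Ascend/atb-models/atb_llm/models/qwen2/flash_causal_qwen2.py", line 629, in execute_ascend_operator',
    "2026-02-13T05:15:39.398500690Z     acl_model_out = model_operation.execute(acl_inputs, acl_param)",
    "2026-02-13T05:15:39.398506742Z RuntimeError: Execute fail, enable log: export ASCEND_GLOBAL_LOG_LEVEL=3, export ASCEND_SLOG_PRINT_TO_STDOUT=1 to find the first error. For more details, see the MindIE official document.",
]

frames, errors = summarize(SAMPLE)
print("innermost frame:", frames[-1])
print("exception type:", errors[0][0])
```

On this excerpt the innermost frame is execute_ascend_operator in flash_causal_qwen2.py at line 629, which only shows where the operator call was issued; the root cause would still have to come from the more detailed Ascend log.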

Environment

  • OS: OpenEuler 24.03 LTS (arm64)
  • Docker version: 18.09.0
  • xw-cli version:
    Client Version:
      Version: 0.0.1
      Build Time: 2026-02-12T15:26:31Z
    Server Version:
      Version: 0.0.1
      Build Time: 2026-02-13T12:13:24+08:00
  • MindIE version: 2.3.0
  • NPU model: Ascend 300I Duo
  • Docker image: harbor.tsingmao.com/xw-cli/mindie:2.3.0-300I-Duo-py311-openeuler24.03-lts-arm64

Other related errors

  1. Configuration file error

    DynamicConfigHandler exception: [json.exception.out_of_range.403] key 'EnableDynamicAdjustTimeoutConfig' not found
    
  2. Missing configuration item

    The configName speculationGamma is not found, use default value.
    
  3. Environment variable deprecation warning

    The old environment variable MINDIE_LLM_LOG_LEVEL will be deprecated on 2025/12/31.
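
The first two messages both concern keys that were not found. One way to check whether a key exists anywhere in a loaded JSON configuration is a recursive search, sketched below. This assumes the keys would live in the config.json that ConfigManager reports loading (/usr/local/Ascend/mindie/2.3.0/mindie-service/conf/config.json); the log does not confirm that DynamicConfigHandler reads the same file. The function names and the SAMPLE_CONFIG structure are hypothetical and do not reflect the real MindIE schema.

```python
import json


def find_key_paths(node, key, path="$"):
    """Recursively collect the JSON paths at which `key` appears."""
    hits = []
    if isinstance(node, dict):
        for name, value in node.items():
            child = f"{path}.{name}"
            if name == key:
                hits.append(child)
            hits.extend(find_key_paths(value, key, child))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits.extend(find_key_paths(value, key, f"{path}[{index}]"))
    return hits


def report_missing(config, keys):
    """Return the keys that appear nowhere in the configuration."""
    return [key for key in keys if not find_key_paths(config, key)]


def load_config(path):
    """Load a JSON configuration file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Hypothetical structure for illustration only, not the real MindIE schema.
SAMPLE_CONFIG = {"section": {"speculationGamma": 0, "items": [{"name": "a"}]}}

print(report_missing(SAMPLE_CONFIG, ["EnableDynamicAdjustTimeoutConfig", "speculationGamma"]))
```

Against the real file, load_config would be called with the path from the log, inside the container. A key reported as missing here may still be optional, as the "use default value" wording of the speculationGamma message suggests.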
    

Expected behavior

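  • The model starts successfully and runs normally
  • The configuration file loads correctly, with no missing-item errors
  • Environment variable warnings are handled correctly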

Attempted solutions

  • Re-pulled the Docker image
  • [*] Checked the NPU driver version
  • [*] Checked model file integrity

temp.md

Additional information

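  • This model has failed to start on multiple attempts; in the same container, the qwen3-8b model runs successfully
  • NPU device status is normal (npu-smi info reports no errors)
  • System memory is sufficient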
