Open
Description
python -m sglang.launch_server --model-path $MODEL_PATH \
  --host 127.0.0.1 --port 7439 --trust-remote-code --nnodes 1 --node-rank 0 \
  --attention-backend ascend --device npu \
  --tp-size 1 --mem-fraction-static 0.8 --cuda-graph-max-bs 64 --dtype bfloat16
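When 12 graphs are captured, the gsm8k accuracy stress test hangs at sample 1255 of the 1319 samples in the dataset.
When only one graph is captured (bs=64), the gsm8k accuracy stress test passes.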