
How to benchmark a Qwen3 model served by Xinference using llm-benchmark #3

@kanghongj

Description


My Qwen3 model is launched with Xinference, and its API endpoint is:
http://XX.XX.XX.XX:9997/qwen3/

After deploying llm-benchmark as documented, I ran:
python run_benchmarks.py \
--llm_url "http://0.0.0.0:9997/qwen3/" \
--model "qwen3" \
--use_long_context

It fails with:
2025-05-12 14:50:16,736 - INFO - Starting request 0
2025-05-12 14:50:16,842 - INFO - HTTP Request: POST http://0.0.0.0:9997/qwen3/chat/completions "HTTP/1.1 404 Not Found"
2025-05-12 14:50:16,842 - ERROR - Error during request: Error code: 404 - {'detail': 'Not Found'}
2025-05-12 14:50:16,842 - WARNING - Request 0 failed

As you can see, the OpenAI-style suffix /chat/completions is appended to the API URL, but my local model is served by Xinference, which does not expose that path under the model name.

How can I fix this?
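For context, a minimal sketch of what the benchmark's OpenAI-compatible client does with `--llm_url`: it treats the value as a base URL and appends `/chat/completions` to it. Xinference exposes its OpenAI-compatible API under a `/v1` prefix (an assumption based on Xinference's standard deployment, not confirmed in this thread), so passing the model path as the base URL produces the 404 seen in the log, while a `/v1` base would resolve:

```python
def chat_endpoint(base_url: str) -> str:
    """Mimic how an OpenAI-style client builds the request URL:
    the chat-completions path is appended to the configured base URL."""
    return base_url.rstrip("/") + "/chat/completions"

# The URL passed in the issue - hits a path Xinference does not serve (404):
print(chat_endpoint("http://0.0.0.0:9997/qwen3/"))
# → http://0.0.0.0:9997/qwen3/chat/completions

# Assumed-correct base URL for an Xinference deployment (the model is
# selected via the --model / "model" field, not the URL path):
print(chat_endpoint("http://0.0.0.0:9997/v1"))
# → http://0.0.0.0:9997/v1/chat/completions
```

Under that assumption, the fix would be `--llm_url "http://0.0.0.0:9997/v1"` with `--model "qwen3"` unchanged.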
