Does the QNN model use mixed-precision quantization? #7

Hi, I used the following command to convert Qwen2___5-0___5B-Instruct to a QNN target:

python converter.py \
    --model-folder /work/.cache/modelscope/hub/qwen/Qwen2___5-0___5B-Instruct/ \
    --model-name qwen2_0.5b \
    --system-prompt-file /work/code/projects/PowerServe/assets/system_prompts/qwen2.txt \
    --prompt-file /work/code/projects/PowerServe/assets/prompts/long_prompt.txt \
    --batch-sizes 1 128 \
    --artifact-name qwen2_0.5b \
    --n-model-chunk 1 \
    --output-folder ./qwen2.5-0.5b-QNN \
    --build-folder ./qwen2.5-0.5b-QNN-tmp \
    --soc sa8775

However, the result I get from the converted model is garbled. Did you use fp16 and int4 weights when converting the ONNX model to the QNN binary (using QNN's default quantization method)?
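To make the question concrete, here is a minimal sketch of what "mixed precision" could mean here, assuming it refers to fp16 activations combined with symmetric per-row int4 weights. This is only an illustration of the concept in plain Python; it is not PowerServe's converter or QNN's actual quantizer, and the helper names (`to_fp16`, `quantize_int4`, `mixed_matvec`) are hypothetical.

```python
import math
import random
import struct


def to_fp16(v):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", v))[0]


def quantize_int4(row):
    """Symmetric int4 quantization of one weight row: returns (codes, scale)."""
    max_abs = max(abs(v) for v in row)
    # Map the largest magnitude to 7; int4 codes are clamped to [-8, 7].
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(v / scale))) for v in row]
    return codes, scale


def mixed_matvec(x, q_rows):
    """Compute y = W x with fp16 activations and int4 weights dequantized to fp16."""
    x16 = [to_fp16(v) for v in x]
    out = []
    for codes, scale in q_rows:
        w16 = [to_fp16(c * scale) for c in codes]
        # The sum is accumulated in Python floats, then rounded to fp16.
        out.append(to_fp16(sum(a * b for a, b in zip(x16, w16))))
    return out


random.seed(0)
weights = [[random.gauss(0.0, 1.0) for _ in range(16)] for _ in range(8)]
x = [random.gauss(0.0, 1.0) for _ in range(16)]

q_rows = [quantize_int4(row) for row in weights]
y_ref = [sum(a * b for a, b in zip(x, row)) for row in weights]
y_q = mixed_matvec(x, q_rows)

# Relative L2 error between the full-precision and mixed-precision results.
rel_err = math.sqrt(sum((a - b) ** 2 for a, b in zip(y_q, y_ref))) / math.sqrt(
    sum(b ** 2 for b in y_ref)
)
print(f"relative error: {rel_err:.3f}")
```

With a scheme like this, the error should stay small for a single layer, so fully garbled text would more likely point to a different precision setup or a conversion problem than to int4 rounding alone.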


Thanks
