Description
This is the ONNX export of my model, a very simple model.
I used the code from the tutorial to quantize it to INT8:
```python
# Assumed imports, following the Quark ONNX quantization tutorial
from quark.onnx import ModelQuantizer
from quark.onnx.quantization.config import Config, get_default_config

quant_config = get_default_config("XINT8")
config = Config(global_quant_config=quant_config)
print(f"The configuration for quantization is {config}")

# Create an ONNX quantizer
quantizer = ModelQuantizer(config)

# Quantize the ONNX model ("dr" is the calibration data reader)
quantizer.quantize_model(input_model_path, output_model_path, dr)
```
But why can't the quantized model run entirely on the NPU? Many ops still fall back to the CPU.
```
--------------------------------------------------- TEST SUMMARY
model             mobilenetv3_int8_BS1
threads           1
APU               STX
--------------------------------------------------- PERFORMANCE
throughput        15.25 [fps]
latency           65.48 [ms]
processed images  200
preloaded images  100
requested images  100
--------------------------------------------------- NODES DISTRIBUTION
total nodes       519
CPU               54
NPU               353
VITIS_EP_CPU      112
```
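To narrow down which operator types are being pushed off the NPU, one option is to count the op types present in the quantized graph. The sketch below keeps the counting logic stdlib-only; the model path and the `onnx`-based loading in the comment are assumptions for illustration:

```python
from collections import Counter

def op_histogram(op_types):
    """Count how many nodes of each operator type a graph contains."""
    return Counter(op_types)

# With the onnx package installed, the quantized model's node types can be
# fed in directly (path is hypothetical):
#   import onnx
#   model = onnx.load("mobilenetv3_int8_BS1.onnx")
#   print(op_histogram(node.op_type for node in model.graph.node))

# Small stand-in list just to show the shape of the result:
print(op_histogram(["Conv", "HardSwish", "Conv", "QuantizeLinear"]))
```

Comparing this histogram against the list of NPU-supported operators would show whether the CPU fallback comes from a few specific op types (e.g. activations that the XINT8 config leaves in higher precision).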
I also tried the example you provided in "advanced_quark_quantize.py": mobilenetv2 can't run entirely on the NPU either. Only resnet18 runs fully on the NPU.