Description
Hello, I've noticed something peculiar.
I ran ResNet50 and ResNet100 networks on an RV1126. The models were converted from ONNX to RKNN files quantized to FP16/INT8 using the RKNN Toolkit visualization tool.
I used the rknpu C API on RV1126 and loaded the file using rknn_init.
I measured the time taken by rknn_init, and it takes much longer than expected to load the ResNet models.
However, when tested on other Rockchip devices such as the RK3568, initialization completes within a few seconds.
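The numbers below time only the rknn_init() call itself. A minimal sketch of how such a measurement can be taken (the helper name time_rknn_init is just illustrative; it assumes the model buffer has already been read into memory, as in the loading code at the end of this issue):

#include <chrono>
#include <cstdio>
#include "rknn_api.h"

// Sketch: measure only the rknn_init() call. Assumes model_data already
// holds the contents of the .rknn file and model_size its length in bytes.
double time_rknn_init(rknn_context* ctx, void* model_data, uint32_t model_size)
{
    auto t0 = std::chrono::steady_clock::now();
    int ret = rknn_init(ctx, model_data, model_size, 0, NULL);  // same call as in the loading code below
    auto t1 = std::chrono::steady_clock::now();

    if (ret < 0) {
        printf("rknn_init failed: %d\n", ret);
        return -1.0;
    }
    return std::chrono::duration<double>(t1 - t0).count();  // seconds
}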
[rknn_init time on RV1126]
Loading ResNet100 (FP16): avg 37.9 sec
Loading ResNet100 (INT8): avg 131.6 sec
Loading ResNet50 (INT8): avg 81.0 sec

[rknn_init time on RK3568]
Loading ResNet100 (FP16): avg 6.1 sec
Loading ResNet100 (INT8): avg 3.4 sec
I understand that RK3568 has a higher-performance processor than RV1126.
However, the difference in initialization time is significant: roughly 6x for FP16 and nearly 40x for INT8. It is also peculiar that on RV1126 the INT8 model takes longer to load than the FP16 model.
What could be the cause of this phenomenon?
Is it due to the performance difference between the CPUs (Cortex-A7 vs. Cortex-A55), or could it be a problem during model conversion or in how I am using the rknpu API?
This is the part of my code that loads the model:
std::string model_rk = "r50.rknn";  // path to the converted RKNN model
std::ifstream in_file(model_rk, std::ios::binary);

// Determine the file size (begin is the position right after opening, i.e. 0).
std::streampos begin = in_file.tellg();
in_file.seekg(0, std::ios::end);
std::streampos end = in_file.tellg();
size_t size = static_cast<size_t>(end - begin);
in_file.seekg(0, std::ios::beg);

// engine_data is a std::string member that holds the raw model bytes.
engine_data.resize(size, 'c');
in_file.read((char*)engine_data.data(), size);
in_file.close();

const char* param_m = engine_data.c_str();
ret = rknn_init(&this->ctx, (void*)param_m, size, 0, NULL);
if (ret < 0)
{
    printf("%s - Fail to create p-model : %d\n", __FUNCTION__, ret);
    return false;
}

Please provide an explanation for this phenomenon.
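Additional information, in case it helps narrow things down: the runtime API and driver versions on the two boards can be compared with a small check like the sketch below (this is not part of my loading code; it assumes the SDK in use exposes rknn_query() with RKNN_QUERY_SDK_VERSION):

// Diagnostic sketch: print the API/driver version reported by the runtime,
// assuming RKNN_QUERY_SDK_VERSION is available in this SDK version.
rknn_sdk_version version = {};
int qret = rknn_query(this->ctx, RKNN_QUERY_SDK_VERSION, &version, sizeof(version));
if (qret == RKNN_SUCC)
{
    printf("api_version: %s, drv_version: %s\n", version.api_version, version.drv_version);
}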