My naive backend implementation is toooooooooooooooooo slow #3

@serihiro

Measurement results on a MacBook Pro (Early 2015, 3.1 GHz Intel Core i7):

time ./relesae_imagenet_vgg19.o -i ../data/imagenet/n01443537_693.JPEG -m ../data/vgg19/model.onnx 
inference result
1   :   0.0286738
973   :   0.0252736
397   :   0.0235154
392   :   0.0227232
115   :   0.0138855
393   :   0.0132342
390   :   0.0118524
983   :   0.0102448
0   :   0.00987771
396   :   0.00940968
./relesae_imagenet_vgg19.o -i ../data/imagenet/n01443537_693.JPEG -m   29.57s user 1.17s system 96% cpu 31.827 total

😢

Result from onnxruntime with the same model and the same image:

venv ❯ time python vgg19_test.py -i n01443537_693.JPEG
0: (1, 0.9999995)
1: (392, 3.7388313e-07)
2: (0, 1.02450905e-07)
3: (393, 2.3306047e-08)
4: (391, 1.8048546e-08)
5: (973, 1.2090326e-08)
python vgg19_test.py -i n01443537_693.JPEG  1.38s user 1.21s system 79% cpu 3.262 total
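
To put the two `time` summaries above side by side, here is a small sketch that parses them and computes the ratios. The regex assumes the zsh-style summary format shown in this issue, and the helper name `parse_time_line` is made up for illustration. Both figures are whole-process timings, so they include startup, image decoding, and model loading, not just inference.

```python
import re

# Assumed format: the zsh-style `time` summary seen in this issue, e.g.
# "<command>  29.57s user 1.17s system 96% cpu 31.827 total"
TIME_RE = re.compile(
    r"(?P<user>[\d.]+)s user\s+(?P<system>[\d.]+)s system\s+"
    r"(?P<cpu>\d+)% cpu\s+(?P<total>[\d.]+) total"
)


def parse_time_line(line):
    """Extract user/system/cpu/total from one `time` summary line (hypothetical helper)."""
    match = TIME_RE.search(line)
    if match is None:
        raise ValueError(f"not a recognised time summary: {line!r}")
    return {key: float(value) for key, value in match.groupdict().items()}


# The two summary lines, copied verbatim from the measurements in this issue.
naive = parse_time_line(
    "./relesae_imagenet_vgg19.o -i ../data/imagenet/n01443537_693.JPEG -m   "
    "29.57s user 1.17s system 96% cpu 31.827 total"
)
ort = parse_time_line(
    "python vgg19_test.py -i n01443537_693.JPEG  "
    "1.38s user 1.21s system 79% cpu 3.262 total"
)

wall_ratio = naive["total"] / ort["total"]
user_ratio = naive["user"] / ort["user"]
print(f"wall-clock: {wall_ratio:.1f}x slower")  # wall-clock: 9.8x slower
print(f"user CPU:   {user_ratio:.1f}x more")    # user CPU:   21.4x more
```

So the naive backend takes roughly 10x the wall-clock time and roughly 21x the user CPU time of the onnxruntime run.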

vgg19_test.py:

import argparse
import cv2
import numpy as np
import onnxruntime

parser = argparse.ArgumentParser()
parser.add_argument('--image', '-i', type=str, required=True)
args = parser.parse_args()

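# Load the image (cv2 returns BGR, HWC, uint8) and convert it to float32.
img = cv2.imread(args.image)
img = img.astype(np.float32)
# Subtract the per-channel mean (BGR order, matching cv2.imread).
mean = np.array([103.939, 116.779, 123.68])
img -= mean
img = cv2.resize(img, (224, 224))
# HWC -> NCHW with a batch dimension of 1.
img = np.expand_dims(img.transpose(2, 0, 1), 0).astype(np.float32)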

session = onnxruntime.InferenceSession('vgg19.onnx')
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name
output = session.run([output_name], {input_name: img})[0]

result = sorted(enumerate(output[0]), key=lambda x: x[1], reverse=True)

for i in range(0, 6):
    print(f'{i}: {result[i]}')
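
The shell `time` figures cover the whole process, including interpreter startup, imports, and model loading. To isolate just the inference call, one option is to wrap it in a small timer, sketched below. The helper `time_call` and the stand-in workload `dummy_inference` are hypothetical names used only for illustration; they are not part of the script or of onnxruntime.

```python
import time


def time_call(fn, repeat=5):
    """Call fn() `repeat` times; return (best, mean) wall-clock seconds."""
    durations = []
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return min(durations), sum(durations) / len(durations)


def dummy_inference():
    """Stand-in workload so this sketch runs without a model file."""
    return sum(i * i for i in range(100_000))


best, mean = time_call(dummy_inference, repeat=3)
print(f"best: {best * 1000:.2f} ms, mean: {mean * 1000:.2f} ms")
```

In the script above, the same helper could be applied to the real call, for example `time_call(lambda: session.run([output_name], {input_name: img}))`, which would leave model loading and preprocessing out of the measurement.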

Labels: enhancement (New feature or request)