Why the compressed model using TT is slower than the non-compressed model? #222
Description
I am trying to factorize the LeNet-300 model (3 FC layers: (784x300), (300x100), (100x10)). I have factorized only the first layer, with shape 784x300, using t3f.
After fine-tuning I get good results in terms of accuracy.
This also compresses the model from 266610 params to 49170 params (about 81% compression).
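The parameter counts above can be reproduced with a short sketch. The TT ranks (1, 3, 1) are assumed from max_tt_rank = 3, and the factorized layer is assumed to keep a dense bias of 300:

```python
def dense_params(n_in, n_out):
    return n_in * n_out + n_out  # weights + bias

def tt_matrix_params(in_dims, out_dims, ranks):
    # One core of shape r_{k-1} x m_k x n_k x r_k per TT dimension.
    return sum(ranks[k] * m * n * ranks[k + 1]
               for k, (m, n) in enumerate(zip(in_dims, out_dims)))

baseline = dense_params(784, 300) + dense_params(300, 100) + dense_params(100, 10)
# First layer factorized as 784x300 -> [[2, 392], [20, 15]], rank 3, bias kept dense.
factorized = (tt_matrix_params([2, 392], [20, 15], [1, 3, 1]) + 300
              + dense_params(300, 100) + dense_params(100, 10))
print(baseline, factorized)  # 266610 49170
```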
But the results are not good when I measure execution time.
Execution time for 10 prediction runs over the test set (10000 images) is as follows:
baseline model (without factorization) = 5.51 s
factorized model = 5.57 s
The factorization configuration is 784x300 ----> [[2, 392], [20, 15]] with max_tt_rank = 3.
FLOPs for the baseline model: 532810
FLOPs for the factorized model: 116486 (about a 78% reduction in FLOPs)
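For context, here is a minimal NumPy sketch of what the factorized layer's forward pass looks like with this configuration. The core shapes and the ranks (1, 3, 1) are assumptions inferred from the stated reshaping and max_tt_rank = 3 (boundary ranks of 1 are squeezed out):

```python
import numpy as np

rng = np.random.default_rng(0)
G1 = rng.standard_normal((2, 20, 3))    # core 1: (i1, j1, rank)
G2 = rng.standard_normal((3, 392, 15))  # core 2: (rank, i2, j2)
x = rng.standard_normal(784)

# TT forward pass: a chain of small contractions instead of one 784x300 matmul.
xm = x.reshape(2, 392)
A = np.einsum('ab,rbq->arq', xm, G2)   # contract over i2
y = np.einsum('apr,arq->pq', G1, A)    # contract over i1 and the rank
y = y.reshape(300)

# Sanity check against the explicitly materialized dense matrix.
W = np.einsum('apr,rbq->abpq', G1, G2).reshape(784, 300)
assert np.allclose(y, x @ W)
```

Note that even though the contraction chain needs fewer FLOPs, it issues several small ops per layer instead of one large matmul, and at these sizes per-op overhead can easily dominate the arithmetic savings.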
I should mention that I calculated the FLOPs for the factorized layer using this notebook of yours:
https://colab.research.google.com/drive/16S_SUbIjhnQBFj_r7sCpbwZNHADIzEwX?usp=sharing
To calculate FLOPs for a non-factorized layer I used this formula:
2 * (input_dim * output_dim) + output_dim
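Applying that formula to all three dense layers reproduces the baseline figure quoted above:

```python
# One multiply and one add per weight (2 * in * out), plus one add per
# output neuron for the bias.
def dense_flops(n_in, n_out):
    return 2 * n_in * n_out + n_out

baseline_flops = (dense_flops(784, 300)
                  + dense_flops(300, 100)
                  + dense_flops(100, 10))
print(baseline_flops)  # 532810
```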
Why does the factorized model reduce the FLOP count yet still run slower than the baseline?