
MobileBERT tflite int8 model seems not follow quantization spec #21

@rednoah91

Description

The model downloaded from https://github.com/fatihcakirs/mobile_models/blob/main/v0_7/tflite/mobilebert_int8_384_20200602.tflite

Some fully-connected weights have a non-zero zero point (e.g., the weight bert/encoder/layer_0/attention/self/MatMul19 has zero-point = 6), which violates the TFLite quantization spec: int8 weights are required to be symmetrically quantized with zero-point = 0.

I am afraid this might cause issues in implementations that rely on the spec and skip the zero-point term when computing FC weights.
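A check like the one described above can be sketched with the tensor details that `tf.lite.Interpreter.get_tensor_details()` exposes. The helper below is kept pure (it takes a plain list of dicts) so it runs without TensorFlow or the model file; note this is an illustrative sketch, and in the real API the `dtype` field is a NumPy type (`numpy.int8`) rather than the string used here:

```python
# Sketch: flag int8 weight tensors whose zero point is non-zero.
# Per the TFLite quantization spec, int8 weights must be symmetric
# (all zero points == 0). The dict layout mirrors the entries returned
# by tf.lite.Interpreter.get_tensor_details(); dtype is simplified to a
# string so this example is self-contained.

def find_asymmetric_weights(tensor_details):
    """Return (name, zero_points) for int8 tensors with a non-zero zero point."""
    violations = []
    for d in tensor_details:
        if d.get("dtype") != "int8":
            continue
        zps = d.get("quantization_parameters", {}).get("zero_points", [])
        if any(zp != 0 for zp in zps):
            violations.append((d["name"], list(zps)))
    return violations


# Fabricated tensor-detail list; the MatMul19 zero point (6) is the value
# from this report, the second entry is illustrative.
details = [
    {"name": "bert/encoder/layer_0/attention/self/MatMul19",
     "dtype": "int8",
     "quantization_parameters": {"zero_points": [6]}},
    {"name": "bert/encoder/layer_0/attention/output/dense/kernel",
     "dtype": "int8",
     "quantization_parameters": {"zero_points": [0, 0]}},
]
print(find_asymmetric_weights(details))
```

Running this against the details of the downloaded model (after mapping NumPy dtypes) would list every weight tensor that breaks the symmetric-weight requirement.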
