float_quantize works incorrectly on multi-GPU #67

@jinsol-neubla

Description

When I run a model on multiple GPUs after applying float_quantize to its weights or activations (a Hugging Face OPT model loaded with device_map='auto'), quantization of the layers allocated to the second or later GPUs works incorrectly: the quantized output is mostly zeros.
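A minimal sketch that isolates the symptom outside the model (my own repro attempt, not code from the issue; it assumes qtorch is installed and at least two GPUs are visible, and the exp/man/rounding arguments are arbitrary illustrative values):

```python
import torch
from qtorch.quant import float_quantize

# Quantize a random tensor on each device in turn.
for device in ["cuda:0", "cuda:1"]:
    x = torch.randn(4, 4, device=device)
    q = float_quantize(x, exp=5, man=2, rounding="nearest")
    # Expected: q is a low-precision copy of x on both devices.
    # Observed (per this issue): on cuda:1 and later GPUs, q is mostly zeros.
    print(device, q)
```

If the cause turns out to be the CUDA kernel being launched on the current default device rather than the input tensor's device, wrapping the call in `with torch.cuda.device(x.device):` might serve as a workaround, but that is only a guess on my part.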
