graphcore-research/gfloat

gfloat: Generic floating-point types in Python

An implementation of generic floating-point encode/decode logic, handling various current and proposed floating-point types, including the binary16, bfloat16, OCP, and P3109 formats listed in the table below.
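To make "generic decode logic" concrete, here is a minimal standalone sketch of decoding a code point for an IEEE-style layout (sign, exponent, mantissa). It is not gfloat's API: the function name `decode_ieee_like` and its parameters are hypothetical, and it assumes IEEE-style Inf/NaN encodings, which holds for formats such as binary16 and ocp_e5m2 but not for every format in the table.

```python
import math


def decode_ieee_like(code: int, exp_bits: int, mant_bits: int) -> float:
    """Decode an integer code point laid out as [sign | exponent | mantissa].

    Hypothetical helper for illustration; assumes an IEEE-style bias and
    IEEE-style special values (all-ones exponent means Inf or NaN).
    """
    bias = (1 << (exp_bits - 1)) - 1
    sign = -1.0 if (code >> (exp_bits + mant_bits)) & 1 else 1.0
    exp = (code >> mant_bits) & ((1 << exp_bits) - 1)
    mant = code & ((1 << mant_bits) - 1)

    if exp == (1 << exp_bits) - 1:
        # All-ones exponent: zero mantissa is Inf, anything else is NaN.
        return sign * math.inf if mant == 0 else math.nan
    if exp == 0:
        # Subnormal (or zero): no implicit leading 1.
        return sign * mant * 2.0 ** (1 - bias - mant_bits)
    # Normal: implicit leading 1 plus the fractional mantissa.
    return sign * (1 + mant / (1 << mant_bits)) * 2.0 ** (exp - bias)


# ocp_e5m2 has 5 exponent bits and 2 mantissa bits (precision 3).
print(decode_ieee_like(0x01, 5, 2) == 2.0 ** -16)  # smallest subnormal
print(decode_ieee_like(0x7B, 5, 2))                # largest normal, 57344.0
print(decode_ieee_like(0x7BFF, 5, 10))             # binary16 largest normal, 65504.0
```

In this sketch every NaN encoding (for example 0x7D, 0x7E and 0x7F in ocp_e5m2) decodes to the same float NaN, with no payload preserved.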

The library favours readability and extensibility over speed, although the *_ndarray functions are reasonably fast for large arrays (see the benchmarking notebook). For other implementations of these datatypes that focus more on speed, see, for example, ml_dtypes, bitstring, and the MX PyTorch Emulation Library.

See https://gfloat.readthedocs.io for documentation, or dive into the notebooks to explore the formats.

For example, here's a table from the 02-value-stats notebook:

| name | B: Bits in the format | P: Precision in bits | E: Exponent field width in bits | 0<x<1 | 1<x<Inf | minSubnormal | maxSubnormal | minNormal | maxNormal | Exact in float16? | Exact in float32? |
|---|---|---|---|---|---|---|---|---|---|---|---|
| p3109_k3p2sf | 3 | 2 | 1 | 1 | 1 | 0.5 | 0.5 | 1 | 1.5 | True | True |
| ocp_e2m1 | 4 | 2 | 2 | 1 | 5 | 0.5 | 0.5 | 1 | 6 | True | True |
| p3109_k4p2sf | 4 | 2 | 2 | 3 | 3 | 0.25 | 0.25 | 0.5 | 3 | True | True |
| ocp_e2m3 | 6 | 4 | 2 | 7 | 23 | 0.125 | 0.875 | 1 | 7.5 | True | True |
| ocp_e3m2 | 6 | 3 | 3 | 11 | 19 | 0.0625 | 0.1875 | 0.25 | 28 | True | True |
| p3109_k6p3sf | 6 | 3 | 3 | 15 | 15 | 0.03125 | 0.09375 | 0.125 | 14 | True | True |
| p3109_k6p4sf | 6 | 4 | 2 | 15 | 15 | 0.0625 | 0.4375 | 0.5 | 3.75 | True | True |
| ocp_e4m3 | 8 | 4 | 4 | 55 | 70 | 2^-9 | 7/4*2^-7 | 0.015625 | 448 | True | True |
| ocp_e5m2 | 8 | 3 | 5 | 59 | 63 | 2^-16 | 3/2*2^-15 | 2^-14 | 57344 | True | True |
| p3109_k8p1se | 8 | 1 | 7 | 63 | 62 | n/a | n/a | 2^-63 | 2^62 | False | True |
| p3109_k8p1ue | 8 | 1 | 8 | 127 | 125 | n/a | n/a | 2^-127 | 2^125 | False | True |
| p3109_k8p3se | 8 | 3 | 5 | 63 | 62 | 2^-17 | 3/2*2^-16 | 2^-15 | 49152 | True | True |
| p3109_k8p3sf | 8 | 3 | 5 | 63 | 63 | 2^-17 | 3/2*2^-16 | 2^-15 | 57344 | True | True |
| p3109_k8p3ue | 8 | 3 | 6 | 127 | 125 | 2^-33 | 3/2*2^-32 | 2^-31 | 5/4*2^31 | False | True |
| p3109_k8p3uf | 8 | 3 | 6 | 127 | 126 | 2^-33 | 3/2*2^-32 | 2^-31 | 3/2*2^31 | False | True |
| p3109_k8p4se | 8 | 4 | 4 | 63 | 62 | 2^-10 | 7/4*2^-8 | 0.0078125 | 224 | True | True |
| p3109_k8p4sf | 8 | 4 | 4 | 63 | 63 | 2^-10 | 7/4*2^-8 | 0.0078125 | 240 | True | True |
| p3109_k8p4ue | 8 | 4 | 5 | 127 | 125 | 2^-18 | 7/4*2^-16 | 2^-15 | 53248 | True | True |
| p3109_k8p4uf | 8 | 4 | 5 | 127 | 126 | 2^-18 | 7/4*2^-16 | 2^-15 | 57344 | True | True |
| p3109_k8p7sf | 8 | 7 | 1 | 63 | 63 | 0.015625 | 63/32*2^-1 | 1 | 127/64*2^0 | True | True |
| p3109_k8p8uf | 8 | 8 | 1 | 127 | 126 | 0.0078125 | 127/64*2^-1 | 1 | 127/64*2^0 | True | True |
| binary16 | 16 | 11 | 5 | 15359 | 16383 | 2^-24 | 1023/512*2^-15 | 2^-14 | 65504 | True | True |
| bfloat16 | 16 | 8 | 8 | 16255 | 16383 | 2^-133 | 127/64*2^-127 | 2^-126 | 255/128*2^127 | False | True |
| ocp_e8m0 | 8 | 1 | 8 | 127 | 127 | n/a | n/a | 2^-127 | 2^127 | False | True |
| ocp_int8 | 8 | 8 | 0 | 63 | 63 | 0.015625 | 127/64*2^0 | n/a | n/a | True | True |
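
For the IEEE-style rows (binary16, bfloat16, ocp_e5m2), the subnormal and normal range columns can be derived from P and E alone. The sketch below shows that arithmetic. The helper `ieee_like_stats` is hypothetical and not part of gfloat, and it assumes an IEEE-style bias of 2^(E-1) - 1 with the all-ones exponent reserved for Inf/NaN, so it does not reproduce rows such as ocp_e4m3 or the P3109 formats.

```python
def ieee_like_stats(p: int, e: int) -> dict:
    """Range statistics for an IEEE-style format with precision p and
    exponent width e (hypothetical helper, for illustration only)."""
    bias = 2 ** (e - 1) - 1
    emin = 1 - bias               # exponent of the smallest normal
    emax = (2 ** e - 2) - bias    # all-ones exponent is reserved for Inf/NaN
    return {
        "minSubnormal": 2.0 ** (emin - (p - 1)),
        "maxSubnormal": (1 - 2.0 ** (1 - p)) * 2.0 ** emin,
        "minNormal": 2.0 ** emin,
        "maxNormal": (2 - 2.0 ** (1 - p)) * 2.0 ** emax,
    }


# P and E taken from the table rows above.
for name, p, e in [("ocp_e5m2", 3, 5), ("binary16", 11, 5), ("bfloat16", 8, 8)]:
    print(name, ieee_like_stats(p, e))
```

For binary16 this gives a maxNormal of 65504 and a minSubnormal of 2^-24, matching the corresponding table row.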

Notes

All NaNs are the same, with no distinction between signalling or quiet, or between differently encoded NaNs.
