Description
- When read with the fromJSON method, big integer literals are not recorded as explicit R integers.
- When written with the toJSON method, big integers are treated as numeric and may lose precision.
Reproducing the issue
> RJSONIO::toJSON(12345)
[1] "[ 12345 ]"
> RJSONIO::toJSON(123456)
[1] "[ 123456 ]"
> RJSONIO::toJSON(1234567)
[1] "[ 1234567 ]"
> RJSONIO::toJSON(12345678)
[1] "[ 1.234568e+07 ]"
> RJSONIO::toJSON(12345678, digits = 23) # Workaround
[1] "[ 12345678 ]"
Possible cause
String conversion by formatC with default (line 164) value of digits = 5:
Line 173 in ec0dd20:
tmp = formatC(x, digits = digits) # , format = "f")
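The effect of digits on formatC can be seen with base R alone. The following is a minimal sketch, not the RJSONIO code path itself: the variable names are illustrative, and which digits value actually reaches formatC inside toJSON is not verified here.

```r
# Base R only; RJSONIO is not needed to see the formatting behaviour.
x <- 12345678

# For doubles, formatC defaults to the "g" format, where `digits` is the
# number of significant digits. Too few digits forces scientific notation.
short <- formatC(x, digits = 5)

# With enough significant digits, the whole number is written out in full.
full <- formatC(x, digits = 23)

print(short)
print(full)

# Round trip: only the full form parses back to the original value.
short_ok <- as.numeric(short) == x  # FALSE
full_ok  <- as.numeric(full) == x   # TRUE
```

Once the value has been written in scientific notation with truncated digits, the original integer cannot be recovered by the consumer of the JSON.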
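I am not sure why the code path for "numeric" is chosen for integer inputs. Maybe because R itself is very picky about what counts as an "integer": a bare numeric literal such as 5 is a double unless it is written as 5L or converted with as.integer.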
> is.integer(5)
[1] FALSE
> five <- as.integer(5)
> is.integer(five)
[1] TRUE
> five
[1] 5
> 5 == five # However!
[1] TRUE
> RJSONIO::toJSON(as.integer(12345678)) # Another workaround
[1] "[ 12345678 ]"
Could it be intentional?
Tests tiptoe around the issue by only requiring parsed big integers to pass is.numeric, although C bigints can be R integers too -- and users would expect them to be.
https://github.com/duncantl/RJSONIO/blob/ec0dd20fb0841aff06ce33545441d34b51ab49cc/tests/bigInt.R
As JSON originally comes from JavaScript land, where everything is a hand-wavy numeric, the distinction between integers and non-integers is not defined. However, I would argue that keeping verifiable integers as integers is a reasonable expectation. There is less harm in accidentally coercing 2.00 into 2 than in coercing big integers into floats (see Real world impact below).
JavaScript (both browser V8 and Node), Python's native json library and R jsonlite all keep big integers as integers.
Real world impact
This causes issues when RJSONIO is used to ingest and then pass forward JSON objects containing integer fields with large values.
R is a language of choice in the OHDSI community, and RJSONIO is used to parse and pipe API outputs in some cases:
OHDSI/ROhdsiWebApi#152 - fixed with a workaround.
Workarounds
- For toJSON, supply a high enough value for digits to have formatC keep things "implicitly integer";
- Explicitly convert all nested nodes with as.integer (but in this case, one would expect fields generated by fromJSON to already pass is.integer);
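The second workaround could be sketched roughly as follows. The helper name to_integer_where_safe and the sample list are hypothetical and not part of RJSONIO. The sketch assumes the parsed JSON is an ordinary nested R list, and it only converts values that are whole numbers within R's 32-bit integer range.

```r
# Hypothetical helper: walk a nested list and turn whole-number doubles
# into R integers, leaving everything else untouched.
to_integer_where_safe <- function(x) {
  convert <- function(v) {
    safe <- all(is.finite(v)) &&
      all(v == trunc(v)) &&
      all(abs(v) <= .Machine$integer.max)
    if (safe) as.integer(v) else v
  }
  # how = "replace" keeps the list structure and non-numeric leaves as-is.
  rapply(x, convert, classes = "numeric", how = "replace")
}

# Stand-in for a parsed JSON object (made-up data).
parsed <- list(id = 12345678, nested = list(ratio = 0.5, count = 3), name = "a")
fixed <- to_integer_where_safe(parsed)

str(fixed)
```

Values outside the 32-bit range are left as doubles, since as.integer would turn them into NA. The converted list can then be passed to toJSON, which, per the output shown earlier, writes explicit integers without scientific notation.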