[arrow-cast] consider simplifying parse_decimal and parse_e_notation #9170

@gruuya

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Currently there isn't a clear separation of concerns between parse_decimal and parse_e_notation, because part of the former function's logic leaks into the latter.

Namely, to paraphrase from here

in [parse_decimal] we skip parsing any fractional digits once we have reached scale digits, not knowing ahead of time whether the decimal contains an e-notation or not. So once we do hit an e-notation and drop down into [parse_e_notation], we need to parse the remaining unprocessed fractional digits too, since otherwise we might lose precision.
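
To make the precision concern concrete, here is a minimal sketch comparing the two orders of operation. Both helpers (`cut_then_shift`, `shift_then_cut`) are hypothetical and are not part of arrow-cast. They assume ASCII digits, a non-negative exponent, and truncation rather than rounding.

```rust
// Hypothetical helpers for illustration only; not arrow-cast code.
// Assumes ASCII digits, a non-negative exponent, and truncation (no rounding).

/// Fold a string of ASCII digits into an integer.
fn digits(s: &str) -> i128 {
    s.bytes().fold(0, |acc, b| acc * 10 + i128::from(b - b'0'))
}

/// Lossy order: cut the fraction to `scale` digits first, then apply the exponent.
fn cut_then_shift(int_part: &str, frac_part: &str, exp: usize, scale: usize) -> i128 {
    let cut = &frac_part[..scale.min(frac_part.len())];
    let missing = scale - cut.len();
    digits(&format!("{int_part}{cut}")) * 10i128.pow((missing + exp) as u32)
}

/// Exponent-aware order: the exponent moves the point right by `exp`, so up to
/// `scale + exp` fractional digits of the mantissa still matter.
fn shift_then_cut(int_part: &str, frac_part: &str, exp: usize, scale: usize) -> i128 {
    let wanted = scale + exp;
    let keep = &frac_part[..wanted.min(frac_part.len())];
    let missing = wanted - keep.len();
    digits(&format!("{int_part}{keep}")) * 10i128.pow(missing as u32)
}
```

For `1.23456e2` at scale 2, the first helper yields the unscaled value 12300 (123.00), while the second yields 12345 (123.45), which is why the digits skipped earlier have to be revisited once an exponent shows up.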

Besides making the code cognitively complex, this also leads to some preventable edge cases, such as the one in #8700 (comment).

Describe the solution you'd like
One option (as suggested in the previously linked comment) would be to do s.split_once(['e', 'E']) at the start of parse_decimal, calling parse_e_notation optionally when e-notation is detected, and then proceeding to parse fractionals from the input decimal, at this point knowing exactly how many digits we'll need to retain.
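
A rough sketch of how the split-first approach could look is given below. The function name `parse_decimal_sketch`, its signature, the `i128` return type, and the choice to truncate (rather than round) surplus digits are all assumptions made for illustration; the real parse_decimal in arrow-cast is generic over decimal types and also takes a precision.

```rust
/// Hypothetical sketch, not arrow-cast's implementation: parse `s` into an
/// unscaled i128 with `scale` fractional digits, truncating surplus digits.
fn parse_decimal_sketch(s: &str, scale: u32) -> Option<i128> {
    // Split off the exponent first, so the number of digits to retain is
    // known before any digit is parsed.
    let (mantissa, exp) = match s.split_once(['e', 'E']) {
        Some((m, e)) => (m, e.parse::<i32>().ok()?),
        None => (s, 0),
    };

    let (negative, mantissa) = match mantissa.strip_prefix('-') {
        Some(rest) => (true, rest),
        None => (false, mantissa.strip_prefix('+').unwrap_or(mantissa)),
    };

    let (int_part, frac_part) = mantissa.split_once('.').unwrap_or((mantissa, ""));
    if int_part.is_empty() && frac_part.is_empty() {
        return None;
    }

    // How many mantissa digits (integer + fractional) contribute to the result:
    // all integer digits, plus `scale + exp` fractional ones.
    let take = int_part.len() as i64 + i64::from(scale) + i64::from(exp);

    let mut value: i128 = 0;
    let mut seen: i64 = 0;
    for b in int_part.bytes().chain(frac_part.bytes()) {
        if !b.is_ascii_digit() {
            return None;
        }
        if seen < take {
            value = value.checked_mul(10)?.checked_add(i128::from(b - b'0'))?;
        }
        seen += 1;
    }

    // If the input had fewer digits than needed, pad with trailing zeros.
    let mut pad = take - seen;
    while pad > 0 && value != 0 {
        value = value.checked_mul(10)?;
        pad -= 1;
    }

    Some(if negative { -value } else { value })
}
```

With this shape, `parse_decimal_sketch("1.23456e2", 2)` keeps five mantissa digits and returns `Some(12345)`, and the plain-decimal path is just the `exp = 0` case of the same loop.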

Describe alternatives you've considered
Alternatively, there probably exists a one-pass algorithm that parses the decimal backward instead of forward.

Labels: enhancement
