Skip to content

Negated char class : sacrifice 1 or 2 tokens to improve range merge #90

@Siorki

Description

@Siorki

There is a possible optimisation in the Packer, when merging character ranges to build the negated char class.

If the last token only improves the compression by 1 byte (token gain = 1), and that removing one extra character would merge two ranges (gain 3), then not performing that compression step would free the token and actually improve the final result by 2 bytes.

The algorithm should therefore :

  • identify the leaf tokens (those with no further dependency on them) with the least gain
  • when merging range, measure the gain given by one extra token.

If the gain from the merge exceeds the one from the token, remove the token, merge further, and perform another check.

Metadata

Metadata

Assignees

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions