Improve parsing performance by avoiding String reallocations#62
Improve parsing performance by avoiding String reallocations#62kornelski merged 1 commit intokornelski:mainfrom
Conversation
We gain an about 10.7% performance improvement in the xml-rs integration in Chromium in blink_perf.parser xml parsing benchmark with this change. [1] The problem in particular with the PullParser.buf is that after .take_buf() the buffer is reset to an empty String, and then single character or token pushes follow, leading to a lot of reallocations. Instead, reset to an allocated buffer of size 20, which is choosen somewhat arbitraily as a compromise between covering many XML name length and avoiding too large temporary allocations. Similarly, keep AttributeSet larger instead of needing reallocations to grow it. Accounts for about 1-2% of the perf improvement. [1] https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/perf_tests/parser/xml-parser.html?q=xml-parser.html
|
CC @sesse |
|
If you're looking for perf improvements, the massive bottleneck is in the lexer reading chars one by one going all the way through the state machine. This needs to be refactored to allow it to have an inner tight loop collecting entire strings at once. The best approach to would be to refactor states from being enums to being functions that consume a slice. https://github.com/cloudflare/lol-html/blob/main/src/parser/state_machine/mod.rs lol-html obscures this with some clever macros, but at the core it's all about the current state of the parser being code not data. The code can ask for specific input, so it only needs to locally branch on that (loop until |
|
Thanks for the quick merge and the additional performance considerations/suggestions. Could you please tag a release? (That enables us to pull this change into Chromium.) |
Changes merged in upstream but no version released yet. Covers kornelski/xml-rs#62 Bug: 470367156 Change-Id: I9e778313b1f7fd88e96419879125dfd30b79dca0 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/7492973 Commit-Queue: Łukasz Anforowicz <lukasza@chromium.org> Reviewed-by: Łukasz Anforowicz <lukasza@chromium.org> Auto-Submit: Dominik Röttsches <drott@chromium.org> Cr-Commit-Position: refs/heads/main@{#1571637}
Changes merged in upstream but no version released yet. Covers kornelski/xml-rs#62 Bug: 470367156 Change-Id: I9e778313b1f7fd88e96419879125dfd30b79dca0 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/7492973 Commit-Queue: Łukasz Anforowicz <lukasza@chromium.org> Reviewed-by: Łukasz Anforowicz <lukasza@chromium.org> Auto-Submit: Dominik Röttsches <drott@chromium.org> Cr-Commit-Position: refs/heads/main@{#1571637} NOKEYCHECK=True GitOrigin-RevId: 89730e150cebf10780b580196850b04019935b93
We gain an about 10.7% performance improvement in the xml-rs integration in Chromium in blink_perf.parser xml parsing benchmark with this change. [1]
The problem in particular with the PullParser.buf is that after .take_buf() the buffer is reset to an empty String, and then single character or token pushes follow, leading to a lot of reallocations.
Instead, reset to an allocated buffer of size 20, which is choosen somewhat arbitraily as a compromise between covering many XML name length and avoiding too large temporary allocations.
Similarly, keep AttributeSet larger instead of needing reallocations to grow it. Accounts for about 1-2% of the perf improvement.
[1] https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/perf_tests/parser/xml-parser.html?q=xml-parser.html