Skip to content

Conversation

@bluetech
Copy link
Member

Note: only last 2 commits are relevant.

This is a proof of concept I quickly hacked together to switch the AST allocation to use a bump allocator. This means we just allocate AST nodes in big chunks and then free it all together once done.

This is just a proof of concept, will need more work if we want to merge it. But enough to benchmark:

bench/rulescomp
before:
total heap usage: 10,980,536 allocs, 10,980,536 frees, 490,563,213 bytes allocated
compiled 1000 keymaps in 2.300165s

after:
total heap usage: 4,647,536 allocs, 4,647,536 frees, 538,021,213 bytes allocated
compiled 1000 keymaps in 1.949309s

The reduction in allocs is very nice, but the speedup is less than I hoped.

@wismill
Copy link
Member

wismill commented Jan 25, 2025

Nice! This could also help #539: it would be easier to cache the AST by just cloning the corresponding arena.

Signed-off-by: Ran Benita <ran@unusedvar.com>
If someone needs this, they can let us know...

Signed-off-by: Ran Benita <ran@unusedvar.com>
The Windows dllexport annotation wants to be on the declarations, not
the definitions.

Signed-off-by: Ran Benita <ran@unusedvar.com>
Without this, the test-internal libraries (which don't use the .def file
because they contain additional private symbols) can't be made shared.
But it also makes sense for consistency with GCC.

Signed-off-by: Ran Benita <ran@unusedvar.com>
This makes the tests, and especially benches, more realistic, since
xkbcommon is almost always used as a shared library.

Also significantly reduces the build time with LTO enabled (for me, from
90s to 30s).

Signed-off-by: Ran Benita <ran@unusedvar.com>
The AST is heavily based on intrusive lists for representing lists, but
actions are an exception, instead using darray. I don't see any reason
for this; it ends up allocating more, and we don't especially need a
flat array for this.

Change it to use the familiar linked-list style.

Signed-off-by: Ran Benita <ran@unusedvar.com>
This is similar to the previous commit, for keysym lists.

Signed-off-by: Ran Benita <ran@unusedvar.com>
Signed-off-by: Ran Benita <ran@unusedvar.com>
Signed-off-by: Ran Benita <ran@unusedvar.com>
bench/rulescomp
before:
total heap usage: 10,980,536 allocs, 10,980,536 frees, 490,563,213 bytes allocated
compiled 1000 keymaps in 2.300165s

after:
total heap usage: 4,647,536 allocs, 4,647,536 frees, 538,021,213 bytes allocated
compiled 1000 keymaps in 1.949309s

Signed-off-by: Ran Benita <ran@unusedvar.com>
Signed-off-by: Ran Benita <ran@unusedvar.com>
total heap usage: 1,483,536 allocs, 1,483,536 frees, 512,020,213 bytes allocated

Signed-off-by: Ran Benita <ran@unusedvar.com>
@bluetech
Copy link
Member Author

Also converting the scanner strdups bring the allocs down to:

total heap usage: 1,483,536 allocs, 1,483,536 frees, 512,020,213 bytes allocated

but the runtime isn't reduced by much. At this point memory allocation is not a significant component of the runtime, it's mostly the parser+scanner, compilation logic, atom interning, keysym lookup, etc.

Signed-off-by: Ran Benita <ran@unusedvar.com>
Signed-off-by: Ran Benita <ran@unusedvar.com>
Signed-off-by: Ran Benita <ran@unusedvar.com>
Signed-off-by: Ran Benita <ran@unusedvar.com>
@wismill wismill added compile-keymap Indicates a need for improvements or additions to keymap compilation performance labels Jan 29, 2025
@bluetech bluetech closed this Jan 31, 2025
@bluetech bluetech deleted the bump branch January 31, 2025 11:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

compile-keymap Indicates a need for improvements or additions to keymap compilation performance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants