Implement aarch64_get_thread_helper() for Windows AArch64#67

Draft
raneashay wants to merge 1 commit into microsoft:main from
raneashay:ashay/aarch64_get_thread_helper-for-windows

Prior to this patch, on Windows/AArch64, `aarch64_get_thread_helper()`
resolved to `Thread::current()`. That is correct but suboptimal, because
the caller must save all registers from `r0` through `r17` around the
call. Linux/AArch64 has a hand-written assembly implementation that
clobbers just `x0` and `x1`, making it more efficient. This patch adds a
similar hand-written assembly implementation for Windows.

Unlike Linux, which keeps the thread pointer in the `tpidr_el0`
register, Windows keeps the pointer to the Thread Environment Block
(TEB) in `x18`. The assembly code therefore reads from the address
`*[*(x18 + 0x58) + _tls_index * 8] + _jvm_thr_current_tls_offset`,
where `0x58` is the offset of the thread local storage pointer within
the TEB (see
https://www.geoffchappell.com/studies/windows/km/ntoskrnl/inc/api/pebteb/teb/index.htm).
`_jvm_thr_current_tls_offset` is computed once in
`cache_global_variables()` using `Thread::_thr_current` and is then
reused in each call to the assembly code.
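
For readers who prefer C++ to assembly, the pointer chase described above can be sketched as follows. This is only an illustration, not the code in the patch: `resolve_current_thread()`, `load_word()` and `mock_lookup_matches()` are hypothetical names, the TEB is passed in as a plain byte buffer instead of being read from `x18`, and `sizeof(void*)` stands in for the `* 8` scaling so that the sketch also runs on a non-AArch64 host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Offset of the thread local storage pointer within the 64-bit TEB
// (the 0x58 from the description above).
constexpr std::size_t kTebTlsPointerOffset = 0x58;

// Hypothetical helper: load one pointer-sized word from a raw address.
static std::uintptr_t load_word(const unsigned char* addr) {
    std::uintptr_t value;
    std::memcpy(&value, addr, sizeof value);
    return value;
}

// Hypothetical C++ rendering of the lookup. The real helper would take the
// TEB address from x18; here it is an explicit argument so it can be mocked.
void* resolve_current_thread(const unsigned char* teb,
                             std::uint32_t tls_index,
                             std::size_t thr_current_tls_offset) {
    assert(teb != nullptr);
    // Step 1: *(teb + 0x58) is the per-thread array of TLS block pointers.
    auto* tls_array = reinterpret_cast<const unsigned char*>(
        load_word(teb + kTebTlsPointerOffset));
    // Step 2: entry tls_index of that array is this module's TLS block
    // (the stride is 8 bytes on AArch64).
    auto* tls_block = reinterpret_cast<const unsigned char*>(
        load_word(tls_array + tls_index * sizeof(void*)));
    // Step 3: the current-thread pointer lives at a fixed offset in the block.
    return reinterpret_cast<void*>(
        load_word(tls_block + thr_current_tls_offset));
}

// Builds a mock TEB, TLS array and TLS block, then checks that the lookup
// returns the pointer stored in the block. All values here are made up.
bool mock_lookup_matches() {
    int fake_thread = 0;               // stands in for a Thread object
    void* expected = &fake_thread;

    unsigned char tls_block[64] = {};
    const std::size_t offset = 16;     // made-up slot offset
    std::memcpy(tls_block + offset, &expected, sizeof expected);

    const unsigned char* tls_array[4] = {nullptr, nullptr, tls_block, nullptr};
    const std::uint32_t tls_index = 2; // made-up module index

    unsigned char teb[kTebTlsPointerOffset + sizeof(void*)] = {};
    const void* array_ptr = tls_array;
    std::memcpy(teb + kTebTlsPointerOffset, &array_ptr, sizeof array_ptr);

    return resolve_current_thread(teb, tls_index, offset) == expected;
}
```

Only the third step depends on the cached offset, which is why computing `_jvm_thr_current_tls_offset` once is enough: the first two loads use values that the loader fixes for the lifetime of the thread and module.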