xsk: introduce pre-allocated memory per xsk CQ #10482
base: bpf-next_base
|
Upstream branch: 6f0b824 |
AI reviewed your patch. Please fix the bug or email reply why it's not a bug. In-Reply-To-Subject: |
|
Forwarding comment 3658942663 via email |
162c0b3 to
305a67f
Compare
|
Upstream branch: e7a0adb |
1638705 to
1721745
Compare
305a67f to
7f3448b
Compare
|
Upstream branch: ec439c3 |
1721745 to
53834da
Compare
7f3448b to
0de3956
Compare
|
Upstream branch: ec439c3 |
53834da to
aa914b7
Compare
0de3956 to
2d78e4d
Compare
|
Upstream branch: 3d60306 |
aa914b7 to
93e9e2e
Compare
2d78e4d to
5007636
Compare
|
Upstream branch: d2749ae |
93e9e2e to
fc168d1
Compare
5007636 to
0ac68c9
Compare
This is a preparatory patch. The pre-allocated memory will be used to store the addr(s) of descriptors so that each skb reaching the end of its life can publish the corresponding addr(s) in its completion queue, where userspace can read them.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Before the commit 30f241f ("xsk: Fix immature cq descriptor production"), there was an issue[1] that caused descriptors to be published incorrectly under a race condition. That commit fixes the issue but adds more memory operations in the xmit hot path and in interrupt context, which can hurt performance.

Based on the existing infrastructure, this patch proposes a new solution to the problem: use pre-allocated memory, a local completion queue, to avoid performing memory operations frequently. The benefit comes from replacing xsk_tx_generic_cache with the local cq.

The core logic is as follows:
1. Allocate a new local completion queue when setting up the real queue.
2. Write the descriptors into the local cq in the xmit path, and record the prod as @start_pos, which reflects the start position of the skb in this queue, so that later the skb can easily write the desc addr(s) from the local cq to the cq addrs in the destruction phase.
3. Initialize the upper 24 bits of destructor_arg to store @start_pos in xsk_skb_init_misc().
4. Initialize the lower 8 bits of destructor_arg to store how many descriptors the skb owns in xsk_inc_num_desc().
5. Write the desc addr(s), starting from @start_pos in the local cq, one by one into the real cq in xsk_destruct_skb(). Then sync the global state of the cq as before.

The format of destructor_arg is designed as:

     ------------------------ --------
    |       start_pos        |  num   |
     ------------------------ --------

The upper 24 bits are enough to index the temporary descriptors, and the lower 8 bits are enough to hold the number of descriptors that one skb owns.

[1]: https://lore.kernel.org/all/20250530095957.43248-1-e.kubanski@partner.samsung.com/

Signed-off-by: Jason Xing <kernelxing@tencent.com>
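For readers who want to see the destructor_arg layout in action, here is a minimal userspace sketch of the packing and of the copy from the local cq into the real cq. It is not the patch's code: the helper names (`xsk_pack_arg`, `xsk_arg_start_pos`, `xsk_arg_num`, `publish_to_real_cq`), the plain `uint32_t` standing in for destructor_arg, and the power-of-two ring masks are all assumptions made for illustration.

```c
#include <stdint.h>

/* Assumed layout, taken from the commit message: upper 24 bits hold
 * start_pos, lower 8 bits hold the number of descriptors the skb owns. */
#define XSK_NUM_BITS        8u
#define XSK_NUM_MASK        ((1u << XSK_NUM_BITS) - 1u)
#define XSK_START_POS_MASK  ((1u << 24) - 1u)

/* Hypothetical helper: pack start_pos and num into one 32-bit value. */
static inline uint32_t xsk_pack_arg(uint32_t start_pos, uint32_t num)
{
    return ((start_pos & XSK_START_POS_MASK) << XSK_NUM_BITS) |
           (num & XSK_NUM_MASK);
}

/* Hypothetical helpers: unpack the two fields again. */
static inline uint32_t xsk_arg_start_pos(uint32_t arg)
{
    return arg >> XSK_NUM_BITS;
}

static inline uint32_t xsk_arg_num(uint32_t arg)
{
    return arg & XSK_NUM_MASK;
}

/* Model of the destruction phase: copy the skb's desc addrs, starting at
 * start_pos in the local cq, one by one into the real cq and advance the
 * real producer. Both rings are assumed to have power-of-two sizes, so
 * (index & mask) wraps around the ring. */
static void publish_to_real_cq(const uint64_t *local_cq, uint32_t local_mask,
                               uint64_t *real_cq, uint32_t real_mask,
                               uint32_t *real_prod, uint32_t arg)
{
    uint32_t pos = xsk_arg_start_pos(arg);
    uint32_t num = xsk_arg_num(arg);

    for (uint32_t i = 0; i < num; i++) {
        real_cq[*real_prod & real_mask] = local_cq[(pos + i) & local_mask];
        (*real_prod)++;
    }
}
```

With this layout a single skb can record at most 255 descriptors and a start position below 2^24, which matches the limits described in the commit message.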
|
Upstream branch: f785a31 |
fc168d1 to
5482254
Compare
Pull request for series with
subject: xsk: introduce pre-allocated memory per xsk CQ
version: 2
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1033607