Mekhanik evgenii/fix 1346 1 #2456

Open
EvgeniiMekhanik wants to merge 20 commits into master from MekhanikEvgenii/fix-1346-1

Conversation

@EvgeniiMekhanik
Contributor

No description provided.

@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-1346-1 branch from a62787c to c9bba36 Compare July 2, 2025 09:46
fw/ss_skb.c Outdated
__u8 pfmemalloc = skb->pfmemalloc;

WARN_ON_ONCE(skb->sk);
skb_orphan(skb);
Contributor Author

Please pay attention to this place. Here we release the skb owner and decrease client->mem. The function ss_skb_init_for_xmit is called before pushing the skb to the socket write queue, so skbs in the socket write queue are not taken into account in the client memory calculation. We release the skb owner here because, if we don't, we would need a rather big kernel patch to adjust skb memory before it is passed to the socket write queue. @krizhanovsky @const-t what do you think about it?

Contributor

Why don't we keep a pointer to the client accounting in skb->cb instead of playing with skb_orphan()? I'd prefer to avoid skb_orphan(), since we can get plenty of crashes in this patch or in later kernel version migrations due to breaking the kernel's logic around orphaned skbs.
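The suggestion can be sketched as a minimal userspace model (all structure and function names here are hypothetical stand-ins; the real code works on struct sk_buff and TfwClient): keep the accounting pointer and the charged size in the 48-byte cb scratch area instead of tying the skb to a socket owner.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace model: the kernel reserves a 48-byte scratch area skb->cb. */
struct sk_buff_model {
	char cb[48];
	unsigned int truesize;
};

/* Hypothetical per-client accounting record. */
struct client_model {
	long mem;
};

/* Hypothetical cb layout: a pointer to the client plus the charged bytes. */
struct tfw_skb_cb_model {
	struct client_model *client;
	unsigned int charged;
};

/* Charge the skb's memory to the client and remember it in skb->cb. */
static void
tfw_skb_charge(struct sk_buff_model *skb, struct client_model *cli)
{
	struct tfw_skb_cb_model *cb = (struct tfw_skb_cb_model *)skb->cb;

	cb->client = cli;
	cb->charged = skb->truesize;
	cli->mem += skb->truesize;
}

/* Uncharge using only the state saved in skb->cb; idempotent. */
static void
tfw_skb_uncharge(struct sk_buff_model *skb)
{
	struct tfw_skb_cb_model *cb = (struct tfw_skb_cb_model *)skb->cb;

	if (!cb->client)
		return;
	cb->client->mem -= cb->charged;
	cb->client = NULL;
}
```

The point of the sketch is that no skb->sk / skb->destructor state is involved, so the kernel's orphan logic is never touched.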

@krizhanovsky krizhanovsky mentioned this pull request Jul 7, 2025
2 tasks
@EvgeniiMekhanik EvgeniiMekhanik marked this pull request as draft July 8, 2025 09:12
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-1346-1 branch from c9bba36 to 26e3525 Compare July 8, 2025 09:12
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-1346-1 branch 16 times, most recently from 40654f8 to e2de424 Compare July 11, 2025 14:09
@EvgeniiMekhanik EvgeniiMekhanik marked this pull request as ready for review July 11, 2025 14:09
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-1346-1 branch 5 times, most recently from 7b5e367 to ac06de7 Compare July 14, 2025 21:12
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-1346-1 branch from 92911ea to 1975ddc Compare January 14, 2026 15:03
.allow_reconfig = true,
},
{
.name = "client_mem",
Contributor

@EvgeniiMekhanik @krizhanovsky I should note that currently client_mem overrides frang's http_body_len. By default Tempesta can receive a 1GB request/response, but with client_mem enabled it is limited to client_mem. It may be surprising during configuration. It seems we can temporarily disable http_body_len and set client_mem to 1GB until #498 is implemented.
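To illustrate the interaction, a hypothetical config sketch (directive spelling is inferred from this thread, not checked against the documentation):

```
# Hypothetical sketch: once client_mem is enabled, the effective body cap
# becomes the smaller client_mem value, not frang's http_body_len.
frang_limits {
    http_body_len 1073741824;    # ~1GB cap
}
client_mem 536870912 1073741824; # <soft_limit> <hard_limit>, per this PR
```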

*/
typedef struct {
TfwPoolChunk *curr;
void *owner;
Contributor

Why not forward-declare TfwClient struct?

@EvgeniiMekhanik
Contributor Author

2e94835:
finished in 40.05s, 1754757.73 req/s, 1.33GB/s - no client mem option, just accounting
finished in 40.06s, 1698862.07 req/s, 1.29GB/s - client mem set, check memory consumption
d7783ad:
finished in 40.06s, 1351866.00 req/s, 1.02GB/s - no client mem option, just accounting
finished in 40.06s, 1297204.55 req/s, 1004.79MB/s - client mem set, check memory consumption
7dd07ec:
finished in 40.05s, 1544324.25 req/s, 1.17GB/s - no client mem option, just accounting
finished in 40.06s, 1479000.85 req/s, 1.12GB/s - client mem set, check memory consumption
no 4037885:
finished in 40.05s, 1568655.85 req/s, 1.19GB/s  - no client mem option, just accounting
finished in 40.05s, 1568655.85 req/s, 1.16GB/s - client mem set, check memory consumption
master:
finished in 40.05s, 1802978.02 req/s, 1.36GB/s

@EvgeniiMekhanik
Contributor Author

Big responses
master:
finished in 40.06s, 64391.32 req/s, 6.12GB/s
2e94835:
finished in 40.10s, 63755.38 req/s, 6.11GB/s - no client mem option, just accounting

@EvgeniiMekhanik
Contributor Author

Also note that we call frang_client_mem_limit only in a few places, not on each memory allocation, to prevent performance degradation.

fw/client.c Outdated
tfw_client_free(TdbRec *rec)
{
/* Stats should be updated(dec/inc) only for complete records. */
if (!tdb_entry_is_complete(rec))
Contributor

I believe we don't need this condition for clients: we allocate per-cpu memory even for incomplete records. At the moment we are safe; however, if in the future we fail before marking a record complete, we will have a memory leak.

Contributor Author

Yes, fixed.

fw/client.c Outdated

assert_spin_locked(&client_db->ga_lock);

cli->mem = tfw_alloc_percpu(long);
Contributor

alloc_percpu() uses the GFP_KERNEL flag by default; in this place we must set GFP_ATOMIC | __GFP_ZERO and remove

for_each_online_cpu(cpu)
		*(per_cpu_ptr(cli->mem, cpu)) = 0;

Contributor Author

Yes, fixed.

parsed, skb->len);
}

r = frang_client_mem_limit((TfwCliConn *)c, true);
Contributor

I'm worried about this place. Without a benchmark I can't say how crucial it is, but it seems we introduce a lot of inter-cache traffic, which might be significant on NUMA. We call frang_client_mem_limit() even for small service frames. Maybe we can keep a local per-cpu threshold (for instance 4 * default MTU) to check during request processing, and read the remote CPUs only when the threshold has been reached, and then once per request or client processing stage, when all skbs are processed? However, the last option makes sense only if we close all connections from the client.
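The threshold idea can be sketched in userspace C (the names, the NR_CPUS_MODEL array standing in for a per-cpu variable, and the constants are hypothetical; a real kernel implementation would use this_cpu_* accessors):

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL	4
#define CHECK_THRESHOLD	(4 * 1500)	/* e.g. 4 * default MTU, as suggested */

/* Userspace model of a per-cpu counter: one slot per CPU. */
static long cli_mem[NR_CPUS_MODEL];
/* Bytes charged on this CPU since the last full (all-CPU) limit check. */
static long since_last_check[NR_CPUS_MODEL];

/* The expensive part: reading every CPU's slot causes cross-cache traffic. */
static long
sum_all_cpus(void)
{
	long sum = 0;
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		sum += cli_mem[cpu];
	return sum;
}

/*
 * Charge @sz bytes on @cpu, but do the cross-CPU sum only when the locally
 * accumulated delta crosses the threshold; returns false when the limit
 * is exceeded.
 */
static bool
charge_and_check(int cpu, long sz, long limit)
{
	cli_mem[cpu] += sz;
	since_last_check[cpu] += sz;
	if (since_last_check[cpu] < CHECK_THRESHOLD)
		return true;	/* skip the expensive remote reads */
	since_last_check[cpu] = 0;
	return sum_all_cpus() <= limit;
}
```

Between threshold crossings only the local slot is touched, so small service frames never trigger remote reads.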

Contributor Author

Yes, this is the only place where we lose about 10 percent of performance (when the appropriate option is enabled and we check client memory consumption). @krizhanovsky is it crucial? Should we check memory consumption accurately and block the client, or not?

r = -ENOMEM;
goto out;
}
ss_skb_set_owner(skb, ss_skb_dflt_destructor,
Contributor

We have a few places like this: ss_skb_set_owner(), where we assign a new value to the per-cpu variable, and then ss_skb_adjust_data_len(), where we also assign to the per-cpu variable, and we do this in a loop. By itself, assigning a new value to a per-cpu variable in a loop is very cheap, it is a regular assignment. However, we have remote access to this per-cpu variable, which leads to inter-cache traffic even for local writes. Maybe we can accumulate the value in a local variable and assign it to the per-cpu variable only after some threshold? Not only in this place, but everywhere.
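A userspace sketch of this batching idea (names and the threshold value are hypothetical; the real code would keep the accumulator per message or per processing context):

```c
#include <assert.h>

#define FLUSH_THRESHOLD 4096

/* Userspace model: the per-cpu slot that remote CPUs may also read. */
static long percpu_mem_slot;
/* Purely local accumulator, cheap to update in a tight loop. */
static long local_batch;

/* Accumulate locally; touch the shared slot only past the threshold. */
static void
account_len(long sz)
{
	local_batch += sz;
	if (local_batch < FLUSH_THRESHOLD)
		return;
	percpu_mem_slot += local_batch;
	local_batch = 0;
}

/* Flush the remainder, e.g. at the end of message processing. */
static void
account_flush(void)
{
	percpu_mem_slot += local_batch;
	local_batch = 0;
}
```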

Contributor Author

As I see on the SUT, we have no problem with the assignments. The only real place which decreases performance (when the option is enabled) is frang_client_mem_limit, where we sum the client memory over all CPUs.

skb_fill_page_desc(it->skb, it->frag, page, off, sz);
if (!h2)
skb_frag_ref(it->skb, it->frag);
ss_skb_adjust_data_len(it->skb, sz);
Contributor

Here we add client mem for plain http1 as well as for other protocols; however, for plain http1 the memory will not be allocated and ss_skb_to_sgvec_with_new_pages() will not be called. I'm not asking you to change this behavior, it is up to you, but I ask you to write a comment describing this.

Contributor Author

Yes, fixed.

conn->peer, skb->truesize);
}

r = frang_client_mem_limit((TfwCliConn *)conn, false);
Contributor

FYI: we check the memory limit before parsing, cache response building, etc. Those may consume a lot of memory, which can lead to an interesting side effect: the current response will live through the transfer to the client, consuming a lot of memory, and only new requests will not be received from the same client, even if they are not so big. However, checking the mem limit at the end of request processing is even worse.

fw/sock.c Outdated
memset(twin_skb->cb, 0, sizeof(twin_skb->cb));
ss_skb_set_owner(twin_skb, ss_skb_dflt_destructor,
TFW_SKB_CB(skb)->opaque_data,
skb_headlen(skb));
Contributor

Suggested change:
-	skb_headlen(skb));
+	skb_headlen(twin_skb));

Contributor

Also, we don't account struct sk_buff and struct skb_shared_info in this place, but we do in other places. Why?

Contributor Author

Yes, fixed.

client_mem <soft_limit> <hard_limit> - controls how much
memory is used to store unanswered client requests and
requests with linked responses which cannot be forwarded
to a client.
To track socket memory we should pass TfwHttpMsg *, not
TfwMsgIter *, to most of the http_msg_* functions, because
TfwHttpMsg has a pointer to the connection and socket.
In task #498 we decided to use the `client_mem` option to
limit the amount of memory used by a client. This commit is
a part of this task - now Tempesta FW uses `sk->sk_rmem_alloc`
to account for the memory used by Tempesta FW for this
client connection.
In task #498 we decided to use the `client_mem` option to
limit the amount of memory used by a client. This commit is
a part of this task and the next step of the implementation.
Previously Tempesta FW used `sk->sk_rmem_alloc` to account
for the memory used for a client connection; now we account
memory for the whole TfwClient, because there can be a lot
of connections for one client, and for all cases we use the
TfwClient limit and block the client if necessary.
If the administrator specifies `client_mem` and the memory
used by all connections of the current client exceeds this
value, Tempesta FW drops the connection and blocks the
client by IP if `ip_block on;` is specified.
Previously we took a reference to the connection when we
adjusted memory for an skb, but this leads to several problems:
- we can't adjust memory for an skb before tls decryption,
  because skbs from `tls->io_in.skb_list` are freed when the
  connection is released (but the connection will never be
  released if we increment its reference counter for these
  skbs).
- We have the same problem for skbs which wait for an
  appropriate tcp window to be pushed into the socket write
  queue.
Now we increment/decrement the reference counter of the
TfwClient and adjust skb memory for requests before tls
decryption.
Previously we adjusted the tcp send window only for http2
connections and only while making HEADERS or DATA frames,
but if we want to control client memory usage we should do
it for all types of sent data. (We orphan the skb and
decrease memory usage when we pass the skb to the socket
write queue, so if we don't adjust the tcp send window we
push a lot of skbs into the socket write queue without
accounting their memory.)
- remove the `client_get_light/client_put_light` functions,
  because after removing the lock from the `client` structure
  we don't need these functions at all.
- Track the memory usage of an skb in `skb->cb`. Usually it
  is equal to `skb->truesize`, but in some cases (for example,
  an skb created by `pskb_copy_for_clone`) it is different.
- use `skb_shift` instead of `skb_try_coalesce` to correctly
  adjust the send window while entailing an skb into the
  socket write queue.
- account for FRAME_HEADER_SIZE in the send window calculation
  while making frames. (There was a mistake in the accuracy of
  the send window calculation: we didn't take into account
  that each frame also contains a frame header.)
- change BUG_ON to WARN_ON.
- rename tfw_cli_*_limit to tfw_cli_*_mem_limit
- rename `ss_skb_can_collapce` to `ss_skb_can_collapse`
- rename `tfw_h2_or_stream_wnd_is_exceeded` to
  `tfw_h2_conn_or_stream_wnd_is_exceeded`.
- move braces `{` to the next line.
- rename `ss_skb_adjust_sk_mem` to `ss_skb_adjust_client_mem`.
- Do not duplicate code for http1 and http2 in
  `tfw_connection_push`.
- Change BUG_ON to WARN_ON in some places.
Do not use `skb->sk` and `skb->destructor` to track the
memory used by an skb; use `skb->cb` for this purpose.
- Implement our own version of `skb_orphan` named
  `ss_skb_orphan`, which is called when an skb is freed in
  Tempesta FW code or just before pushing the skb to the
  socket write queue.
- Implement wrappers over `__kfree_skb` and `kfree_skb`
  which call `ss_skb_orphan` before freeing the skb.
- Check that the skb is pushed to the socket write queue,
  using the newly implemented function
  `skb_tfw_is_in_socket_write_queue` from the linux kernel,
  to skip adjusting the memory used by an skb when it
  belongs to the kernel (when `ss_skb_*` functions are
  called from `tls_encrypt`).
- We usually use the callbacks set in `skb->cb` for
  different purposes, so remove the callbacks which were
  added in previous patches and use the callbacks saved
  in `skb->cb`.
- Since we use a pool for http memory allocation, change the
  API of all `tfw_pool_*` functions to pass `TfwClient` and
  account memory in this structure.
- Remove the `TfwClient` refcounter (it is not used; this
  could have been done in previous commits).
- Fix unit tests to check memory accounting; clean up memory
  after each test to check that the client memory is equal
  to zero after the test.
A big performance degradation was found after this patch.
During investigation it was found that the problem is the
use of an atomic counter for client mem accounting. Using a
per-cpu array instead of the atomic counter fixes the
performance issue.
During investigation of the performance degradation it was
found that we lose about 5-10% of performance when we use
`skb_shift` and adjust the send window accurately while
entailing an skb into the socket write queue. Revert this
change. Also call `ss_skb_orphan` if we merge an entailed
skb into the tail skb of the socket write queue.
Previously we removed a client entry from TDB if there was
no entry in `client_lru.free_list` and a new client was
allocated, even if the removed client still had active
connections. There is a BUG in this strategy: if the removed
client has hung connections, we can't close and destroy them
during Tempesta FW unloading, because we close and destroy
connections while iterating over active clients
(`tfw_client_for_each`).
In the new strategy we change the logic in `tdb_htrie_put_rec`.
We add a pointer to the bucket in the record structure. When
we remove a record, we zero this pointer. If the record
reference counter becomes equal to zero but the bucket pointer
is still not NULL (the record was not removed), we remove the
record from the bucket using this pointer. For clients we just
use tfw_client_put, without record removal; when the client
reference counter becomes equal to zero, the client record
will be removed from the bucket and freed.
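The new put logic can be modeled in userspace C (a single-slot bucket and all names here are hypothetical simplifications of the htrie structures):

```c
#include <assert.h>
#include <stddef.h>

struct bucket_model;

/* Userspace model of an htrie record with a bucket back-pointer. */
struct rec_model {
	struct bucket_model *bucket;	/* NULL once explicitly removed */
	int refcnt;
	int freed;
};

/* Single-slot bucket, enough to model linking/unlinking. */
struct bucket_model {
	struct rec_model *rec;
};

static void
bucket_unlink(struct rec_model *r)
{
	r->bucket->rec = NULL;
	r->bucket = NULL;
}

/* Explicit removal: drop from the bucket; the record may live on. */
static void
rec_remove(struct rec_model *r)
{
	if (r->bucket)
		bucket_unlink(r);
}

/* Model of tdb_htrie_put_rec: the last put unlinks a still-linked record. */
static void
rec_put(struct rec_model *r)
{
	if (--r->refcnt)
		return;
	if (r->bucket)		/* not removed yet: unlink via the pointer */
		bucket_unlink(r);
	r->freed = 1;
}
```

This mirrors the described invariant: a record is freed only on the last reference drop, and is guaranteed to be out of its bucket by then, whichever of remove/put ran first.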
fw/cache.c Outdated
if (!h2)
skb_frag_ref(it->skb, it->frag);
ss_skb_adjust_data_len(it->skb, sz);
else
Contributor

Why `else`, if we add the fragment for both h2 and h1?

We can't call tfw_client_get/put on each allocated or
orphaned skb (or on each pool creation/destruction): under
pressure, when we have a lot of cpus, that leads to atomic
contention and bad performance degradation. To fix this
problem we implement a special TfwClientMem structure with
its own reference accounting (using struct percpu_ref!) and
save a pointer to it in the client structure. We use
percpu_ref_tryget/percpu_ref_put during skb
allocation/deallocation (it's very cheap). When we destroy
a client we schedule a work, call
`percpu_ref_kill_and_confirm` and wait until all skbs are
orphaned.
Also make some fixes according to the review:
- Call `tfw_client_free` for incomplete records as well.
- Implement `tfw_alloc_percpu_gfp`, same as `alloc_percpu_gfp`
  but with error injection.
- Fix memory accounting during copying of skbs.
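A userspace model of the lifetime scheme this commit describes (the real code uses the kernel's struct percpu_ref; the names and the plain atomic counter here are stand-ins for the per-cpu fast path):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for struct percpu_ref: a count plus a "killed" flag. */
struct ref_model {
	atomic_long count;
	atomic_bool killed;
	int released;		/* times the release callback ran */
};

static void
ref_init(struct ref_model *r)
{
	atomic_store(&r->count, 1);	/* the initial reference */
	atomic_store(&r->killed, false);
	r->released = 0;
}

/* Like percpu_ref_tryget: fails once the ref has been killed. */
static bool
ref_tryget(struct ref_model *r)
{
	if (atomic_load(&r->killed))
		return false;
	atomic_fetch_add(&r->count, 1);
	return true;
}

static void
ref_put(struct ref_model *r)
{
	if (atomic_fetch_sub(&r->count, 1) == 1)
		r->released++;	/* last ref gone: release client memory */
}

/* Like percpu_ref_kill: forbid new gets, drop the initial reference. */
static void
ref_kill(struct ref_model *r)
{
	atomic_store(&r->killed, true);
	ref_put(r);
}
```

The key property modeled here: after kill, new skbs can no longer take a reference, so the destroyer can reliably wait for the outstanding ones to be orphaned.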
{
TfwClientMem *cli_mem;

cli_mem = tfw_kmalloc(sizeof(TfwClientMem), GFP_ATOMIC);
Contributor

It would be good to re-use this memory (a good fit for a memory cache), but as I see we can't use one, because during shutdown we may destroy the cache before freeing all objects.

Also, this place is untested. We didn't run benchmarks with many clients, so we can't say what overhead it introduces.

{
TfwClientMem *cli_mem = container_of(ref, TfwClientMem, refcnt);

call_rcu(&cli_mem->rcu_head, __cli_mem_release);
Contributor

Warning on shutdown: start Tempesta, do a single request, then stop Tempesta.

[  183.855417] [tempesta fw] Tempesta FW is ready
[  200.349624] [tdb] Close table 'client0.tdb'
[  200.353864] [tdb] Close table 'sessions0.tdb'
[  200.391901] ------------[ cut here ]------------
[  200.392530] WARNING: CPU: 0 PID: 0 at kernel/rcu/tree.c:2631 rcu_core+0x3d9/0x7c0
[  200.393491] Modules linked in: tempesta_fw(OE) tempesta_db(OE) tempesta_tls(OE) tempesta_lib(OE) veth intel_rapl_msr intel_rapl_common xt_conntrack xt_MASQUERADE nft_masq xfrm_user xfrm_algo nft_chain_nat nf_nat snd_hda_codec_generic nf_conntrack snd_hda_intel xt_addrtype nf_defrag_ipv6 nft_compat nf_defrag_ipv4 snd_intel_dspcfg bridge snd_intel_sdw_acpi snd_hda_codec stp llc snd_hda_core nf_tables overlay qxl snd_hwdep kvm_amd snd_pcm i2c_i801 drm_ttm_helper snd_timer ccp cfg80211 ttm i2c_mux snd binfmt_misc kvm joydev input_leds i2c_smbus lpc_ich drm_kms_helper soundcore virtiofs mac_hid serio_raw sch_fq_codel dm_multipath efi_pstore drm nfnetlink dmi_sysfs qemu_fw_cfg ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 crct10dif_pclmul hid_generic crc32_pclmul ghash_clmulni_intel virtio_net usbhid ahci sha512_ssse3 net_failover psmouse virtio_scsi sha256_ssse3 virtio_rng libahci failover hid virtio_blk
[  200.393538]  aesni_intel
[  200.397304] [tdb] Close table 'cache0.tdb'
[  200.413234]  crypto_simd cryptd [last unloaded: tempesta_lib(OE)]
[  200.413249] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G           OE      6.12.12+ #225
[  200.413257] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[  200.413257] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.17.0-2-2 04/01/2014
[  200.413259] RIP: 0010:rcu_core+0x3d9/0x7c0
[  200.413264] Code: 02 00 00 80 7c 24 2f 00 0f 85 51 03 00 00 48 85 c0 0f 84 c4 02 00 00 84 d2 74 11 48 8b 7c 24 08 e8 ec 33 00 00 48 85 c0 75 02 <0f> 0b 41 f7 c6 00 02 00 00 74 05 e8 97 99 ff ff 48 8b 7c 24 08 e8
[  200.413270] RSP: 0018:ffffc1b080005f10 EFLAGS: 00010046
[  200.413277] RAX: 0000000000000000 RBX: ffff9c2a182373b8 RCX: 00000000802a0028
[  200.413278] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff9c2a182373b8
[  200.413279] RBP: ffff9c2a18237340 R08: ffff9c26c44f3de0 R09: 00000000802a0028
[  200.413279] R10: 00000000802a0028 R11: ffffffffbd74a628 R12: ffffffffbd60a940
[  200.413280] R13: 00000000000355c8 R14: 0000000000000246 R15: ffffffffffffffff
[  200.413283] FS:  0000000000000000(0000) GS:ffff9c2a18200000(0000) knlGS:0000000000000000
[  200.413285] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  200.413286] CR2: 000055aea04216a8 CR3: 0000000100a96000 CR4: 0000000000750ef0
[  200.413289] PKRU: 55555554
[  200.413291] Call Trace:
[  200.413305]  <IRQ>
[  200.413309]  ? __warn+0x89/0x140
[  200.413320]  ? rcu_core+0x3d9/0x7c0
[  200.413322]  ? report_bug+0x164/0x1a0
[  200.413344]  ? handle_bug+0x58/0xa0
[  200.413361]  ? exc_invalid_op+0x17/0x80
[  200.413363]  ? asm_exc_invalid_op+0x1a/0x20
[  200.413378]  ? rcu_core+0x3d9/0x7c0
[  200.413380]  ? rcu_core+0x269/0x7c0
[  200.413381]  handle_softirqs+0xd9/0x2e0
[  200.413400]  __irq_exit_rcu+0x63/0x80
[  200.413404]  sysvec_apic_timer_interrupt+0x71/0xa0
[  200.413421]  </IRQ>
[  200.413423]  <TASK>
[  200.413423]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  200.413436] RIP: 0010:pv_native_safe_halt+0xf/0x20
[  200.413440] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d 55 f5 39 00 fb f4 <c3> cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90
[  200.413440] RSP: 0018:ffffffffbd603e88 EFLAGS: 00000206
[  200.413442] RAX: ffff9c2a18200000 RBX: ffffffffbd60a940 RCX: 0000000000000000
[  200.413443] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000012dd64
[  200.413443] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000002
[  200.413443] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[  200.413444] R13: 0000000000000000 R14: ffffffffbd60a040 R15: 000000000008a000
[  200.413446]  ? ct_kernel_exit.constprop.0+0x5d/0x80
[  200.413448]  default_idle+0x9/0x20
[  200.413449]  default_idle_call+0x30/0x100
[  200.413451]  do_idle+0x1fb/0x240
[  200.413473]  cpu_startup_entry+0x29/0x40
[  200.413476]  rest_init+0xcc/0xe0
[  200.413477]  start_kernel+0x61b/0x8a0
[  200.413533]  x86_64_start_reservations+0x18/0x40
[  200.413560]  x86_64_start_kernel+0x7a/0x80
[  200.413563]  common_startup_64+0x13e/0x141
[  200.413584]  </TASK>
[  200.413586] ---[ end trace 0000000000000000 ]---
[  200.492377] ------------[ cut here ]------------
[  200.493101] WARNING: CPU: 0 PID: 1419 at kernel/rcu/tree.c:2628 rcu_core+0x70e/0x7c0
[  200.494063] Modules linked in: tempesta_fw(OE) tempesta_db(OE) tempesta_tls(OE) tempesta_lib(OE) veth intel_rapl_msr intel_rapl_common xt_conntrack xt_MASQUERADE nft_masq xfrm_user xfrm_algo nft_chain_nat nf_nat snd_hda_codec_generic nf_conntrack snd_hda_intel xt_addrtype nf_defrag_ipv6 nft_compat nf_defrag_ipv4 snd_intel_dspcfg bridge snd_intel_sdw_acpi snd_hda_codec stp llc snd_hda_core nf_tables overlay qxl snd_hwdep kvm_amd snd_pcm i2c_i801 drm_ttm_helper snd_timer ccp cfg80211 ttm i2c_mux snd binfmt_misc kvm joydev input_leds i2c_smbus lpc_ich drm_kms_helper soundcore virtiofs mac_hid serio_raw sch_fq_codel dm_multipath efi_pstore drm nfnetlink dmi_sysfs qemu_fw_cfg ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 crct10dif_pclmul hid_generic crc32_pclmul ghash_clmulni_intel virtio_net usbhid ahci sha512_ssse3 net_failover psmouse virtio_scsi sha256_ssse3 virtio_rng libahci failover hid virtio_blk
[  200.494102]  aesni_intel crypto_simd cryptd [last unloaded: tempesta_lib(OE)]
[  200.506042] CPU: 0 UID: 0 PID: 1419 Comm: kworker/0:9 Tainted: G        W  OE      6.12.12+ #225
[  200.507170] Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[  200.508032] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.17.0-2-2 04/01/2014
[  200.509228] Workqueue: dio/vda2 iomap_dio_complete_work
[  200.509976] RIP: 0010:rcu_core+0x70e/0x7c0
[  200.510571] Code: 8b 15 f6 98 b8 01 48 f7 da 48 85 d2 7e 33 48 8b 55 78 48 85 d2 0f 84 aa fc ff ff eb 31 48 8b 45 78 48 85 c0 0f 85 bc fc ff ff <0f> 0b e9 c6 fc ff ff 48 89 ce 4c 89 ff e8 a0 a4 eb 00 e9 d3 f9 ff
[  200.513008] RSP: 0018:ffffc1b080005f10 EFLAGS: 00010046
[  200.513732] RAX: 0000000000000000 RBX: ffff9c2a182373b8 RCX: ffff9c2a18237460
[  200.514667] RDX: ffffffffffffd8f0 RSI: 0000000000000001 RDI: ffff9c2a182373b8
[  200.515746] RBP: ffff9c2a18237340 R08: 0000000000000001 R09: 7fffffffffffffff
[  200.516730] R10: ffffffffbd6060c0 R11: 00000000000ecef5 R12: ffff9c26c4264000
[  200.517710] R13: 00000000000355c8 R14: 0000000000000246 R15: ffffffffffffffff
[  200.518768] FS:  0000000000000000(0000) GS:ffff9c2a18200000(0000) knlGS:0000000000000000
[  200.519894] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  200.520685] CR2: 000055aea04216a8 CR3: 0000000101ff2000 CR4: 0000000000750ef0
[  200.521624] PKRU: 55555554
[  200.522073] Call Trace:
[  200.522551]  <IRQ>
[  200.522977]  ? __warn+0x89/0x140
[  200.523538]  ? rcu_core+0x70e/0x7c0
[  200.524075]  ? report_bug+0x164/0x1a0
[  200.524634]  ? handle_bug+0x58/0xa0
[  200.525168]  ? exc_invalid_op+0x17/0x80
[  200.525742]  ? asm_exc_invalid_op+0x1a/0x20
[  200.526347]  ? rcu_core+0x70e/0x7c0
[  200.526883]  ? rcu_core+0x269/0x7c0
[  200.527509]  ? __hrtimer_run_queues+0x141/0x2a0
[  200.528166]  handle_softirqs+0xd9/0x2e0
[  200.528779]  __irq_exit_rcu+0x63/0x80
[  200.529383]  sysvec_apic_timer_interrupt+0x71/0xa0
[  200.530061]  </IRQ>
[  200.530440]  <TASK>
[  200.530827]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  200.531580] RIP: 0010:_raw_spin_unlock_irqrestore+0x1d/0x40
[  200.532338] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 e8 b2 0a 00 00 90 f7 c6 00 02 00 00 74 06 fb 0f 1f 44 00 00 <65> ff 0d a4 81 5b 43 74 05 c3 cc cc cc cc 0f 1f 44 00 00 c3 cc cc
[  200.534759] RSP: 0018:ffffc1b082567e58 EFLAGS: 00000206
[  200.535534] RAX: 0000000000000001 RBX: ffff9c26c0ce9f00 RCX: ffff9c26c867d368
[  200.536465] RDX: 0000000000000000 RSI: 0000000000000287 RDI: ffff9c26c867d360
[  200.537508] RBP: ffff9c26c3888800 R08: 0000000000000001 R09: 0000000000000002
[  200.538542] R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000004000
[  200.539603] R13: ffff9c26c4264000 R14: ffff9c270dd39160 R15: 0000000000000000
[  200.540532]  ? _raw_spin_unlock_irqrestore+0xe/0x40
[  200.541209]  aio_complete_rw+0xdb/0x1c0
[  200.541787]  process_one_work+0x16d/0x380
[  200.542380]  worker_thread+0x2cb/0x3e0
[  200.543032]  ? __pfx_worker_thread+0x20/0x20
[  200.543689]  kthread+0xcf/0x100
[  200.544178]  ? __pfx_kthread+0x20/0x20
[  200.544720]  ret_from_fork+0x31/0x60
[  200.545246]  ? __pfx_kthread+0x20/0x20
[  200.545780]  ret_from_fork_asm+0x22/0x60
[  200.546345]  </TASK>
[  200.546707] ---[ end trace 0000000000000000 ]---
[  200.565583] [tdb] Close table 'filter0.tdb'
[  200.566166] [tempesta fw] modules are stopped
[  200.693391] [tempesta fw] exiting...
[  200.805140] [tdb] Shutdown Tempesta DB

}

static void
cli_mem_release(struct percpu_ref *ref)
Contributor

It seems this function will be called twice: the first time from percpu_ref_call_confirm_rcu(), when the per-cpu ref is switched to atomic mode by percpu_ref_kill_and_confirm(), and the second time when the refcounter reaches zero in percpu_ref_put_many(). Please check it.
