Conversation

@opsiff (Member) commented Jan 7, 2026

PR sync from: Peng Zhang zhangpeng362@huawei.com
https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/5H6WRELSEVUK7Z2YFHIMWEVCUHD33QFS/
From: ZhangPeng zhangpeng362@huawei.com

Since commit f1a7941 ("mm: convert mm's rss stats into
percpu_counter"), the rss_stats have been converted into percpu_counter,
which converts the error margin from (nr_threads * 64) to approximately
(nr_cpus ^ 2). However, the new percpu allocation in mm_init() causes a
performance regression on fork/exec/shell. Even after commit 14ef95b
("kernel/fork: group allocation/free of per-cpu counters for mm struct"),
the performance of fork/exec/shell is still poor compared to previous
kernel versions.

To mitigate performance regression, we delay the allocation of percpu
memory for rss_stats. Therefore, we convert mm's rss stats to use
percpu_counter atomic mode. For single-thread processes, rss_stat is in
atomic mode, which reduces the memory consumption and performance
regression caused by using percpu. For multiple-thread processes,
rss_stat is switched to the percpu mode to reduce the error margin.
We convert rss_stats from atomic mode to percpu mode only when the
second thread is created.

After lmbench test, we can get 2% ~ 4% performance improvement
for lmbench fork_proc/exec_proc/shell_proc and 6.7% performance
improvement for lmbench page_fault (before batch mode[1]).

The test results are as follows:

             base           base+revert        base+this patch

fork_proc    416.3ms        400.0ms  (3.9%)    398.6ms  (4.2%)
exec_proc    2095.9ms       2061.1ms (1.7%)    2047.7ms (2.3%)
shell_proc   3028.2ms       2954.7ms (2.4%)    2961.2ms (2.2%)
page_fault   0.3603ms       0.3358ms (6.8%)    0.3361ms (6.7%)

[1] https://lore.kernel.org/all/20240412064751.119015-1-wangkefeng.wang@huawei.com/

ChangeLog:

v2->v3:
- Remove patch 3.

v1->v2:
- Split patch 2 into two patches.

ZhangPeng (2):

Summary by Sourcery

Convert mm RSS accounting to use atomics by default and switch to percpu counters only for multithreaded processes to reduce fork/exec overhead while preserving accuracy where needed.

New Features:

  • Introduce dual-mode percpu counters that support both atomic and percpu-backed operation selected at runtime (a layout sketch follows below).
  • Add helpers for mm RSS counters to read, update, and sum values transparently across atomic and percpu modes, including tracepoint support.

Enhancements:

  • Delay allocation of percpu memory for mm RSS stats until a process becomes multithreaded to reduce memory usage and process creation latency.
  • Provide an API to switch existing counters from atomic mode to percpu mode and to cleanly destroy mm RSS counters.
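
As a rough sketch, the dual-mode layout described above could look like this on SMP (field grouping inferred from this summary and the reviewer's guide below; the actual patch may differ):

```c
/* Sketch only: SMP layout inferred from the PR description.
 * On !SMP the patch reuses the plain scalar instead of a union. */
struct percpu_counter {
	raw_spinlock_t lock;
	union {
		s64 count;               /* percpu mode: global base count */
		atomic64_t count_atomic; /* atomic mode: the whole counter */
	};
#ifdef CONFIG_HOTPLUG_CPU
	struct list_head list;           /* all instances of this type */
#endif
	s32 __percpu *counters;          /* NULL while in atomic mode */
};
```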

@sourcery-ai sourcery-ai bot commented Jan 7, 2026

Reviewer's Guide

This PR changes mm’s rss accounting to use a lightweight atomic mode by default and only switch to percpu counters when a process becomes multithreaded, reducing fork/exec overhead while preserving accurate accounting for multi-threaded workloads.

Sequence diagram for switching rss_stat from atomic mode to percpu mode on thread creation

sequenceDiagram
    actor Task
    participant KernelClone
    participant OldMM
    participant PerCpuCounter

    Task->>KernelClone: clone(clone_flags)
    KernelClone->>KernelClone: copy_mm(clone_flags, tsk)
    KernelClone->>OldMM: obtain oldmm
    alt clone_flags has CLONE_THREAD
        KernelClone->>OldMM: mm_counter_switch_to_pcpu()
        OldMM->>PerCpuCounter: percpu_counter_switch_to_pcpu_many(rss_stat, NR_MM_COUNTERS)
        alt rss_stat not initialized
            PerCpuCounter->>PerCpuCounter: __percpu_counter_init_many(..., switch_mode=true)
            PerCpuCounter-->>OldMM: allocate percpu memory, keep counts in count_atomic
        else rss_stat already percpu
            PerCpuCounter-->>OldMM: return 0 (no-op)
        end
        OldMM-->>KernelClone: success or -ENOMEM
        alt init failed
            KernelClone-->>Task: clone fails with -ENOMEM
        else init succeeded
            KernelClone->>KernelClone: proceed with CLONE_THREAD handling
            KernelClone-->>Task: new thread created sharing mm
        end
    else no CLONE_THREAD
        KernelClone->>KernelClone: keep rss_stat in atomic mode
        KernelClone-->>Task: process or thread without shared mm
    end

Sequence diagram for rss_stat updates using atomic mode or percpu mode

sequenceDiagram
    actor Task
    participant MM
    participant PerCpuCounter

    Note over Task,MM: Update path (e.g. page fault)
    Task->>MM: add_mm_counter(member, value)
    MM->>MM: fbc = &rss_stat[member]
    alt percpu_counter_initialized(fbc)
        MM->>PerCpuCounter: percpu_counter_add(fbc, value)
    else atomic mode
        MM->>PerCpuCounter: percpu_counter_atomic_add(fbc, value)
    end
    MM->>MM: mm_trace_rss_stat(mm, member)

    Note over Task,MM: Read path (e.g. stats or tracing)
    Task->>MM: mm_counter_sum(member)
    MM->>MM: fbc = &rss_stat[member]
    alt percpu_counter_initialized(fbc)
        MM->>PerCpuCounter: percpu_counter_sum(fbc)
        PerCpuCounter-->>MM: aggregated percpu value
    else atomic mode
        MM->>PerCpuCounter: percpu_counter_atomic_read(fbc)
        PerCpuCounter-->>MM: atomic count
    end
    MM-->>Task: rss value
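In C, the two paths in these diagrams might look like the following sketch; percpu_counter_atomic_add() and percpu_counter_atomic_read() are the helpers this PR introduces, and their exact signatures here are assumptions taken from the diagrams:

```c
/* Sketch of the update and read paths shown in the diagrams above. */
static inline void add_mm_counter(struct mm_struct *mm, int member, long value)
{
	struct percpu_counter *fbc = &mm->rss_stat[member];

	if (percpu_counter_initialized(fbc))
		percpu_counter_add(fbc, value);         /* percpu mode */
	else
		percpu_counter_atomic_add(fbc, value);  /* atomic mode */

	mm_trace_rss_stat(mm, member);
}

static inline s64 mm_counter_sum(struct mm_struct *mm, int member)
{
	struct percpu_counter *fbc = &mm->rss_stat[member];

	if (percpu_counter_initialized(fbc))
		return percpu_counter_sum(fbc);  /* fold all per-CPU deltas */

	return percpu_counter_atomic_read(fbc);
}
```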

Class diagram for percpu_counter atomic/percpu modes and mm rss_stat helpers

classDiagram
    class percpu_counter {
        +raw_spinlock_t lock
        +s64 count
        +atomic64_t count_atomic
        +void __percpu* counters
        +bool percpu_counter_initialized()
        +s64 percpu_counter_atomic_read()
        +void percpu_counter_atomic_add(s64 amount)
        +int percpu_counter_switch_to_pcpu_many(u32 nr_counters)
    }

    class mm_struct {
        +percpu_counter rss_stat[NR_MM_COUNTERS]
        +unsigned long get_mm_counter(int member)
        +unsigned long get_mm_counter_sum(int member)
        +void add_mm_counter(int member, long value)
        +void inc_mm_counter(int member)
        +void dec_mm_counter(int member)
        +s64 mm_counter_sum(int member)
        +s64 mm_counter_sum_positive(int member)
        +int mm_counter_switch_to_pcpu()
        +void mm_counter_destroy()
    }

    class fork_copy_mm_path {
        +int copy_mm(unsigned long clone_flags, struct task_struct* tsk)
    }

    percpu_counter <.. mm_struct : used_by
    mm_struct "NR_MM_COUNTERS" o-- percpu_counter : rss_stat
    mm_struct <.. fork_copy_mm_path : operates_on

File-Level Changes

Extend percpu_counter to support an atomic mode alongside percpu mode and provide helpers to switch to percpu lazily.
  • Modify struct percpu_counter to hold either a plain s64 or an atomic64_t counter via a union on SMP, and reuse the scalar on !SMP
  • Extend __percpu_counter_init_many to accept a switch_mode flag to avoid reinitializing the base count when switching from atomic to percpu mode
  • Add percpu_counter_atomic_read/add helpers with SMP and !SMP implementations (see the sketch after this change's file list)
  • Introduce percpu_counter_switch_to_pcpu_many to lazily allocate percpu storage and switch counters from atomic to percpu mode with GFP_ATOMIC allocation and irq/preempt protection
include/linux/percpu_counter.h
lib/percpu_counter.c
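
A possible shape for the atomic-mode helpers named above (SMP case; the names come from the bullet list, the bodies are an assumption):

```c
/* Sketch: atomic-mode accessors operating on the union's atomic member. */
static inline s64 percpu_counter_atomic_read(struct percpu_counter *fbc)
{
	return atomic64_read(&fbc->count_atomic);
}

static inline void percpu_counter_atomic_add(struct percpu_counter *fbc,
					     s64 amount)
{
	atomic64_add(amount, &fbc->count_atomic);
}
```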
Change mm rss accounting helpers to use atomic mode until percpu storage is allocated, and add a lazy switch API plus destroy helper for mm rss counters.
  • Update get_mm_counter/add_mm_counter/dec_mm_counter to branch on percpu_counter_initialized and use atomic helpers when in atomic mode
  • Introduce mm_counter_sum and mm_counter_sum_positive wrappers that choose between percpu and atomic accessors based on initialization state
  • Add mm_counter_switch_to_pcpu to switch all rss_stat counters to percpu mode and mm_counter_destroy to destroy them when mm is torn down
include/linux/mm.h
Integrate the new rss_stat atomic/percpu behavior into mm lifecycle, fork, and tracing code.
  • Stop preallocating percpu rss_stat counters in mm_init so new mms start in atomic mode
  • On fork, when CLONE_THREAD is set, call mm_counter_switch_to_pcpu on the shared mm so multithreaded processes use percpu rss stats, failing fork with -ENOMEM on allocation failure (see the copy_mm() sketch after this file list)
  • Replace direct percpu rss_stat sums in check_mm and rss_stat trace event with mm_counter_sum/mm_counter_sum_positive so they work in both atomic and percpu modes
  • Use mm_counter_destroy instead of percpu_counter_destroy_many when dropping mm to match the new lifecycle
kernel/fork.c
include/trace/events/kmem.h
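
Roughly, the fork-path hook described above could land in copy_mm() like this (a simplified sketch of the function; the real patch's hunk may differ):

```c
static int copy_mm(unsigned long clone_flags, struct task_struct *tsk)
{
	struct mm_struct *mm, *oldmm = current->mm;

	tsk->mm = NULL;
	tsk->active_mm = NULL;
	if (!oldmm)
		return 0;

	if (clone_flags & CLONE_VM) {
		/*
		 * A second thread is about to share this mm: upgrade
		 * rss_stat from atomic to percpu mode, failing the
		 * clone with -ENOMEM if the percpu allocation fails.
		 */
		if ((clone_flags & CLONE_THREAD) &&
		    mm_counter_switch_to_pcpu(oldmm))
			return -ENOMEM;

		mmget(oldmm);
		mm = oldmm;
	} else {
		mm = dup_mm(tsk, current->mm);
		if (!mm)
			return -ENOMEM;
	}

	tsk->mm = mm;
	tsk->active_mm = mm;
	return 0;
}
```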


@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 3 issues, and left some high level feedback:

  • The new percpu_counter_switch_to_pcpu_many() uses a bool for ret but returns it as int while intending to propagate -ENOMEM; this will collapse all non‑zero failures to 1 and break callers relying on specific error codes, so ret should be an int and handled consistently with the function’s return contract.
  • The kerneldoc comment on percpu_counter_switch_to_pcpu_many() still talks about switching to/from “atomic mode” in a confusing way (and has typos like perpcu), which makes it hard to understand the intended behavior; updating the wording to clearly describe the atomic→percpu transition would improve maintainability.
  • mm_counter_destroy() always calls percpu_counter_destroy_many() even when the mm’s rss_stat counters were never switched to percpu mode; please double‑check whether destroy_many() is safe when counters == NULL for all entries, and if not, gate the destroy on percpu_counter_initialized().
Individual Comments

Comment 1

Location: lib/percpu_counter.c:329

Code context:

+{
+	static struct lock_class_key __key;
+	unsigned long flags;
+	bool ret = 0;
+
+	if (percpu_counter_initialized(fbc))

**issue (bug_risk):** The use of `bool ret` loses the original error code from `__percpu_counter_init_many` and misrepresents `-ENOMEM` as `1`.

`__percpu_counter_init_many` returns an `int` status (0 or negative errno), but here its result is stored in a `bool` and then returned from an `int`-returning function. This converts `-ENOMEM` to `1`, breaking callers that rely on negative error codes (e.g. `copy_mm()` expects `-ENOMEM`). Keep `ret` as an `int` and return it directly:

```c
	int ret = 0;
	...
	if (likely(!percpu_counter_initialized(fbc)))
		ret = __percpu_counter_init_many(...);
	...
	return ret;
```

Comment 2

Location: lib/percpu_counter.c:211

Code context:

 		INIT_LIST_HEAD(&fbc[i].list);
 #endif
-		fbc[i].count = amount;
+		if (likely(!switch_mode))
+			fbc[i].count = amount;
 		fbc[i].counters = (void __percpu *)counters + i * counter_size;

**issue (bug_risk):** Switching from atomic to percpu mode does not preserve the existing atomic count into the percpu counter state.

When `switch_mode` is true, this path skips initializing `fbc[i].count`, while `amount` is 0 and the percpu memory is zeroed via `__GFP_ZERO`. The pre-existing value in `count_atomic` is therefore lost and subsequent `percpu_counter_*` reads incorrectly return 0.

This path should migrate the current atomic value into the percpu state—either by reading `percpu_counter_atomic_read(&fbc[i])` and assigning it to `fbc[i].count` after allocation (keeping per-cpu shards at 0), or by passing that value as `amount` and using the normal initialization instead of skipping it when `switch_mode` is set.

Comment 3

Location: lib/percpu_counter.c:316-323

Code context:

 EXPORT_SYMBOL(__percpu_counter_compare);

+/*
+ * percpu_counter_switch_to_pcpu_many: Converts struct percpu_counters from
+ * atomic mode to percpu mode.
+ *
+ * Return: 0 if percpu_counter is already in atomic mode or successfully
+ * switched to atomic mode; -ENOMEM if perpcu memory allocation fails,
+ * perpcu_counter is still in atomic mode.
+ */

**suggestion:** The function comment mixes up atomic vs percpu modes and contains typos, which can mislead callers.

The return description is inverted: this helper converts from atomic to percpu and returns 0 when already in percpu mode or successfully switched to percpu, not atomic. Also, "perpcu"/"perpcu_counter" are misspelled. Please correct the mode names and typos to accurately reflect the behavior and avoid confusing callers.

```suggestion
/*
 * percpu_counter_switch_to_pcpu_many: Converts struct percpu_counters from
 * atomic mode to percpu mode.
 *
 * Return: 0 if percpu_counter is already in percpu mode or successfully
 * switched to percpu mode; -ENOMEM if percpu memory allocation fails, in
 * which case percpu_counter remains in atomic mode.
 */
```


Copilot AI left a comment

Pull request overview

This pull request implements a performance optimization for memory management RSS (Resident Set Size) statistics by introducing a dual-mode approach for percpu counters. The change aims to reduce the 2-4% performance regression in fork/exec/shell operations introduced by earlier commits that converted RSS stats to always use percpu counters.

Key changes:

  • Introduce atomic mode for percpu counters as an alternative to percpu-backed mode, selected at runtime
  • Delay allocation of percpu memory for RSS stats until a process creates its second thread
  • Add helper functions to transparently read, update, and switch between atomic and percpu modes

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 16 comments.

Summary per file:

lib/percpu_counter.c: Adds switch_mode parameter to initialization and implements percpu_counter_switch_to_pcpu_many for runtime mode switching.
include/linux/percpu_counter.h: Defines the union for atomic/percpu count storage and declares helper functions for atomic operations and mode switching.
include/linux/mm.h: Updates mm counter accessors to check the mode and use the appropriate read/write operations; adds wrapper functions for switching and destruction.
kernel/fork.c: Removes eager percpu allocation from mm_init, adds lazy switching to percpu mode when creating threads, and updates cleanup to use the new wrapper.
include/trace/events/kmem.h: Updates RSS stat tracing to use the new mm_counter_sum_positive helper.
Comments suppressed due to low confidence (1)

include/linux/mm.h:2657

  • The get_mm_counter_sum function was not updated to handle the new dual-mode counters. It always calls percpu_counter_sum_positive on the counter without checking if it's in atomic or percpu mode. For atomic mode counters, this will attempt to access uninitialized percpu memory. This function should be updated to check the mode similar to the new mm_counter_sum_positive helper.
static inline unsigned long get_mm_counter_sum(struct mm_struct *mm, int member)
{
	return percpu_counter_sum_positive(&mm->rss_stat[member]);
}


 #endif
-		fbc[i].count = amount;
+		if (likely(!switch_mode))
+			fbc[i].count = amount;
Copilot AI commented Jan 7, 2026

When switching from atomic mode to percpu mode, the existing value in count_atomic needs to be migrated to the percpu counter's count field. Currently, when switch_mode is true, fbc[i].count is not initialized at all, which means the accumulated atomic count value is lost. The function should read the current atomic64_read(&fbc[i].count_atomic) value and assign it to fbc[i].count before switching modes to preserve the accumulated RSS statistics.

Suggested change:

 		if (likely(!switch_mode))
 			fbc[i].count = amount;
+		else
+			fbc[i].count = atomic64_read(&fbc[i].count_atomic);

@opsiff (Member, Author) replied:

count and count_atomic are members of a union; they overlap in memory, so this operation is not needed.
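
A user-space model of that point (stand-in types, not kernel code): because the two members share one union, a value accumulated through the atomic member is already in place when percpu mode later reads the plain member as its base count.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel struct: count and count_atomic
 * alias the same storage, as in the patch's union. */
struct counter {
	union {
		int64_t count;                /* "percpu mode" base count */
		_Atomic int64_t count_atomic; /* "atomic mode" counter */
	};
};

int main(void)
{
	struct counter c = { .count = 0 };

	atomic_fetch_add(&c.count_atomic, 42);  /* atomic-mode updates */

	/* Same bytes, so no migration is needed at the mode switch. */
	printf("count after switch: %lld\n", (long long)c.count);
	return 0;
}
```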

Comment on lines +331 to +339
	if (percpu_counter_initialized(fbc))
		return 0;

	preempt_disable();
	local_irq_save(flags);
	if (likely(!percpu_counter_initialized(fbc)))
		ret = __percpu_counter_init_many(fbc, 0,
				GFP_ATOMIC|__GFP_NOWARN|__GFP_ZERO,
				nr_counters, &__key, true);
Copilot AI commented Jan 7, 2026

There is a potential race condition here. The check at line 331 and line 336 both use percpu_counter_initialized(fbc) without proper synchronization between them. If two threads call this function simultaneously, both could see the counter as uninitialized and attempt to initialize it, leading to a memory leak or inconsistent state. Consider using atomic compare-and-swap or ensuring this function is only called under appropriate locks.

@opsiff (Member, Author) replied:

Consider the call sites: percpu_counter_initialized only changes at the first initialization, which happens in fork. The process is always still single-threaded when it enters this initialization, and can only become multithreaded afterwards.

Comment on lines 2648 to 2652

	if (percpu_counter_initialized(fbc))
		return percpu_counter_read_positive(fbc);

	return percpu_counter_atomic_read(fbc);
Copilot AI commented Jan 7, 2026

When the counter is in atomic mode, percpu_counter_atomic_read returns an s64 which can be negative. However, this function returns unsigned long, and there's no check to ensure non-negative values like percpu_counter_read_positive does. This could lead to incorrect behavior if the atomic counter has a negative value. Consider using max(0, percpu_counter_atomic_read(fbc)) or a similar approach to ensure non-negative results.

Suggested change:

-	if (percpu_counter_initialized(fbc))
-		return percpu_counter_read_positive(fbc);
-	return percpu_counter_atomic_read(fbc);
+	s64 val;
+
+	if (percpu_counter_initialized(fbc))
+		return percpu_counter_read_positive(fbc);
+
+	val = percpu_counter_atomic_read(fbc);
+	if (val < 0)
+		val = 0;
+	return (unsigned long)val;

@opsiff (Member, Author) replied:

This was introduced in 34e5523 ("mm: avoid false sharing of mm_counter"); I will take it in the next commit.

ZhangPeng added 2 commits January 7, 2026 11:32
maillist inclusion
category: performance
bugzilla: https://gitee.com/openeuler/kernel/issues/I9IA1I
CVE: NA

Reference: https://lore.kernel.org/all/20240418142008.2775308-1-zhangpeng362@huawei.com/

--------------------------------

Depending on whether counters is NULL, we can support two modes:
atomic mode and percpu mode. We implement both modes by grouping
the s64 count and atomic64_t count_atomic in a union. At the same time,
we create the interface for adding and reading in atomic mode and for
switching atomic mode to percpu mode.

Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
maillist inclusion
category: performance
bugzilla: https://gitee.com/openeuler/kernel/issues/I9IA1I
CVE: NA

Reference: https://lore.kernel.org/all/20240418142008.2775308-1-zhangpeng362@huawei.com/

--------------------------------

Since commit f1a7941 ("mm: convert mm's rss stats into
percpu_counter"), the rss_stats have been converted into percpu_counter,
which converts the error margin from (nr_threads * 64) to approximately
(nr_cpus ^ 2). However, the new percpu allocation in mm_init() causes a
performance regression on fork/exec/shell. Even after commit 14ef95b
("kernel/fork: group allocation/free of per-cpu counters for mm struct"),
the performance of fork/exec/shell is still poor compared to previous
kernel versions.

To mitigate performance regression, we delay the allocation of percpu
memory for rss_stats. Therefore, we convert mm's rss stats to use
percpu_counter atomic mode. For single-thread processes, rss_stat is in
atomic mode, which reduces the memory consumption and performance
regression caused by using percpu. For multiple-thread processes,
rss_stat is switched to the percpu mode to reduce the error margin.
We convert rss_stats from atomic mode to percpu mode only when the
second thread is created.

After lmbench test, we can get 2% ~ 4% performance improvement
for lmbench fork_proc/exec_proc/shell_proc and 6.7% performance
improvement for lmbench page_fault (before batch mode[1]).

The test results are as follows:

             base           base+revert        base+this patch

fork_proc    416.3ms        400.0ms  (3.9%)    398.6ms  (4.2%)
exec_proc    2095.9ms       2061.1ms (1.7%)    2047.7ms (2.3%)
shell_proc   3028.2ms       2954.7ms (2.4%)    2961.2ms (2.2%)
page_fault   0.3603ms       0.3358ms (6.8%)    0.3361ms (6.7%)

[1] https://lore.kernel.org/all/20240412064751.119015-1-wangkefeng.wang@huawei.com/

Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
@opsiff opsiff force-pushed the linux-6.12.y-2026-01-07-mm-counter branch from 6297058 to 6d79ca4 Compare January 7, 2026 03:43
deepin inclusion
category: bugfix

Restore the logic introduced in 34e5523 ("mm: avoid false sharing of mm_counter")

Fixes: 6d79ca4 ("openeuler: mm: convert mm's rss stats to use atomic mode")
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
@deepin-ci-robot commented:

deepin pr auto review

Here is a detailed review of this diff:

  1. Code logic and architecture:
  • Two counter modes are introduced, atomic mode and percpu mode; this is a good design optimization.
  • Single-threaded processes use atomic mode and multithreaded processes use percpu mode; this dynamic switching mechanism is sound.
  • A union lets count and count_atomic share the same memory, saving space.
  2. Performance:
  • Atomic mode avoids the overhead of percpu counters in single-threaded scenarios.
  • Percpu mode reduces lock contention in multithreaded scenarios.
  • percpu_counter_switch_to_pcpu_many uses the GFP_ATOMIC|__GFP_NOWARN|__GFP_ZERO flags, ensuring safe allocation in atomic context.
  3. Safety:
  • get_mm_counter handles negative values specially, returning 0 instead of a negative number, which avoids potential integer-overflow problems.
  • preempt_disable and local_irq_save protect the critical section, ensuring concurrency safety.
  4. Potential improvements:
  • In percpu_counter_switch_to_pcpu_many, consider adding more detailed error-handling comments describing the concrete impact of returning -ENOMEM.
  • The return-value check of mm_counter_switch_to_pcpu may need stricter handling, since a failed switch can affect subsequent operations.
  • Consider adding cleanup logic for atomic mode in mm_counter_destroy.
  5. Code quality:
  • Function names are clear; for example, mm_counter_sum_positive accurately expresses what the function does.
  • Comments are thorough, especially around key data structures and functions.
  • The code is well organized, with related functionality grouped together.
  6. Other suggestions:
  • Consider adding performance monitoring to track the frequency and cost of switching between the two modes.
  • Consider adding a debug mode to validate the correctness of mode switching during development.
  • Document the concrete criteria for choosing atomic mode versus percpu mode.

Overall, this is a good improvement: the dual-mode mechanism optimizes performance across different scenarios. The implementation takes safety into account, is well commented, and is clearly structured. The main things to watch are the completeness of the error handling and the treatment of edge cases.

@Avenger-285714 (Member) commented:

/approve

@deepin-ci-robot commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Avenger-285714

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details: Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

include/linux/mm.h:2664

  • The get_mm_counter_sum function does not handle atomic mode. It directly calls percpu_counter_sum_positive without checking if the counter is in atomic or percpu mode. This will cause incorrect behavior when counters are in atomic mode. This function should follow the same pattern as mm_counter_sum_positive (lines 2703-2711) which checks the mode first.
static inline unsigned long get_mm_counter_sum(struct mm_struct *mm, int member)
{
	return percpu_counter_sum_positive(&mm->rss_stat[member]);
}


Comment on lines +334 to +341
	preempt_disable();
	local_irq_save(flags);
	if (likely(!percpu_counter_initialized(fbc)))
		ret = __percpu_counter_init_many(fbc, 0,
				GFP_ATOMIC|__GFP_NOWARN|__GFP_ZERO,
				nr_counters, &__key, true);
	local_irq_restore(flags);
	preempt_enable();
Copilot AI commented Jan 14, 2026

The mode switching in percpu_counter_switch_to_pcpu_many has inadequate synchronization. While it disables preemption and interrupts, there's no memory barrier after the percpu allocation and before other CPUs can observe the new counters pointer. This could lead to other CPUs seeing a non-NULL counters pointer before the percpu memory is fully initialized. An smp_wmb() should be added after successful initialization and smp_rmb() in code paths that check percpu_counter_initialized().

	return ret;
}
EXPORT_SYMBOL(__percpu_counter_sum);

Copilot AI commented Jan 14, 2026

The switch_mode parameter in __percpu_counter_init_many lacks documentation. When switch_mode is true, the function is being used to convert an existing atomic-mode counter to percpu mode, and the count value should not be reset. A comment should be added to document this parameter and its purpose.

Suggested change:

/*
 * Initialize an array of percpu counters.
 *
 * @switch_mode: when true, the caller is converting an existing
 * atomic-mode counter to percpu mode. In that case, the existing
 * global count in @fbc[i].count must be preserved and therefore
 * is not reset to @amount.
 */

Comment on lines +328 to +341
	unsigned long flags;
	int ret = 0;

	if (percpu_counter_initialized(fbc))
		return 0;

	preempt_disable();
	local_irq_save(flags);
	if (likely(!percpu_counter_initialized(fbc)))
		ret = __percpu_counter_init_many(fbc, 0,
				GFP_ATOMIC|__GFP_NOWARN|__GFP_ZERO,
				nr_counters, &__key, true);
	local_irq_restore(flags);
	preempt_enable();
Copilot AI commented Jan 14, 2026

Using local_irq_save with GFP_ATOMIC for percpu allocation is problematic. The allocation uses GFP_ATOMIC which can fail more easily than GFP_KERNEL. Since this is called from copy_mm which is in process context and can sleep, a better approach would be to do the allocation without disabling interrupts (using GFP_KERNEL), then use a lock to atomically check if initialization is still needed and set the counters pointer. The current approach makes memory allocation more likely to fail and returns -ENOMEM to the thread creation, which is unnecessarily fragile.

Suggested change
unsigned long flags;
int ret = 0;
if (percpu_counter_initialized(fbc))
return 0;
preempt_disable();
local_irq_save(flags);
if (likely(!percpu_counter_initialized(fbc)))
ret = __percpu_counter_init_many(fbc, 0,
GFP_ATOMIC|__GFP_NOWARN|__GFP_ZERO,
nr_counters, &__key, true);
local_irq_restore(flags);
preempt_enable();
int ret = 0;
if (percpu_counter_initialized(fbc))
return 0;
if (likely(!percpu_counter_initialized(fbc)))
ret = __percpu_counter_init_many(fbc, 0,
GFP_KERNEL|__GFP_NOWARN|__GFP_ZERO,
nr_counters, &__key, true);

Comment on lines +324 to +339
int percpu_counter_switch_to_pcpu_many(struct percpu_counter *fbc,
				       u32 nr_counters)
{
	static struct lock_class_key __key;
	unsigned long flags;
	int ret = 0;

	if (percpu_counter_initialized(fbc))
		return 0;

	preempt_disable();
	local_irq_save(flags);
	if (likely(!percpu_counter_initialized(fbc)))
		ret = __percpu_counter_init_many(fbc, 0,
				GFP_ATOMIC|__GFP_NOWARN|__GFP_ZERO,
				nr_counters, &__key, true);
Copilot AI commented Jan 14, 2026

This code uses inconsistent indentation with tabs mixed in. The function parameters and code within the function should use consistent indentation matching the rest of the file (tabs for indentation).
