Skip to content

Add Supervisor Counter Delegation prototype#1

Open
atulkharerivos wants to merge 1 commit intorivosinc:masterfrom
atulkharerivos:dev/sscdeleg
Open

Add Supervisor Counter Delegation prototype#1
atulkharerivos wants to merge 1 commit intorivosinc:masterfrom
atulkharerivos:dev/sscdeleg

Conversation

@atulkharerivos
Copy link

This adds a prototype to detect the features defined by the proposal. If enumerated, we append "sscountdeleg" to the ISA string.

This adds a prototype to detect the features defined by the proposal.
If enumerated, we append "sscountdeleg" to the ISA string.
KevinRSX pushed a commit to KevinRSX/opensbi that referenced this pull request Jun 16, 2023
…ready

When a boot hart executes sbi_hsm_hart_start() to start a secondary hart,
next_arg1, next_addr and next_mode for the latter are stored in the scratch
area after the state has been set to SBI_HSM_STATE_START_PENDING.

The secondary hart waits in the loop with wfi() in sbi_hsm_hart_wait() at
that time. However, "wfi" instruction is not guaranteed to wait for an
interrupt to be received by the hart, it is just a hint for the CPU.
According to RISC-V Privileged Architectures spec. v20211203, even an
implementation of "wfi" as "nop" is legal.

So, the secondary might leave the loop in sbi_hsm_hart_wait() as soon as
its state has been set to SBI_HSM_STATE_START_PENDING, even if it got no
IPI or it got an IPI unrelated to sbi_hsm_hart_start(). This could lead to
the following race condition when booting Linux, for example:

  Boot hart (#0)                        Secondary hart (rivosinc#1)
  runs Linux startup code               waits in sbi_hsm_hart_wait()

  sbi_ecall(SBI_EXT_HSM,
            SBI_EXT_HSM_HART_START,
            ...)
  enters sbi_hsm_hart_start()
  sets state of hart rivosinc#1 to START_PENDING
                                        leaves sbi_hsm_hart_wait()
                                        runs to the end of init_warmboot()
                                        returns to scratch->next_addr
                                        (next_addr can be garbage here)

  sets next_addr, etc. for hart rivosinc#1
  (no good: hart rivosinc#1 has already left)

  sends IPI to hart rivosinc#1
  (no good either)

If this happens, the secondary hart jumps to a wrong next_addr at the end
of init_warmboot(), which leads to a system hang or crash.

To reproduce the issue more reliably, one could add a delay in
sbi_hsm_hart_start() after setting the hart's state but before sending
IPI to that hart:

    hstate = atomic_cmpxchg(&hdata->state, SBI_HSM_STATE_STOPPED,
                            SBI_HSM_STATE_START_PENDING);
    ...
  + sbi_timer_mdelay(10);
    init_count = sbi_init_count(hartid);
    rscratch->next_arg1 = arg1;
    rscratch->next_addr = saddr;

The issue can be reproduced, for example, in a QEMU VM with '-machine virt'
and 2 or more CPUs, with Linux as the guest OS.

This patch moves writing of next_arg1, next_addr and next_mode for the
secondary hart before setting its state to SBI_HSM_STATE_START_PENDING.

In theory, it is possible that two or more harts enter sbi_hsm_hart_start()
for the same target hart simultaneously. To make sure the current hart has
exclusive access to the scratch area of the target hart at that point, a
per-hart 'start_ticket' is used. It is initially 0. The current hart tries
to acquire the ticket first (set it to 1) at the beginning of
sbi_hsm_hart_start() and only proceeds if it has successfully acquired it.

The target hart reads next_addr, etc., and then the releases the ticket
(sets it to 0) before calling sbi_hart_switch_mode(). This way, even if
some other hart manages to enter sbi_hsm_hart_start() after the ticket has
been released but before the target hart jumps to next_addr, it will not
cause problems.

atomic_cmpxchg() already has "acquire" semantics, among other things, so
no additional barriers are needed in hsm_start_ticket_acquire(). No hart
can perform or observe the update of *rscratch before setting of
'start_ticket' to 1.

atomic_write() only imposes ordering of writes, so an explicit barrier is
needed in hsm_start_ticket_release() to ensure its "release" semantics.
This guarantees that reads of scratch->next_addr, etc., in
sbi_hsm_hart_start_finish() cannot happen after 'start_ticket' has been
released.

Signed-off-by: Evgenii Shatokhin <e.shatokhin@yadro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
clementleger pushed a commit that referenced this pull request Feb 3, 2025
Instead of having one common FDT MPXY RPMI mailbox client drivers
for various RPMI service groups, split this driver into two parts:
1) Common MPXY RPMI mailbox client library
2) MPXY driver for RPMI clock service group

The above split enables having a separate MPXY driver for each
RPMI clock service group and #1 (above) will allow code sharing
between various MPXY RPMI drivers.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
clementleger pushed a commit that referenced this pull request Mar 7, 2025
Instead of having one common FDT MPXY RPMI mailbox client drivers
for various RPMI service groups, split this driver into two parts:
1) Common MPXY RPMI mailbox client library
2) MPXY driver for RPMI clock service group

The above split enables having a separate MPXY driver for each
RPMI clock service group and #1 (above) will allow code sharing
between various MPXY RPMI drivers.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
@meta-cla
Copy link

meta-cla bot commented Jan 14, 2026

Hi @atulkharerivos!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant