@fangyu0809
The earlier vector crypto PR (#183) may introduce misaligned-access problems, so the following upstream commit needs to be backported:
torvalds/linux@1cd5bb6

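Backporting that commit requires resolving several dependencies, so this PR collects the required patches. Besides addressing the potential vector crypto problem, these commits also add misaligned access emulation, misaligned access probing, and related features to the RISC-V kernel. For details, see: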

#201

WangJia-UR and others added 30 commits December 30, 2025 23:47
community inclusion
category: bugfix
bugzilla: RVCK-Project#137

--------------------------------

Fix two undefined reference errors that occur when ARCH_SOPHGO is disabled:
1. 'cdns_pcie_get_parent_irq_domain' in pcie-cadence-sophgo.o
2. 'sdhci_send_command' in sdhci-sophgo.o

These errors were preventing successful compilation when building without
Sophgo platform support. The fix ensures proper conditional compilation
and reference resolution regardless of the ARCH_SOPHGO configuration state.

Signed-off-by: Jia Wang <wangjia@ultrarisc.com>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
…tform via CONFIG_ARCH_SOPHGO"

community inclusion
category: bugfix
bugzilla: RVCK-Project#137

--------------------------------

This reverts commit b74bac2.

check_vendor_id() is defined in pcie-cadence-sophgo.c, so
it should be controlled by the CONFIG_PCIE_CADENCE_SOPHGO
macro rather than CONFIG_ARCH_SOPHGO. In conjunction
with the next commit, it is corrected to
CONFIG_PCIE_CADENCE_SOPHGO.

Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
community inclusion
category: bugfix
bugzilla: RVCK-Project#137

--------------------------------

check_vendor_id() is defined in pcie-cadence-sophgo.c.
When the Sophgo platform is not enabled, compilation
will fail. Use the CONFIG_PCIE_CADENCE_SOPHGO macro to
control it, ensuring successful compilation even when
the platform is disabled.
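
A common kernel pattern for this kind of fix is to guard the declaration with the driver's own Kconfig symbol and provide an inline stub otherwise. The sketch below is only an illustration of that pattern: the signature of check_vendor_id(), the stub's return value, and the needs_sophgo_quirk() caller are assumptions, not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical signature; the real prototype lives in the driver. */
#ifdef CONFIG_PCIE_CADENCE_SOPHGO
bool check_vendor_id(uint16_t vendor, uint16_t device);
#else
/* Stub used when the Sophgo Cadence PCIe driver is not built. */
static inline bool check_vendor_id(uint16_t vendor, uint16_t device)
{
	(void)vendor;
	(void)device;
	return false;
}
#endif

/* Hypothetical caller: builds and links the same way in both configurations. */
static inline bool needs_sophgo_quirk(uint16_t vendor, uint16_t device)
{
	return check_vendor_id(vendor, device);
}
```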

Signed-off-by: Jia Wang <wangjia@ultrarisc.com>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
community inclusion
category: feature
bugzilla: RVCK-Project#71

-------------------------------------------------

Add PLIC early init support and remove invalid
timer nodes in dp1000.dts.

Signed-off-by: Jia Wang <wangjia@ultrarisc.com>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
community inclusion
category: feature
bugzilla: RVCK-Project#71

-------------------------------------------------

1. Add ARCH_ULTRARISC and PINCTRL_ULTRARISC_DP1000 support
2. MODVERSIONS is selected by default and does not require
explicit configuration.
3. Set CMA_SIZE_MBYTES to 256; otherwise, the system may
encounter errors during initialization:

```
[    5.206815] cma: cma_alloc: reserved: alloc failed, req-size: 2 pages, ret: -12
[    5.243730] cma: cma_alloc: reserved: alloc failed, req-size: 2 pages, ret: -12

```

On the UltraRISC M-ATX, CMA memory usage:

```
$ sudo cat /proc/meminfo | grep -i cma
CmaTotal:         262144 kB
CmaFree:          184504 kB
```

4. Disable DEFERRED_STRUCT_PAGE_INIT; otherwise, the system
may panic during initialization:

```
[    0.000000] Falling back to deprecated "riscv,isa"
[    0.000000] riscv: base ISA extensions acdfhim
[    0.000000] riscv: ELF capabilities acdfim
[    0.000000] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[    0.000000] Oops [#1]
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Tainted: G                T  6.6.103+ RVCK-Project#92
[    0.000000] Hardware name: ultrarisc,dp1000 (DT)
[    0.000000] epc : __patch_insn_write+0x1a2/0x30e
[    0.000000]  ra : __patch_insn_write+0x106/0x30e
[    0.000000] epc : ffffffff8000725c ra : ffffffff800071c0 sp : ffffffff81c03cb0
[    0.000000]  gp : ffffffff81e30150 tp : ffffffff81c121c0 t0 : 45203a7663736972
[    0.000000]  t1 : 0000000000000072 t2 : 4c45203a76637369 s0 : ffffffff81c03d00
[    0.000000]  s1 : ffffaf83f27a0100 a0 : 0000000000000001 a1 : 0000000000000001
[    0.000000]  a2 : 0000000000476804 a3 : 000000000001feff a4 : 0000000000001fe8
[    0.000000]  a5 : 0000000000000000 a6 : 0000000000000006 a7 : 0000000000000010
[    0.000000]  s2 : ffffffff8000415e s3 : 0000000000000004 s4 : ffffffff80004418
[    0.000000]  s5 : 0000000000000162 s6 : 000000000000015e s7 : ffffffff81e3c5c0
[    0.000000]  s8 : ffffffff8000415e s9 : ffffffffff16c7cc s10: 0000000000000018
[    0.000000]  s11: ffffaf83ffa0f7c0 t3 : ffffffff81e4ea77 t4 : ffffffff81e4ea77
[    0.000000]  t5 : ffffffff81e4ea78 t6 : ffffffff81c03a78
[    0.000000] status: 0000000200000100 badaddr: 0000000000000000 cause: 000000000000000d
[    0.000000] [<ffffffff8000725c>] __patch_insn_write+0x1a2/0x30e
[    0.000000] [<ffffffff80007480>] patch_text_nosync+0x4c/0x8a
[    0.000000] [<ffffffff80003eb0>] riscv_cpufeature_patch_func+0xcc/0x12a
[    0.000000] [<ffffffff80003126>] _apply_alternatives+0x90/0x9c
[    0.000000] [<ffffffff80c03394>] apply_boot_alternatives+0x32/0x11a
[    0.000000] [<ffffffff80c04cba>] setup_arch+0x5f4/0x698
[    0.000000] [<ffffffff80c0085c>] start_kernel+0x92/0x7e4
[    0.000000] Code: 9359 cb89 070e 97ba 639c c789 f693 07f6 0696 97b6 (639c) 0513
[    0.000000] ---[ end trace 0000000000000000 ]---
[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

```

Signed-off-by: Jia Wang <wangjia@ultrarisc.com>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.8-rc3
commit e33758f
category: feature
bugzilla: RVCK-Project#127

--------------------------------

For code unification, add emit_sextw wrapper to unify all the 32-bit
sign-extension operations.
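
As a rough model of what a 32-bit sign-extension computes on RV64 (the effect of `addiw rd, rs, 0`, i.e. the sext.w pseudo-instruction), consider the sketch below. The helper name sext_w() is made up for illustration and is not the JIT's emit_sextw().

```c
#include <stdint.h>

/*
 * Keep the low 32 bits of a 64-bit register value and replicate bit 31
 * into bits 63..32. The uint32_t -> int32_t conversion relies on the
 * two's-complement behavior that mainstream compilers provide.
 */
static inline uint64_t sext_w(uint64_t reg)
{
	return (uint64_t)(int64_t)(int32_t)(uint32_t)reg;
}
```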

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.8-rc3
commit 914c7a5
category: feature
bugzilla: RVCK-Project#127

--------------------------------

For code unification, add emit_zextw wrapper to unify all the 32-bit
zero-extension operations.
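
Without bit-manipulation extensions, a 32-bit zero-extension on RV64 is typically a shift pair (`slli rd, rs, 32` followed by `srli rd, rd, 32`). The sketch below models that effect in plain C; zext_w() is an illustrative name, and the assumption that the wrapper emits exactly this pair should be checked against the patch.

```c
#include <stdint.h>

/* Clear bits 63..32 by shifting them out and back in as zeros. */
static inline uint64_t zext_w(uint64_t reg)
{
	return (reg << 32) >> 32;
}
```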

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.8-rc3
commit 361db44
category: feature
bugzilla: RVCK-Project#127

--------------------------------

There are many extension helpers in the current branch instructions, and
the implementation is a bit complicated. We simplify this logic through
two simple extension helpers with alternate register.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.8-rc3
commit 647b93f
category: feature
bugzilla: RVCK-Project#127

--------------------------------

Add necessary Zbb instructions introduced by [0] to reduce code size and
improve performance of RV64 JIT. Meanwhile, a runtime detection helper is
added to check whether the CPU supports Zbb instructions.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.8-rc3
commit 519fb72
category: feature
bugzilla: RVCK-Project#127

--------------------------------

Add 8-bit and 16-bit sign-extension wrappers with Zbb support to optimize
sign-extension mov instructions.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.8-rc3
commit 06a33d0
category: feature
bugzilla: RVCK-Project#127

--------------------------------

Optimize bswap instructions by using the Zbb rev8 instruction combined with
the srli instruction. Also optimize 16-bit zero-extension with Zbb support.
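
To see why rev8 plus a right shift yields a byte swap of a narrower value, here is a small software model. rev8() below imitates the RV64 Zbb instruction in portable C, and bswap_bits() is a hypothetical helper, not the JIT code itself.

```c
#include <stdint.h>

/* Software model of RV64 Zbb rev8: reverse all eight bytes. */
static inline uint64_t rev8(uint64_t x)
{
	uint64_t r = 0;

	for (int i = 0; i < 8; i++) {
		r = (r << 8) | (x & 0xff);
		x >>= 8;
	}
	return r;
}

/*
 * Byte-swap the low `bits` bits (16, 32 or 64) and zero-extend the result.
 * rev8 leaves the swapped bytes in the top of the register; the logical
 * right shift (srli) moves them back down and clears the upper bits.
 */
static inline uint64_t bswap_bits(uint64_t x, unsigned int bits)
{
	uint64_t r = rev8(x);

	return bits == 64 ? r : r >> (64 - bits);
}
```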

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.10-rc1
commit c12603e
category: feature
bugzilla: RVCK-Project#127

--------------------------------

The Zba extension provides add.uw insn which can be used to implement
zext.w with rs2 set as ZERO.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.10-rc2
commit 96a27ee
category: feature
bugzilla: RVCK-Project#127

--------------------------------

Zba extension is very useful for generating addresses that index into array
of basic data types. This patch introduces sh2add and sh3add helpers for
RV32 and RV64 respectively, to accelerate addressing for array of unsigned
long data.
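
The Zba shift-and-add instructions compute `rs2 + (rs1 << N)` in one step, which is exactly an array element address. A minimal C model follows; the function names mirror the instruction mnemonics, and elem_addr64() is a hypothetical wrapper for the RV64 case where unsigned long is 8 bytes.

```c
#include <stdint.h>

/* shNadd rd, rs1, rs2  =>  rd = rs2 + (rs1 << N) */
static inline uint64_t sh2add(uint64_t rs1, uint64_t rs2)
{
	return rs2 + (rs1 << 2);
}

static inline uint64_t sh3add(uint64_t rs1, uint64_t rs2)
{
	return rs2 + (rs1 << 3);
}

/* Address of element `index` in an array of 8-byte words starting at `base`. */
static inline uint64_t elem_addr64(uint64_t base, uint64_t index)
{
	return sh3add(index, base);
}
```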

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.14
commit e186c28
category: feature
bugzilla: RVCK-Project#129

--------------------------------

Add parsing for the Zfbfmin, Zvfbfmin, and Zvfbfwma ISA extensions, which
were ratified in 4dc23d62 ("Added Chapter title to BF16") of
the riscv-isa-manual.

Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.14
commit a4863e0
category: feature
bugzilla: RVCK-Project#129

--------------------------------

Export the Zfbfmin, Zvfbfmin, and Zvfbfwma ISA extensions through hwprobe.

Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.14
commit 35173b6
category: feature
bugzilla: RVCK-Project#129

--------------------------------

Zaamo and Zalrsc are each a subset of the A extension: Zaamo provides
the atomic memory operations and Zalrsc provides the
load-reserved/store-conditional instructions.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.14
commit 9d45d1f
category: feature
bugzilla: RVCK-Project#129

--------------------------------

Export the Zaamo and Zalrsc extensions to userspace using hwprobe.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.14
commit 2d79608
category: feature
bugzilla: RVCK-Project#129

--------------------------------

Extend the KVM ISA extension ONE_REG interface to allow KVM user space
to detect and enable Zaamo/Zalrsc extensions for Guest/VM.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.12-rc2
commit 5fc7355
category: feature
bugzilla: RVCK-Project#129

--------------------------------

Some bits in the [ms]envcfg CSR, such as the CFI state and pointer
masking mode, need to be controlled on a per-thread basis. Support this
by keeping a copy of the CSR value in struct thread_struct and writing
it during context switches. It is safe to discard the old CSR value
during the context switch because the CSR is modified only by software,
so the CSR will remain in sync with the copy in thread_struct.
Use ALTERNATIVE directly instead of riscv_has_extension_unlikely() to
minimize branchiness in the context switching code.
Since thread_struct is copied during fork(), setting the value for the
init task sets the default value for all other threads.
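
The scheme can be pictured with a small user-space model. Everything here is a stand-in: model_csr_envcfg replaces the real CSR, struct thread_model replaces thread_struct, and the has_envcfg flag replaces the ALTERNATIVE-based patching, so treat it as a sketch of the idea rather than the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the hardware envcfg CSR; a real kernel would use csrw. */
static uint64_t model_csr_envcfg;

/* Hypothetical, trimmed-down thread_struct holding the per-thread copy. */
struct thread_model {
	uint64_t envcfg;
};

/*
 * On a context switch, load the next thread's copy into the CSR. The old
 * CSR value needs no saving: software is its only writer, so the previous
 * thread's copy already matches it.
 */
static inline void switch_envcfg(const struct thread_model *next, bool has_envcfg)
{
	if (has_envcfg)
		model_csr_envcfg = next->envcfg;
}

/* fork() copies thread_struct, so a child inherits the parent's value. */
static inline struct thread_model fork_thread(const struct thread_model *parent)
{
	return *parent;
}
```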

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.14
commit de70b53
category: feature
bugzilla: RVCK-Project#129

--------------------------------

Enabling cbo.clean and cbo.flush in user mode makes it more
convenient to manage the cache state and achieve better performance.

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.16-rc1
commit f0f4e64
category: feature
bugzilla: RVCK-Project#129

--------------------------------

The S-type instructions are first introduced and then used to define the
encoding of the Zicbop prefetching instructions.

Co-developed-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.16-rc1
commit 8d496b5
category: feature
bugzilla: RVCK-Project#129

--------------------------------

Zicbop introduces cache-block prefetch instructions. Add the
necessary support for the kernel to use them in the coming commits.

Co-developed-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.16-rc1
commit a5f947c
category: feature
bugzilla: RVCK-Project#129

--------------------------------

Enable Linux prefetch and prefetchw primitives using Zicbop.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.16-rc1
commit eb87e56
category: feature
bugzilla: RVCK-Project#129

--------------------------------

The cost of changing a cacheline from shared to exclusive state can be
significant, especially when this is triggered by an exclusive store,
since it may result in having to retry the transaction.
This patch makes use of prefetch.w to prefetch cachelines for write
prior to lr/sc loops when using the xchg_small atomic routine.
This patch is inspired by commit 0ea366f ("arm64: atomics:
prefetch the destination word for write prior to stxr").

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
driver inclusion
category: feature
Link: RVCK-Project#102

--------------------------------

This patch adds JSON file for lrw core events, establishing a mapping
mechanism between standard performance event names and hardware event
codes. This allows users to specify monitoring events via semantic
names (e.g., L1D_CACHE_REFILL) instead of traditional hardware
encodings.

Signed-off-by: shenlin <shen.lin1@zte.com.cn>
Signed-off-by: liuqingtao <liu.qingtao2@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.14
commit 4458b8f
category: feature
bugzilla: RVCK-Project#156

--------------------------------

Export Zicntr and Zihpm ISA extensions through the hwprobe syscall.

[ alex: Fix hwprobe numbering ]

Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
Acked-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240913051324.8176-1-mikisabate@gmail.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
[Fixed the following key number differences from the mainline: ZFBFMIN,
ZVFBFMIN, ZVFBFWMA, ZAAMO, ZALRSC]
Signed-off-by: Mingzheng Xing <xingmingzheng@iscas.ac.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.14
commit eb10039
category: feature
bugzilla: RVCK-Project#156

--------------------------------

Expose Zicbom through hwprobe and also provide a key to extract its
respective block size.

[ alex: Fix merge conflicts and hwprobe numbering ]

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20250226063206.71216-3-cuiyunhui@bytedance.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
[Resolved merge conflict; RISCV_HWPROBE_MAX_KEY currently differs from mainline.]
Signed-off-by: Mingzheng Xing <xingmingzheng@iscas.ac.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.15-rc7
commit ced6337
category: feature
bugzilla: RVCK-Project#152

--------------------------------

ACPICA commit 73c32bc89cad64ab19c1231a202361e917e6823c

RISC-V IO Mapping Table (RIMT) is a new static table defined for RISC-V
to communicate IOMMU information to the OS. The specification for RIMT
is available at [1]. Add structure definitions for RIMT.

Link: https://github.com/riscv-non-isa/riscv-acpi-rimt [1]
Link: acpica/acpica@73c32bc8
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/10665648.nUPlyArG6x@rjwysocki.net
Signed-off-by: uestc-gr <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.17-rc5
commit 8f77295
category: feature
bugzilla: RVCK-Project#152

--------------------------------

RISC-V IO Mapping Table (RIMT) is a static ACPI table to communicate
IOMMU information to the OS. The spec is available at [1].

At a high level, the changes are:
	a) Initialize data structures required for IOMMU/device
	   configuration using the data from RIMT. Provide APIs required
	   for device configuration.
	b) Provide an API for IOMMU drivers to register the
	   fwnode with RIMT data structures. This API will create a
	   fwnode for PCIe IOMMU.

[1] - https://github.com/riscv-non-isa/riscv-acpi-rimt

Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20250818045807.763922-2-sunilvl@ventanamicro.com
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: uestc-gr <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
mainline inclusion
from mainline-v6.17-rc5
commit cbf4fbc
category: feature
bugzilla: RVCK-Project#152

--------------------------------

acpi_iommu_configure_id() currently supports only IORT (ARM) and VIOT.
Add support for RISC-V as well.

Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20250818045807.763922-3-sunilvl@ventanamicro.com
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: uestc-gr <gao.rui@zte.com.cn>
Signed-off-by: Yanteng Si <si.yanteng@linux.dev>
clementleger and others added 25 commits January 30, 2026 14:06
mainline inclusion
from Linux 6.10-rc1
commit 4413815
category: feature
bugzilla: RVCK-Project#201

--------------------------------

While reworking code to fix sparse errors, it appears that the
RISCV_M_MODE-specific code could actually be removed in favor of the
code used for normal mode. Even though RISCV_M_MODE can do direct user
memory access, using the user uaccess helpers is also going to work.
Since there is no longer any need for specific accessors
(load_u8()/store_u8()), we can directly use
memcpy()/copy_{to/from}_user() and get rid of the copy loop entirely.
__read_insn() is also fixed to use an unsigned long instead of a
pointer which was cast in __user address space. The insn_addr parameter
is now cast from unsigned long to the correct address space directly.
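
The idea of replacing a byte-by-byte copy loop with a single memcpy() can be shown in portable user-space C. The helpers below are hypothetical and only illustrate how a value is read from or written to an address of any alignment; the kernel handler additionally decodes the trapping instruction and uses copy_{to/from}_user() for user addresses.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address of any alignment. */
static inline uint32_t load_u32_unaligned(const void *addr)
{
	uint32_t val;

	memcpy(&val, addr, sizeof(val));
	return val;
}

/* Write a 32-bit value to an address of any alignment. */
static inline void store_u32_unaligned(void *addr, uint32_t val)
{
	memcpy(addr, &val, sizeof(val));
}
```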

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240206154104.896809-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.9-rc1
commit c70dfa4
category: feature
bugzilla: RVCK-Project#201

--------------------------------

Add an implementation of cts(cbc(aes)) accelerated using the Zvkned
RISC-V vector crypto extension.  This is mainly useful for fscrypt,
where cts(cbc(aes)) is the "default" filenames encryption algorithm.  In
that use case, typically most messages are short and are block-aligned.
The CBC-CTS variant implemented is CS3; this is the variant Linux uses.

To perform well on short messages, the new implementation processes the
full message in one call to the assembly function if the data is
contiguous.  Otherwise it falls back to CBC operations followed by CTS
at the end.  For decryption, to further improve performance on short
messages, especially block-aligned messages, the CBC-CTS assembly
function parallelizes the AES decryption of all full blocks.  This
improves on the arm64 implementation of cts(cbc(aes)), which always
splits the CBC part(s) from the CTS part, doing the AES decryptions for
the last two blocks serially and usually loading the round keys twice.

Tested in QEMU with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20240213055442.35954-1-ebiggers@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.9-rc1
commit 5a83e73
category: feature
bugzilla: RVCK-Project#201

--------------------------------

Create has_fast_unaligned_access to avoid needing to explicitly check
the fast_misaligned_access_speed_key static key.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20240308-disable_misaligned_probe_config-v9-1-a388770ba0ce@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.9-rc1
commit 313130c
category: feature
bugzilla: RVCK-Project#201

--------------------------------

The unaligned access checker only sets valid values for online cpus.
Check for these values on online cpus rather than on present cpus.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Fixes: 71c54b3 ("riscv: report misaligned accesses emulation to hwprobe")
Tested-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20240308-disable_misaligned_probe_config-v9-2-a388770ba0ce@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.9-rc1
commit 6e5ce7f
category: feature
bugzilla: RVCK-Project#201

--------------------------------

Detecting if a system traps into the kernel on an unaligned access
can be performed separately from checking the speed of unaligned
accesses. This decoupling will make it possible to selectively enable
or disable each of these checks.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20240308-disable_misaligned_probe_config-v9-3-a388770ba0ce@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.9-rc1
commit f413aae
category: feature
bugzilla: RVCK-Project#201

--------------------------------

Introduce Kconfig options to set the kernel unaligned access support.
These options provide a non-portable alternative to the runtime
unaligned access probe.

To support this, the unaligned access probing code is moved into its
own file and gated behind a new RISCV_PROBE_UNALIGNED_ACCESS_SUPPORT
option.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20240308-disable_misaligned_probe_config-v9-4-a388770ba0ce@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.10-rc1
commit 7e6eae2
category: feature
bugzilla: RVCK-Project#201

--------------------------------

Signed-off-by: Xingyou Chen <rockrush@rockwork.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20240317055556.9449-1-rockrush@rockwork.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.11-rc4
commit c42e2f0
category: feature
bugzilla: RVCK-Project#201

--------------------------------

RISCV_HWPROBE_KEY_CPUPERF_0 was mistakenly flagged as a bitmask in
hwprobe_key_is_bitmask(), when in reality it was an enum value. This
causes problems when used in conjunction with RISCV_HWPROBE_WHICH_CPUS,
since SLOW, FAST, and EMULATED have values whose bits overlap with
each other. If the caller asked for the set of CPUs that was SLOW or
EMULATED, the returned set would also include CPUs that were FAST.

Introduce a new hwprobe key, RISCV_HWPROBE_KEY_MISALIGNED_PERF, which
returns the same values in response to a direct query (with no flags),
but is properly handled as an enumerated value. As a result, SLOW,
FAST, and EMULATED are all correctly treated as distinct values under
the new key when queried with the WHICH_CPUS flag.

Leave the old key in place to avoid disturbing applications which may
have already come to rely on the key, with or without its broken
behavior with respect to the WHICH_CPUS flag.
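
The overlap is easy to reproduce with the two comparison rules side by side. The numeric values below are assumed from the hwprobe UAPI header (EMULATED = 1, SLOW = 2, FAST = 3) and should be verified against the tree; the helper names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, modeled on the hwprobe UAPI header. */
enum {
	MISALIGNED_UNKNOWN	= 0,
	MISALIGNED_EMULATED	= 1,
	MISALIGNED_SLOW		= 2,
	MISALIGNED_FAST		= 3,
	MISALIGNED_UNSUPPORTED	= 4,
};

/* Bitmask rule: a CPU matches if it has every requested bit set. */
static inline bool match_as_bitmask(uint64_t cpu_value, uint64_t wanted)
{
	return (cpu_value & wanted) == wanted;
}

/* Enum rule: a CPU matches only on an exact value. */
static inline bool match_as_enum(uint64_t cpu_value, uint64_t wanted)
{
	return cpu_value == wanted;
}
```

With the bitmask rule, a FAST CPU (binary 11) satisfies a query for SLOW (binary 10) or EMULATED (binary 01), which is the false positive described above; the enum rule rejects it.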

Fixes: e178bf1 ("RISC-V: hwprobe: Introduce which-cpus flag")
Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240809214444.3257596-2-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.11-rc4
commit 1f52888
category: feature
bugzilla: RVCK-Project#201

--------------------------------

In preparation for misaligned vector performance hwprobe keys, rename
the hwprobe key values associated with misaligned scalar accesses to
include the term SCALAR. Leave the old defines in place to maintain
source compatibility.

This change is intended to be a functional no-op.

Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240809214444.3257596-3-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.11-rc7
commit b686ecd
category: feature
bugzilla: RVCK-Project#201

--------------------------------

raw_copy_{to,from}_user() do not call access_ok(), so this code allowed
userspace to access any virtual memory address.
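
The missing check amounts to verifying that the whole range lies inside the user address space before copying. The sketch below models such a range check in plain C; range_is_user() and the user_limit parameter are hypothetical stand-ins, not the kernel's access_ok() implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model: user space is [0, user_limit). The comparison is arranged so
 * that addr + size is never computed and therefore cannot overflow.
 */
static inline bool range_is_user(uint64_t addr, uint64_t size, uint64_t user_limit)
{
	return size <= user_limit && addr <= user_limit - size;
}
```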

Cc: stable@vger.kernel.org
Fixes: 7c83232 ("riscv: add support for misaligned trap handling in S-mode")
Fixes: 4413815 ("riscv: misaligned: remove CONFIG_RISCV_M_MODE specific code")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240815005714.1163136-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.13-rc1
commit 8d20a73
category: feature
bugzilla: RVCK-Project#201

--------------------------------

Originally, the check_unaligned_access_emulated_all_cpus function
only checked the boot hart. This fixes the function to check all
harts.
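
The difference between the two behaviors can be sketched over a plain array of per-hart results. The enum and both helpers are illustrative only; the kernel keeps this state in per-CPU variables and gathers it differently.

```c
#include <stdbool.h>
#include <stddef.h>

enum misaligned_state {
	MISALIGNED_UNKNOWN,
	MISALIGNED_EMULATED,
	MISALIGNED_FAST,
};

/* Old behavior as described: only the boot hart (index 0) is consulted. */
static inline bool boot_hart_emulated(const enum misaligned_state *state)
{
	return state[0] == MISALIGNED_EMULATED;
}

/* Fixed behavior: every hart must report emulated accesses. */
static inline bool all_harts_emulated(const enum misaligned_state *state, size_t nharts)
{
	for (size_t i = 0; i < nharts; i++) {
		if (state[i] != MISALIGNED_EMULATED)
			return false;
	}
	return true;
}
```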

Fixes: 71c54b3 ("riscv: report misaligned accesses emulation to hwprobe")
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-1-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.13-rc1
commit 9c528b5
category: feature
bugzilla: RVCK-Project#201

--------------------------------

The check_unaligned_access_emulated() function should have been called
during CPU hotplug to ensure that if all CPUs had emulated unaligned
accesses, the new CPU also does.

This patch adds the call to check_unaligned_access_emulated() in
the hotplug path.

Fixes: 55e0bf4 ("RISC-V: Probe misaligned access speed in parallel")
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-2-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.13-rc1
commit c05a62c
category: feature
bugzilla: RVCK-Project#201

--------------------------------

Replace RISCV_MISALIGNED with RISCV_SCALAR_MISALIGNED to allow
for the addition of RISCV_VECTOR_MISALIGNED in a later patch.

Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-3-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.13-rc1
commit d1703dc
category: feature
bugzilla: RVCK-Project#201

--------------------------------

Run an unaligned vector access to test if the system supports
vector unaligned access. Add the result to a new key in hwprobe.
This is useful for usermode to know if vector misaligned accesses are
supported and if they are faster or slower than equivalent byte accesses.

Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-4-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.13-rc1
commit e7c9d66
category: feature
bugzilla: RVCK-Project#201

--------------------------------

Detect if vector misaligned accesses are faster or slower than
equivalent vector byte accesses. This is useful for usermode to know
whether vector byte accesses or vector misaligned accesses have a better
bandwidth for operations like memcpy.
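
Once both copy variants have been timed, the classification reduces to one comparison. The sketch below assumes the rule is "fast only if strictly fewer cycles than the byte-wise copy"; the names are hypothetical, and the measurement loop and timing source are outside its scope.

```c
#include <stdint.h>

enum vec_misaligned_speed {
	VEC_MISALIGNED_SLOW,
	VEC_MISALIGNED_FAST,
};

/* Compare the best observed cycle counts of the two copy strategies. */
static inline enum vec_misaligned_speed
classify_vector_misaligned(uint64_t misaligned_cycles, uint64_t byte_cycles)
{
	return misaligned_cycles < byte_cycles ? VEC_MISALIGNED_FAST
					       : VEC_MISALIGNED_SLOW;
}
```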

Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-5-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.13-rc1
commit 40e09eb
category: feature
bugzilla: RVCK-Project#201

--------------------------------

Document key for reporting the speed of unaligned vector accesses.
The descriptions are the same as the scalar equivalent values.

Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-6-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.15-rc1
commit a00e022
category: feature
bugzilla: RVCK-Project#201

--------------------------------

Several functions used in unaligned access probing are only run at
init time. Annotate them appropriately.

Fixes: f413aae ("riscv: Set unaligned access speed at compile time")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-11-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.15-rc1
commit 5af72a8
category: feature
bugzilla: RVCK-Project#201

--------------------------------

We shouldn't probe when we already know vector is unsupported and
we should probe when we see we don't yet know whether it's supported.
Furthermore, we should ensure we've set the access type to
unsupported when we don't have vector at all.

Fixes: e7c9d66 ("RISC-V: Report vector unaligned access speed hwprobe")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-12-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.15-rc1
commit e6d0adf
category: feature
bugzilla: RVCK-Project#201

--------------------------------

check_vector_unaligned_access_emulated_all_cpus(), like its name
suggests, will return true when all cpus emulate unaligned vector
accesses. If the function returned false it may have been because
vector isn't supported at all (!has_vector()) or because at least
one cpu doesn't emulate unaligned vector accesses. Since false may
be returned for two cases, checking for it isn't sufficient when
attempting to determine if we should proceed with the vector speed
check. Move the !has_vector() functionality to
check_unaligned_access_all_cpus() in order for
check_vector_unaligned_access_emulated_all_cpus() to return false
for a single case.
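
The control flow this change aims for can be modelled roughly as below. It is a standalone sketch with hypothetical names, not the kernel code: the caller handles the "no vector" case first, so a false result from the emulation check can only mean that at least one CPU does not emulate.

```c
#include <stdbool.h>
#include <stddef.h>

enum vec_decision {
	VEC_DECISION_UNSUPPORTED,	/* no vector at all */
	VEC_DECISION_EMULATED,		/* every CPU emulates, skip speed check */
	VEC_DECISION_PROBE_SPEED,	/* at least one CPU does not emulate */
};

/*
 * Stand-in for the "all cpus emulate" check. It no longer looks at
 * vector support, so false has exactly one meaning.
 * An empty CPU list trivially returns true in this model.
 */
bool all_cpus_emulate(const bool *emulates, size_t ncpus)
{
	for (size_t i = 0; i < ncpus; i++) {
		if (!emulates[i])
			return false;
	}
	return true;
}

/* Stand-in for the caller, which now owns the !has_vector() case. */
enum vec_decision decide_vector_check(bool has_vector,
				      const bool *emulates, size_t ncpus)
{
	if (!has_vector)
		return VEC_DECISION_UNSUPPORTED;
	if (all_cpus_emulate(emulates, ncpus))
		return VEC_DECISION_EMULATED;
	return VEC_DECISION_PROBE_SPEED;
}
```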

Fixes: e7c9d66 ("RISC-V: Report vector unaligned access speed hwprobe")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-13-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.15-rc1
commit 813d39b
category: feature
bugzilla: RVCK-Project#201

--------------------------------

The return value of check_unaligned_access_speed_all_cpus() is always
zero, so make the function void so we don't need to concern ourselves
with it. The change also allows us to tidy up
check_unaligned_access_all_cpus() a bit.

Reviewed-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-14-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.15-rc1
commit 05ee21f
category: feature
bugzilla: RVCK-Project#201

--------------------------------

CPU hotplug callbacks should be set up even if we detected all
current cpus emulate misaligned accesses, since we want to
ensure our expectations of all cpus emulating is maintained.

Fixes: 6e5ce7f ("riscv: Decouple emulated unaligned accesses from access speed")
Fixes: e7c9d66 ("RISC-V: Report vector unaligned access speed hwprobe")
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-15-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.15-rc1
commit 2744ec4
category: feature
bugzilla: RVCK-Project#201

--------------------------------

Whether or not we have RISCV_PROBE_VECTOR_UNALIGNED_ACCESS we need to
set up a cpu hotplug callback to check if we have vector at all,
since, when we don't have vector, we need to set
vector_misaligned_access to unsupported rather than leave it the
default of unknown.

Fixes: e7c9d66 ("RISC-V: Report vector unaligned access speed hwprobe")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-16-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.15-rc1
commit aecb09e
category: feature
bugzilla: RVCK-Project#201

--------------------------------

Allow skipping scalar and vector unaligned access speed tests. This
is useful for testing alternative code paths and to skip the tests in
environments where they run too slowly. All CPUs must have the same
unaligned access speed.

The code movement is because we now need the scalar cpu hotplug
callback to always run, so we need to bring it and its supporting
functions out of CONFIG_RISCV_PROBE_UNALIGNED_ACCESS.
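
A minimal sketch of how such an override could be interpreted is shown below. The accepted strings (`slow`, `fast`, `unsupported`) are an assumption, since this commit message does not list them, and treating an invalid value the same as no override is likewise an assumption of the sketch. All names are hypothetical.

```c
#include <string.h>

enum speed_param {
	SPEED_PARAM_UNSET,		/* no override given */
	SPEED_PARAM_SLOW,
	SPEED_PARAM_FAST,
	SPEED_PARAM_UNSUPPORTED,
	SPEED_PARAM_INVALID,		/* unrecognized string */
};

/* Parse the value of a hypothetical speed override parameter. */
enum speed_param parse_speed_param(const char *arg)
{
	if (!arg)
		return SPEED_PARAM_UNSET;
	if (strcmp(arg, "slow") == 0)
		return SPEED_PARAM_SLOW;
	if (strcmp(arg, "fast") == 0)
		return SPEED_PARAM_FAST;
	if (strcmp(arg, "unsupported") == 0)
		return SPEED_PARAM_UNSUPPORTED;
	return SPEED_PARAM_INVALID;
}

/*
 * The speed test is skipped only when a valid override was supplied.
 * Assumption: an invalid value is ignored and the test still runs.
 */
int should_run_speed_test(enum speed_param p)
{
	return p == SPEED_PARAM_UNSET || p == SPEED_PARAM_INVALID;
}
```

Because an override replaces per-CPU measurement, it only makes sense when all CPUs share the same unaligned access speed, as the commit message requires.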

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-17-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.15-rc1
commit 9fe5853
category: feature
bugzilla: RVCK-Project#201

--------------------------------

Document riscv parameters used to select scalar and vector unaligned
access speeds.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-18-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
mainline inclusion
from Linux 6.19-rc1
commit 1cd5bb6
category: feature
bugzilla: RVCK-Project#201

--------------------------------

Replace the RISCV_ISA_V dependency of the RISC-V crypto code with
RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS, which implies RISCV_ISA_V as
well as vector unaligned accesses being efficient.

This is necessary because this code assumes that vector unaligned
accesses are supported and are efficient.  (It does so to avoid having
to use lots of extra vsetvli instructions to switch the element width
back and forth between 8 and either 32 or 64.)

This was omitted from the code originally just because the RISC-V kernel
support for detecting this feature didn't exist yet.  Support has now
been added, but it's fragmented into per-CPU runtime detection, a
command-line parameter, and a kconfig option.  The kconfig option is the
only reasonable way to do it, though, so let's just rely on that.

Fixes: eb24af5 ("crypto: riscv - add vector crypto accelerated AES-{ECB,CBC,CTR,XTS}")
Fixes: bb54668 ("crypto: riscv - add vector crypto accelerated ChaCha20")
Fixes: 600a385 ("crypto: riscv - add vector crypto accelerated GHASH")
Fixes: 8c8e404 ("crypto: riscv - add vector crypto accelerated SHA-{256,224}")
Fixes: b341592 ("crypto: riscv - add vector crypto accelerated SHA-{512,384}")
Fixes: 563a525 ("crypto: riscv - add vector crypto accelerated SM3")
Fixes: b8d0635 ("crypto: riscv - add vector crypto accelerated SM4")
Cc: stable@vger.kernel.org
Reported-by: Vivian Wang <wangruikang@iscas.ac.cn>
Closes: https://lore.kernel.org/r/b3cfcdac-0337-4db0-a611-258f2868855f@iscas.ac.cn/
Reviewed-by: Jerry Shih <jerry.shih@sifive.com>
Link: https://lore.kernel.org/r/20251206213750.81474-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
@github-actions

github-actions bot commented Jan 30, 2026


Test started. Log: https://github.com/RVCK-Project/rvck/actions/runs/21506396648

Parsed arguments

| args | value |
| --- | --- |
| repository | RVCK-Project/rvck |
| head ref | pull/207/head |
| base ref | rvck-6.6 |
| LAVA repo | RVCK-Project/lavaci |
| LAVA Template | lava-job-template/qemu/qemu-ltp.yaml |
| Testcase path | lava-testcases/common-test/ltp/ltp.yaml |
| need run job | kunit-test,kernel-build,check-patch,lava-trigger |

Testing complete

Detailed results:

RVCK result

| check | result |
| --- | --- |
| kunit-test | success |
| kernel-build | success |
| lava-trigger | success |
| check-patch | failure |

Kunit Test Result

[06:18:06] Testing complete. Ran 457 tests: passed: 445, skipped: 12

Kernel Build Result

Kernel build succeeded: RVCK-Project/rvck/207/

7958258d30bcb3f72adbb8f4c89a315e /srv/guix_result/b51fdce4517f21b73ea1f49bb070e53cb5f37baf/Image
0518ff13ef0229fdda6ae5057a681c7b /root/initramfs.img

LAVA Check

args:

result:

Lava check done! lava log: https://lava.oerv.ac.cn/scheduler/job/1242

lava result count: [fail]: 173, [pass]: 1435, [skip]: 291

Check Patch Result

Total Errors 5
Total Warnings 72

@sterling-teng
Contributor

The branch has moved forward; please rebase as soon as possible.
