CXL fixes and enhancements for type 3 device support #276
base: 24.04_linux-nvidia-6.17-next
Conversation
nvmochs
left a comment
Verified these match the content from -devel.
Acked-by: Matthew R. Ochs <mochs@nvidia.com>
drivers/pci/pci.c
Outdated
	if (rc)
		return -ENOTTY;

	if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)
Seeing a mismatch here. In the -devel kernel, this line is:
if ((reg & CXL_DVSEC_CXL_RST_CAPABLE) == 0)
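The parentheses matter because in C `==` binds tighter than `&`, so the unparenthesized form is parsed as `reg & (CXL_DVSEC_CXL_RST_CAPABLE == 0)`, which is zero for any nonzero mask. GCC and Clang typically flag that form under `-Wparentheses`. A minimal sketch of the difference follows; the mask value and the function names are placeholders for illustration and are not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit for illustration only; not the kernel's definition. */
#define DEMO_RST_CAPABLE 0x0004u

/*
 * How the compiler parses `reg & DEMO_RST_CAPABLE == 0`:
 * the comparison is evaluated first and yields 0 for a nonzero mask,
 * so the whole expression is `reg & 0`, which is always false.
 */
bool demo_not_capable_as_parsed(uint16_t reg)
{
	return reg & (DEMO_RST_CAPABLE == 0);
}

/* Intended check: true exactly when the capability bit is clear. */
bool demo_not_capable_parenthesized(uint16_t reg)
{
	return (reg & DEMO_RST_CAPABLE) == 0;
}
```

With the bit clear (`reg == 0`), only the parenthesized check reports "not capable"; the as-parsed form never does, so the early-return path guarded by it would be dead code.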
clsotog
left a comment
Giving my ack. On Jamie's question: I think the -devel PR may have changed that line, because what Nirmoy has here appears to match the lore discussion linked in the commit.
Acked-by: Carol L Soto <csoto@nvidia.com>
Type 2 devices are being introduced and will require finer-grained reset
mechanisms beyond bus-wide reset methods.
Add support for CXL reset per CXL v3.2 Section 9.6/9.7
Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
(backported from https://lore.kernel.org/all/20250221043906.1593189-3-smadhavan@nvidia.com/)
[Nirmoy: Add #include "../cxl/cxlpci.h" and fix a compile error with
if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)]
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
The CXL core in Linux was updated to support committed decoders of zero
size, because the CXL spec allows them.
This patch updates cxl_test to enable decoders 1 and 2 in the host-bridge 0
port, in a switch uport under hb0, and in the endpoint ports with size zero,
simulating committed zero-sized decoders.
Signed-off-by: Vishal Aslot <vaslot@nvidia.com>
(backported from https://lore.kernel.org/all/20251015024019.1189713-1-vaslot@nvidia.com/)
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
The CXL spec permits committing zero-sized decoders. Linux currently
treats them as an error.
Zero-sized decoders are helpful when the BIOS is committing them. Often the
BIOS will also lock them to prevent them from being changed, due to the TSP
requirement, for example when the type 3 device is part of a TCB.
The host bridge, switch, and endpoint decoders can all be committed with
zero size. If they are locked along the VH, it is often to prevent
hotplugging of a new device that could not be attested post boot and cannot
be included in the TCB.
The caller leaves the decoder allocated but does not add it. It simply
continues to the next decoder.
Signed-off-by: Vishal Aslot <vaslot@nvidia.com>
(backported from https://lore.kernel.org/all/20251015024019.1189713-1-vaslot@nvidia.com/)
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
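The "skip, do not fail" behaviour described in the commit message above can be sketched roughly as follows. This is only an illustration of the control flow, not the CXL core's code: `struct demo_decoder`, its fields, and `demo_collect_decoders()` are hypothetical names invented for this sketch.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of a decoder; not a kernel structure. */
struct demo_decoder {
	unsigned long long base;
	unsigned long long size;
	bool committed;
	bool locked;
};

/*
 * Walk n decoders and record the indices of those that should be added.
 * A committed decoder of size zero is treated as valid but is skipped:
 * it stays allocated, is not added, and enumeration moves on to the next.
 * Returns the number of indices written to added_ids.
 */
int demo_collect_decoders(const struct demo_decoder *decs, int n, int *added_ids)
{
	int added = 0;

	for (int i = 0; i < n; i++) {
		if (decs[i].committed && decs[i].size == 0)
			continue; /* zero-sized and committed: skip, not an error */
		added_ids[added++] = i;
	}
	return added;
}
```

The point of the sketch is that a zero-sized committed decoder no longer aborts the walk, so decoders that follow it are still enumerated.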
The loop condition in __cxl_dpa_reserve() is missing the comparison
operator, causing potential infinite loop and array out-of-bounds:
for (int i = 0; cxlds->nr_partitions; i++)
Should be:
for (int i = 0; i < cxlds->nr_partitions; i++)
Without the '<' operator, if no partition matches the decoder's DPA
resource, 'i' increments beyond the part[] array bounds (size 2),
triggering UBSAN errors and corrupting the part index.
Fixes: be5cbd0 ("cxl: Kill enum cxl_decoder_mode")
Signed-off-by: Koba Ko <kobak@nvidia.com>
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
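To illustrate the loop fix described in the commit message above, here is a minimal, self-contained sketch of a bounded partition search. The types and the function `demo_find_partition()` are hypothetical stand-ins, and the containment test is an assumption about what "matches the decoder's DPA resource" means; only the array size of 2 and the `i < nr_partitions` condition come from the commit message.

```c
#include <stddef.h>

#define DEMO_MAX_PARTITIONS 2 /* part[] size of 2, per the commit message */

/* Hypothetical inclusive address range. */
struct demo_range {
	unsigned long start;
	unsigned long end;
};

/* Hypothetical stand-in for the device state holding the partitions. */
struct demo_dev_state {
	struct demo_range part[DEMO_MAX_PARTITIONS];
	int nr_partitions;
};

/*
 * Return the index of the partition that fully contains res, or -1.
 * The `i < ds->nr_partitions` bound guarantees the loop ends and that
 * part[] is never indexed past its last valid entry when nothing matches.
 */
int demo_find_partition(const struct demo_dev_state *ds,
			const struct demo_range *res)
{
	for (int i = 0; i < ds->nr_partitions; i++) {
		const struct demo_range *p = &ds->part[i];

		if (res->start >= p->start && res->end <= p->end)
			return i;
	}
	return -1;
}
```

With the original condition (`ds->nr_partitions` alone), the loop would only stop on a match, so a resource outside every partition would walk `i` past the end of `part[]`.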
…access
Check partition index bounds before accessing cxlds->part[] to prevent
out-of-bounds access when part is -1 or invalid.
Fixes: 5ec6759 ("cxl/region: Drop goto pattern of construct_region()")
Signed-off-by: Koba Ko <kobak@nvidia.com>
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
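The bounds check described in this commit can be sketched as a small guarded accessor. The names below (`sketch_partition`, `sketch_dev_state`, `sketch_get_partition()`) are hypothetical and chosen for illustration; the actual patch may place the check inline rather than in a helper.

```c
#include <stddef.h>

#define SKETCH_MAX_PARTITIONS 2

/* Hypothetical partition descriptor. */
struct sketch_partition {
	unsigned long start;
	unsigned long end;
};

/* Hypothetical stand-in for the device state holding the partitions. */
struct sketch_dev_state {
	struct sketch_partition part[SKETCH_MAX_PARTITIONS];
	int nr_partitions;
};

/*
 * Return the partition at index part, or NULL when the index is negative
 * (for example -1 meaning "no partition found") or beyond the valid
 * entries, so callers never dereference outside part[].
 */
const struct sketch_partition *
sketch_get_partition(const struct sketch_dev_state *ds, int part)
{
	if (part < 0 || part >= ds->nr_partitions ||
	    part >= SKETCH_MAX_PARTITIONS)
		return NULL;
	return &ds->part[part];
}
```

A caller would test the result for NULL and bail out with an error instead of indexing the array directly.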
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
Added [Nirmoy: Add #include "../cxl/cxlpci.h" and fix a compile error with if (reg & CXL_DVSEC_CXL_RST_CAPABLE == 0)]
Adds commits for CXL type 3 device support for Vera.
LP: https://bugs.launchpad.net/ubuntu/+source/linux-nvidia/+bug/2138266