diff --git a/doc/dpuNetworkCR_Create.pdf b/doc/dpuNetworkCR_Create.pdf new file mode 100644 index 000000000..c63ce246d Binary files /dev/null and b/doc/dpuNetworkCR_Create.pdf differ diff --git a/doc/dpuNetworkCR_Delete.pdf b/doc/dpuNetworkCR_Delete.pdf new file mode 100644 index 000000000..0638515d2 Binary files /dev/null and b/doc/dpuNetworkCR_Delete.pdf differ diff --git a/doc/dpuNetworkCR_Update.pdf b/doc/dpuNetworkCR_Update.pdf new file mode 100644 index 000000000..f819cac85 Binary files /dev/null and b/doc/dpuNetworkCR_Update.pdf differ diff --git a/doc/dpunetwork_cr_create.puml b/doc/dpunetwork_cr_create.puml new file mode 100644 index 000000000..3b99590f0 --- /dev/null +++ b/doc/dpunetwork_cr_create.puml @@ -0,0 +1,175 @@ +@startuml dpunetwork_cr_creation + +actor user +box "Kubernetes Control Plane" +participant k8s_api +participant dpu_network_controller +participant configmap as "ConfigMap\ndpu-device-plugin-config" +end box + +box "Host Node" +participant kubelet_host as "kubelet (Host)" +participant dpu_daemon_host as "dpu-daemon (Host)\n(Device Plugin Manager + Device Plugin)" +participant vsp_host as "vsp (Host)" +end box + +box "DPU Node" +participant kubelet_dpu as "kubelet (DPU)" +participant dpu_daemon_dpu as "dpu-daemon (DPU)\n(Device Plugin Manager + Device Plugin)" +participant vsp_dpu as "vsp (DPU)" +end box + +autonumber + +== DpuNetwork CR Creation (Multiple Networks) == + +user -> k8s_api: Create DpuNetwork CR 1 +activate k8s_api +note right: **DpuNetwork 1: "dpu-network-1"**\n\napiVersion: networking.example.com/v1\nkind: DpuNetwork\nmetadata:\n name: dpu-network-1\nspec:\n nodeSelector:\n matchLabels:\n node-role: dpu-node\n dpuSelector:\n matchExpressions:\n - key: dpu-type\n operator: In\n values: ["IPU Adapter E2100"]\n - key: vfId\n operator: In\n values: ["0-3", "5-7"]\n IsDisruptive: true + +k8s_api -> dpu_network_controller: Reconcile Event +activate dpu_network_controller + +== DpuNetwork Controller Reconciliation == + 
+dpu_network_controller -> k8s_api: List Nodes (match nodeSelector) +activate k8s_api +k8s_api -> dpu_network_controller: Matching Nodes +deactivate k8s_api + +dpu_network_controller -> k8s_api: List Dpu CRs +activate k8s_api +k8s_api -> dpu_network_controller: All Dpu CRs +note right: Dpu CR contains:\n netdevs:\n - name: "ens2f0v0" vfId: 0\n - name: "ens2f0v1" vfId: 1\n - name: "ens2f0v2" vfId: 2\n - name: "ens2f0v3" vfId: 3\n - name: "ens2f0v4" vfId: 4\n - name: "ens2f0v5" vfId: 5\n - name: "ens2f0v6" vfId: 6\n - name: "ens2f0v7" vfId: 7 +deactivate k8s_api + +dpu_network_controller -> dpu_network_controller: Evaluate dpuSelector\n(match dpu-type and vfId) + +dpu_network_controller -> dpu_network_controller: Parse vfId ranges\n("0-3" -> [0,1,2,3]\n"5-7" -> [5,6,7]) + +dpu_network_controller -> dpu_network_controller: Filter VFs from Dpu CRs\n(match selected VFs: 0,1,2,3,5,6,7) + +dpu_network_controller -> dpu_network_controller: Generate ResourceName\n"openshift.io/dpunetwork-" + +== ConfigMap-Based Device Plugin Registration == + +dpu_network_controller -> dpu_network_controller: Aggregate all DpuNetwork CRs\nfor this node + +dpu_network_controller -> dpu_network_controller: Generate ConfigMap data\n(config.json with resource definitions) + +dpu_network_controller -> k8s_api: Create/Update ConfigMap\ndpu-device-plugin-config +activate k8s_api +note right: **ConfigMap Approach (Single Source of Truth)**\n\n**One ConfigMap describes resources for both Host and DPU nodes.**\n**Each entry carries a nodeSelector so local daemons only advertise their slice.**\n\napiVersion: v1\nkind: ConfigMap\nmetadata:\n name: dpu-device-plugin-config\n namespace: dpu-operator-system\ndata:\n config.json: |\n {\n "resources": [\n {\n "resourceName": "openshift.io/dpunetwork-dpu-network-1",\n "dpuNetworkName": "dpu-network-1",\n "nodeSelector": {"matchLabels": {"node-role": "host"}},\n "vfRanges": ["0-3", "5-7"]\n },\n {\n "resourceName": "openshift.io/dpunetwork-dpu-network-1",\n 
"dpuNetworkName": "dpu-network-1",\n "nodeSelector": {"matchLabels": {"node-role": "dpu"}},\n "vfRanges": ["0-3", "5-7"],\n "rpmRanges": ["0-0"],\n "vethRanges": ["0-1"]\n }\n // Additional resources per DpuNetwork CR\n ]\n } +k8s_api -> configmap: ConfigMap Created/Updated +activate configmap +k8s_api -> dpu_network_controller: ConfigMap Updated +deactivate k8s_api + +== Host dpu-daemon Watches ConfigMap == + +configmap -> dpu_daemon_host: ConfigMap Change Event\n(watch notification) +activate dpu_daemon_host + +dpu_daemon_host -> k8s_api: Get ConfigMap\ndpu-device-plugin-config +activate k8s_api +k8s_api -> dpu_daemon_host: ConfigMap with config.json +deactivate k8s_api + +dpu_daemon_host -> dpu_daemon_host: Parse config.json\nFilter entries where node-role = host + +note over dpu_daemon_host: **Per-Node Architecture Decision:**\n**Single device plugin instance per node**\n- Host instance only advertises host-scoped resources\n- Reads shared ConfigMap, filters via nodeSelector\n- Updates in-place on ConfigMap changes + +alt Host Device Plugin Not Running + dpu_daemon_host -> dpu_daemon_host: Start Device Plugin Instance\n(read host resources) +else Host Device Plugin Already Running + dpu_daemon_host -> dpu_daemon_host: Reload Config\n(apply new host resource set) +end + +dpu_daemon_host -> vsp_host: GetDevices() +activate vsp_host +vsp_host -> vsp_host: Return host-visible devices\n(VF repr set shared with DPU) +vsp_host -> dpu_daemon_host: Host device inventory +deactivate vsp_host + +dpu_daemon_host -> dpu_daemon_host: Build device list\nApply vfRanges [0-3,5-7] +note right: Host Resource\n"openshift.io/dpunetwork-dpu-network-1"\nDevices: VFs 0,1,2,3,5,6,7 (no RPM/veth) + +dpu_daemon_host -> dpu_daemon_host: ListAndWatch()\n(advertise host resource only) + +dpu_daemon_host -> kubelet_host: Register Device Plugin\nResource "openshift.io/dpunetwork-dpu-network-1" +activate kubelet_host +kubelet_host -> dpu_daemon_host: Registration Accepted +kubelet_host -> 
kubelet_host: Add node capacity\n"openshift.io/dpunetwork-dpu-network-1": 7 (host) +deactivate kubelet_host + +deactivate dpu_daemon_host + +== DPU dpu-daemon Watches ConfigMap == + +configmap -> dpu_daemon_dpu: ConfigMap Change Event\n(watch notification) +activate dpu_daemon_dpu + +dpu_daemon_dpu -> k8s_api: Get ConfigMap\ndpu-device-plugin-config +activate k8s_api +k8s_api -> dpu_daemon_dpu: ConfigMap with config.json +deactivate k8s_api + +dpu_daemon_dpu -> dpu_daemon_dpu: Parse config.json\nFilter entries where node-role = dpu + +note over dpu_daemon_dpu: **Per-Node Architecture Decision:**\n**Single DPU-side device plugin instance**\n- Reads same ConfigMap, filters for node-role=dpu\n- Advertises VF + RPM + veth resources\n- No restart required on updates + +alt DPU Device Plugin Not Running + dpu_daemon_dpu -> dpu_daemon_dpu: Start Device Plugin Instance\n(read DPU resources) +else DPU Device Plugin Already Running + dpu_daemon_dpu -> dpu_daemon_dpu: Reload Config\n(apply new DPU resource set) +end + +dpu_daemon_dpu -> vsp_dpu: GetDevices() +activate vsp_dpu +vsp_dpu -> vsp_dpu: Return devices by type\n(VF repr, RPM, veth) +vsp_dpu -> dpu_daemon_dpu: DPU device inventory +deactivate vsp_dpu + +dpu_daemon_dpu -> dpu_daemon_dpu: Build device lists\n- VF repr filtered by vfRanges\n- RPM list via rpmRanges\n- veth list via vethRanges +note right: DPU Resources Advertised\n1. "openshift.io/dpunetwork-dpu-network-1" (VF x7)\n2. "openshift.io/rpm-disruptive" (rpmRange 0-0)\n3. 
"openshift.io/veth-nondisruptive" (vethRange 0-1) + +dpu_daemon_dpu -> dpu_daemon_dpu: ListAndWatch()\n(advertise three resources) + +dpu_daemon_dpu -> kubelet_dpu: Register Device Plugin\nAll DPU resources +activate kubelet_dpu +kubelet_dpu -> dpu_daemon_dpu: Registration Accepted +kubelet_dpu -> kubelet_dpu: Add node capacity\nVF=7, RPM=1, veth=2 +deactivate kubelet_dpu + +deactivate dpu_daemon_dpu +deactivate configmap + +== BridgeID and NAD Generation (1 NAD per DpuNetwork CR) == + +dpu_network_controller -> dpu_network_controller: Create BridgeID + +dpu_network_controller -> dpu_network_controller: Create single NAD\nfor all VFs in network\n(shared config: IsDisruptive, IPAM) + +dpu_network_controller -> k8s_api: Create NetworkAttachmentDefinition +activate k8s_api +note right: **NAD 1 for DpuNetwork 1**\n\nmetadata:\n name: dpunetwork-1-nad\n namespace: default\n annotations:\n dpu.config.openshift.io/dpu-network: dpu-network-1\n k8s.v1.cni.cncf.io/resourceName: openshift.io/dpunetwork-dpu-network-1\nspec:\n config: {\n "type": "dpu-cni",\n "cniVersion": "0.4.0",\n "name": "dpu-cni",\n "BridgeID": "",\n "IsDisruptive": "true",\n "ipam": {...}\n }\n\n**VFs (0,1,2,3,5,6,7) use this NAD**\n**Multiple pods can use this NAD**\n**Each pod gets allocated a VF from the pool** +k8s_api -> dpu_network_controller: NAD Created +deactivate k8s_api + +note over dpu_network_controller: **About NRI (Network Resources Injector):**\nNRI webhook is installed once (via DpuOperatorConfig) and is not re-registered per DpuNetwork.\nDpuNetwork creation only needs to create NAD(s) and (optionally) publish a mapping (e.g., in DpuNetwork.status)\nso NRI can translate `dpu.config.openshift.io/dpu-network: ` into\n`k8s.v1.cni.cncf.io/networks: ` during Pod CREATE. 
+ +dpu_network_controller -> k8s_api: Update DpuNetwork 1 Status +activate k8s_api +note right: status:\n conditions:\n - type: Ready\n status: True\n message: NAD and Device Plugin created\n resourceName: "openshift.io/dpunetwork-dpu-network-1"\n selectedVFs: [0,1,2,3,5,6,7]\n excludedVFs: [4] +k8s_api -> dpu_network_controller: Status Updated +deactivate k8s_api + +deactivate dpu_network_controller +deactivate k8s_api + +note over k8s_api: **Architecture Summary:**\n**Single ConfigMap, per-node device plugin instances**\n**- Host dpu-daemon filters node-role=host resources**\n**- DPU dpu-daemon filters node-role=dpu resources (VF+RPM+veth)**\n**- Each node runs exactly one device plugin instance**\n**- Entries share resourceName when devices overlap**\n**- NAD per DpuNetwork CR stays unchanged**\n\n**When new DpuNetwork CR created:**\n**- Controller updates ConfigMap with host + DPU entries**\n**- Both daemons detect change and reload in-place**\n**- No new pods/daemons required, only ListAndWatch updates** + +note right of user: **See:**\n- pod_creation_regular.puml for pod creation flow\n- pod_creation_nf_disruptive.puml for NF pod flow\n- dpunetwork_cr_update.puml for update flow\n- dpunetwork_cr_deletion.puml for deletion flow + +@enduml + diff --git a/doc/dpunetwork_cr_delete.puml b/doc/dpunetwork_cr_delete.puml new file mode 100644 index 000000000..00acb5ab5 --- /dev/null +++ b/doc/dpunetwork_cr_delete.puml @@ -0,0 +1,116 @@ +@startuml dpunetwork_cr_deletion + +actor user +box "Kubernetes Control Plane" +participant k8s_api +participant dpu_network_controller +participant configmap as "ConfigMap\ndpu-device-plugin-config" +end box + +box "Host Node" +participant kubelet_host as "kubelet (Host)" +participant dpu_daemon_host as "dpu-daemon (Host)\n(Device Plugin Manager + Device Plugin)" +participant vsp_host as "vsp (Host)" +end box + +box "DPU Node" +participant kubelet_dpu as "kubelet (DPU)" +participant dpu_daemon_dpu as "dpu-daemon (DPU)\n(Device 
Plugin Manager + Device Plugin)" +participant vsp_dpu as "vsp (DPU)" +end box + +autonumber + +== DpuNetwork Deletion (ConfigMap Approach) == + +note right of user: **Prerequisites:**\nDpuNetwork CR already created\nSee: dpunetwork_cr_creation.puml + +user -> k8s_api: Delete DpuNetwork CR +activate k8s_api + +k8s_api -> dpu_network_controller: Reconcile Event (Deletion) +activate dpu_network_controller + +dpu_network_controller -> dpu_network_controller: Aggregate remaining DpuNetwork CRs\n(remove deleted network from list) + +dpu_network_controller -> dpu_network_controller: Generate updated ConfigMap data\n(remove network-1 resource definition) + +dpu_network_controller -> k8s_api: Update ConfigMap\ndpu-device-plugin-config +activate k8s_api +note right: **ConfigMap Updated**\n\nconfig.json updated:\n "resources": [\n // network-1 removed\n {\n "resourceName": "openshift.io/dpunetwork-dpu-network-2",\n ...\n }\n ] +k8s_api -> configmap: ConfigMap Updated +activate configmap +k8s_api -> dpu_network_controller: ConfigMap Updated +deactivate k8s_api + +== ConfigMap Change Propagates to Host and DPU Nodes == + +configmap -> dpu_daemon_host: ConfigMap Change Event\n(resource removed) +activate dpu_daemon_host + +dpu_daemon_host -> k8s_api: Get Updated ConfigMap +activate k8s_api +k8s_api -> dpu_daemon_host: ConfigMap without network-1 +deactivate k8s_api + +dpu_daemon_host -> dpu_daemon_host: Parse config.json\nDetect removed resource "openshift.io/dpunetwork-dpu-network-1" + +dpu_daemon_host -> dpu_daemon_host: Update single device plugin instance\n(reload config, rebuild advertised list) + +dpu_daemon_host -> vsp_host: ReleaseHostVfPool(bridge_id="x1") +activate vsp_host +vsp_host -> vsp_host: Remove VF entries bound to host pods +vsp_host -> dpu_daemon_host: VF pool released +deactivate vsp_host + +dpu_daemon_host -> kubelet_host: ListAndWatch Update\n(unregister CR-specific resource) +activate kubelet_host +kubelet_host -> kubelet_host: Remove node 
capacity\n"openshift.io/dpunetwork-dpu-network-1" +kubelet_host -> dpu_daemon_host: Resource Removed +deactivate kubelet_host + +deactivate dpu_daemon_host + +configmap -> dpu_daemon_dpu: ConfigMap Change Event\n(resource removed) +activate dpu_daemon_dpu + +dpu_daemon_dpu -> k8s_api: Get Updated ConfigMap +activate k8s_api +k8s_api -> dpu_daemon_dpu: ConfigMap without network-1 +deactivate k8s_api + +dpu_daemon_dpu -> dpu_daemon_dpu: Parse config.json\nDetect removal of VF, RPM, veth entries for bridge "x1" + +dpu_daemon_dpu -> dpu_daemon_dpu: Update device plugin instance\n(stop advertising VF resource, adjust RPM/Veth counts) + +dpu_daemon_dpu -> vsp_dpu: DeleteNetworkResources(bridge_id="x1") +activate vsp_dpu +vsp_dpu -> vsp_dpu: Tear down NF map entry\nremove VF + RPM interfaces +vsp_dpu -> vsp_dpu: Delete flow rules and bridge br-x1 +vsp_dpu -> dpu_daemon_dpu: Network resources deleted +deactivate vsp_dpu + +dpu_daemon_dpu -> kubelet_dpu: ListAndWatch Update\n(unregister VF resource, update RPM/veth counts) +activate kubelet_dpu +kubelet_dpu -> kubelet_dpu: Remove node capacity\n"openshift.io/dpunetwork-dpu-network-1" on DPU node +kubelet_dpu -> dpu_daemon_dpu: Resource Removed +deactivate kubelet_dpu + +deactivate dpu_daemon_dpu +deactivate configmap + +dpu_network_controller -> k8s_api: Delete NetworkAttachmentDefinition +activate k8s_api +k8s_api -> dpu_network_controller: NAD Deleted +deactivate k8s_api + +dpu_network_controller -> k8s_api: Remove Finalizer +deactivate dpu_network_controller + +k8s_api -> k8s_api: DpuNetwork CR Deleted +deactivate k8s_api + +note right of user: **Related Diagrams:**\n- dpunetwork_cr_creation.puml (host setup)\n- dpunetwork_cr_creation-dpu.puml (DPU setup)\n- dpunetwork_cr_update.puml (update flow) + +@enduml + diff --git a/doc/dpunetwork_cr_update.puml b/doc/dpunetwork_cr_update.puml new file mode 100644 index 000000000..aa58464c3 --- /dev/null +++ b/doc/dpunetwork_cr_update.puml @@ -0,0 +1,119 @@ +@startuml 
dpunetwork_cr_update + +actor user +box "Kubernetes Control Plane" +participant k8s_api +participant dpu_network_controller +participant configmap as "ConfigMap\ndpu-device-plugin-config" +end box + +box "Host Node" +participant kubelet_host as "kubelet (Host)" +participant dpu_daemon_host as "dpu-daemon (Host)\n(Device Plugin Manager + Device Plugin)" +participant vsp_host as "vsp (Host)" +end box + +box "DPU Node" +participant kubelet_dpu as "kubelet (DPU)" +participant dpu_daemon_dpu as "dpu-daemon (DPU)\n(Device Plugin Manager + Device Plugin)" +participant vsp_dpu as "vsp (DPU)" +end box + +autonumber + +== DpuNetwork Update (ConfigMap Approach) == + +note right of user: **Prerequisites:**\nDpuNetwork CR already created\nSee: dpunetwork_cr_creation.puml + +user -> k8s_api: Update DpuNetwork CR\n(change vfId ranges: "0-2", "5-7") +activate k8s_api + +k8s_api -> dpu_network_controller: Reconcile Event +activate dpu_network_controller + +dpu_network_controller -> dpu_network_controller: Re-evaluate VF selection\n(new ranges: [0,1,2,5,6,7]) + +dpu_network_controller -> dpu_network_controller: Generate updated ConfigMap data\n(update network-1 resource definition) + +dpu_network_controller -> k8s_api: Update ConfigMap\ndpu-device-plugin-config +activate k8s_api +note right: **ConfigMap Updated**\n\nconfig.json updated:\n "resources": [\n {\n "resourceName": "openshift.io/dpunetwork-dpu-network-1",\n "vfRanges": ["0-2", "5-7"],\n ...\n }\n ] +k8s_api -> configmap: ConfigMap Updated +activate configmap +k8s_api -> dpu_network_controller: ConfigMap Updated +deactivate k8s_api + +== Host dpu-daemon Reloads ConfigMap == + +configmap -> dpu_daemon_host: ConfigMap Change Event\n(resource definition updated) +activate dpu_daemon_host + +dpu_daemon_host -> k8s_api: Get Updated ConfigMap +activate k8s_api +k8s_api -> dpu_daemon_host: ConfigMap with updated host entry +deactivate k8s_api + +dpu_daemon_host -> dpu_daemon_host: Parse config.json\nFilter node-role = host +note 
right: Host entry still only lists VF ranges\nOld: [0,1,2,3,5,6,7]\nNew: [0,1,2,5,6,7] (VF 3 removed) + +dpu_daemon_host -> vsp_host: GetDevices() +activate vsp_host +vsp_host -> vsp_host: Return host-visible VF devices +vsp_host -> dpu_daemon_host: Device inventory +deactivate vsp_host + +dpu_daemon_host -> dpu_daemon_host: Rebuild device list\nApply new vfRanges (remove VF 3) + +dpu_daemon_host -> dpu_daemon_host: Update ListAndWatch\n(advertise host resource only) + +dpu_daemon_host -> kubelet_host: ListAndWatch Update\n(device-3 removed from resource) +activate kubelet_host +kubelet_host -> kubelet_host: Update node capacity\n"openshift.io/dpunetwork-dpu-network-1": 6 (host) +deactivate kubelet_host + +deactivate dpu_daemon_host + +== DPU dpu-daemon Reloads ConfigMap == + +configmap -> dpu_daemon_dpu: ConfigMap Change Event\n(resource definition updated) +activate dpu_daemon_dpu + +dpu_daemon_dpu -> k8s_api: Get Updated ConfigMap +activate k8s_api +k8s_api -> dpu_daemon_dpu: ConfigMap with updated DPU entry +deactivate k8s_api + +dpu_daemon_dpu -> dpu_daemon_dpu: Parse config.json\nFilter node-role = dpu +note right: DPU entry mirrors host VF change\n(and may include RPM/veth adjustments) + +dpu_daemon_dpu -> vsp_dpu: GetDevices() +activate vsp_dpu +vsp_dpu -> vsp_dpu: Return VF/RPM/veth inventory +vsp_dpu -> dpu_daemon_dpu: Device inventory +deactivate vsp_dpu + +dpu_daemon_dpu -> dpu_daemon_dpu: Rebuild device lists\n- VF repr filtered with new ranges\n- RPM/veth unchanged (if not in diff) + +dpu_daemon_dpu -> dpu_daemon_dpu: Update ListAndWatch\n(advertise VF + RPM + veth) + +dpu_daemon_dpu -> kubelet_dpu: ListAndWatch Update\n(update capacities per resource) +activate kubelet_dpu +kubelet_dpu -> kubelet_dpu: Update node capacity\n"openshift.io/dpunetwork-dpu-network-1": 6 (DPU VF)\n(RPM/veth counts unchanged) +deactivate kubelet_dpu + +deactivate dpu_daemon_dpu +deactivate configmap + +dpu_network_controller -> k8s_api: Update NetworkAttachmentDefinition 
+activate k8s_api +k8s_api -> dpu_network_controller: NAD Updated +deactivate k8s_api + +dpu_network_controller -> k8s_api: Update DpuNetwork Status +deactivate dpu_network_controller +deactivate k8s_api + +note right of user: **Related Diagrams:**\n- dpunetwork_cr_creation.puml (initial setup)\n- dpunetwork_cr_deletion.puml (cleanup flow) + +@enduml + diff --git a/doc/nf_Create.pdf b/doc/nf_Create.pdf new file mode 100644 index 000000000..0f685aed8 Binary files /dev/null and b/doc/nf_Create.pdf differ diff --git a/doc/nf_create.puml b/doc/nf_create.puml new file mode 100644 index 000000000..8534bd1e1 --- /dev/null +++ b/doc/nf_create.puml @@ -0,0 +1,104 @@ +@startuml pod_creation_nf_disruptive + +actor user +box "Kubernetes Control Plane" +participant k8s_api +participant nri as "network-resources-injector (NRI)\nMutating Webhook" +end box + +box "DPU Node" +participant kubelet +participant dpu_daemon as "dpu-daemon\n(Device Plugin Manager + Device Plugin)" +participant vendor_plugin as "vsp (vendor plugin)" +end box + +autonumber + +== NF Pod Launch: BridgeID and IsDisruptive via NAD == + +note right of user: **Prerequisites:**\nDpuNetwork CR created and ready\nDevice Plugin registered\nNetwork Resources Injector (NRI) deployed (via DpuOperatorConfig)\nSee: dpunetwork_cr_creation.puml + +user -> k8s_api: Create NF Pod with DpuNetwork annotation +activate k8s_api +note right: **NF Pod YAML (Disruptive Mode)**\n\napiVersion: v1\nkind: Pod\nmetadata:\n name: "my-nf"\n namespace: openshift-dpu-operator\n annotations:\n dpu.config.openshift.io/dpu-network: dpu-network-1 # controller fills NAD list\nspec:\n nodeSelector:\n dpu.config.openshift.io/dpuside: "dpu"\n containers:\n - name: "my-nf"\n image: "ghcr.io/ovn-kubernetes/kubernetes-traffic-flow-tests:latest"\n resources:\n requests:\n openshift.io/dpunetwork-dpu-network-1: "7" # all VF representors from CR\n openshift.io/rpm-disruptive: "1" # 1 RPM when IsDisruptive=true\n limits:\n 
openshift.io/dpunetwork-dpu-network-1: "7"\n openshift.io/rpm-disruptive: "1"\n\n**Only DpuNetwork annotation provided by user**\n**Controller patches k8s.v1.cni.cncf.io/networks** + +== Admission (Mutating Webhook) == + +k8s_api -> nri: AdmissionReview (Pod CREATE)\ncheck dpu-network annotation + dpuside nodeSelector +activate nri +note right of nri: NRI logic (high-level):\n- Read DpuNetwork CR (dpu-network-1)\n- Determine side from nodeSelector\n dpuside=dpu → DPU-side injection\n- Determine required VF count (e.g., from resource request)\n- If IsDisruptive=true: inject VF NAD repeated N times + rpm-disruptive NAD\n- Else: inject single non-disruptive NAD +nri -> k8s_api: Patch Pod annotation\nk8s.v1.cni.cncf.io/networks = [dpunetwork-1-nad x7, rpm-disruptive] +deactivate nri + +note right of k8s_api: After mutation, the stored Pod has a NAD list.\nMultus will call CNI ADD once per NAD entry.\n(Repeating the VF NAD N times yields N CNI cmdADD calls.) + +k8s_api -> kubelet: Pod Scheduled to Node +activate kubelet + +kubelet -> kubelet: Check Resource Requests\n"openshift.io/dpunetwork-dpu-network-1": 7 (all VFs from CR)\n"openshift.io/rpm-disruptive": 1 (RPM port) + +kubelet -> dpu_daemon: Allocate(AllocateRequest)\nContainerRequests for 7 VF devices + 1 RPM device\nResource: "openshift.io/dpunetwork-dpu-network-1" (x7)\nResource: "openshift.io/rpm-disruptive" (x1) +activate dpu_daemon + +dpu_daemon -> dpu_daemon: Lookup resources in ConfigMap\n"openshift.io/dpunetwork-dpu-network-1"\n"openshift.io/rpm-disruptive" + +dpu_daemon -> dpu_daemon: Allocate devices from resource pool\nDevice-1 (VF 1)\nDevice-2 (VF 2)\nDevice-3 (VF 3)\nDevice-4 (VF 4)\nDevice-5 (VF 5)\nDevice-6 (VF 6)\nDevice-7 (VF 7)\nDevice-RPM (RPM interface) + +dpu_daemon -> kubelet: AllocateResponse\n(NF-DEV env var for all 8 devices) +deactivate dpu_daemon + +note right of kubelet: **NAD processing order**\n1. VF NAD (`dpunetwork-1`) triggers 7 CNI cmdADD calls (one per VF)\n2. 
rpm-disruptive NAD triggers 1 CNI cmdADD call for RPM port\n3. Total: 8 CNI cmdADD calls, each triggers a createNetworkFunction call to VSP + +== Multiple CNI cmdADD Calls == + +loop for each of 7 VF representors from CR (Device-1 to Device-7) +kubelet -> dpu_daemon: Execute dpu-cni (cmdADD)\nNAD config: dpunetwork-1\nVF index: 1..7 +activate dpu_daemon + +dpu_daemon -> dpu_daemon: cniCmdNfAddHandler\n(PodRequest with VF info) + +dpu_daemon -> dpu_daemon: Prepare NFRequest\nExtract BridgeID from DpuNetwork CR\nVF MAC address\nVF device info + +dpu_daemon -> vendor_plugin: CreateNetworkFunction(gRPC)\nNFRequest{\n input: "aa:bb:cc:dd:ee:0X" (VF MAC),\n output: "",\n bridge_id: "x1",\n is_disruptive: true,\n vf_index: X\n} +activate vendor_plugin + +vendor_plugin -> vendor_plugin: Validate BridgeID="x1"\nEntry in VSP Map:\n Key: "x1"\n Value: {\n pod_name: "my-nf",\n pod_uid: "abc-123-def-456",\n entries: [\n {vf_index: X, input_mac, device, ...}\n ]\n } + +vendor_plugin -> dpu_daemon: NetworkFunction Created +deactivate vendor_plugin + +dpu_daemon -> kubelet: CNI Success (VF) +deactivate dpu_daemon +end + +== RPM Port CNI cmdADD Call == + +kubelet -> dpu_daemon: Execute dpu-cni (cmdADD)\nNAD config: rpm-disruptive +activate dpu_daemon + +dpu_daemon -> dpu_daemon: cniCmdNfAddHandler\n(PodRequest with RPM info) +note right: **PodRequest**\n\nPodRequest:\n PodName: "my-nf"\n PodNamespace: "openshift-dpu-operator"\n PodUID: "abc-123-def-456"\n Netns: "/proc/1234/ns/net"\n CNIConf:\n BridgeID: "x1"\n IsDisruptive: true\n MAC: "aa:bb:cc:dd:ee:ff" (RPM MAC) + +dpu_daemon -> dpu_daemon: Check IsDisruptive flag\nIsDisruptive = true\nPrepare NFRequest for RPM + +dpu_daemon -> vendor_plugin: CreateNetworkFunction(gRPC)\nNFRequest{\n input: "aa:bb:cc:dd:ee:ff" (RPM MAC),\n output: "",\n bridge_id: "x1",\n is_disruptive: true,\n is_rpm: true\n} +activate vendor_plugin + +vendor_plugin -> vendor_plugin: Update VSP Map Entry\n Key: "x1"\n Value: {\n pod_name: "my-nf",\n pod_uid: 
"abc-123-def-456",\n entries: [\n {vf_index: 1, input_mac, device, ...},\n {vf_index: 2, input_mac, device, ...},\n ...\n {vf_index: 7, input_mac, device, ...},\n {is_rpm: true, input_mac: "aa:bb:cc:dd:ee:ff", ...}\n ]\n } + +vendor_plugin -> vendor_plugin: Configure Flow Rules for all entries\nBridge: br-x1\nAll VF interfaces + RPM interface\nDisruptive mode: traffic via single RPM + +vendor_plugin -> dpu_daemon: NetworkFunction Created +deactivate vendor_plugin + +dpu_daemon -> kubelet: CNI Success (RPM) +deactivate dpu_daemon + +kubelet -> k8s_api: Update Pod Status (Running) +deactivate kubelet +deactivate k8s_api + +note right of user: **VSP Map Structure (Keyed by BridgeID)**\n\nVSP Maintains:\n Map\n\n Key: "x1"\n Value: PodNetworkFunctionInfo {\n pod_name: "my-nf"\n pod_namespace: "openshift-dpu-operator"\n pod_uid: "abc-123-def-456"\n bridge_id: "x1"\n is_disruptive: true\n entries: [\n {\n type: "VF",\n vf_index: 1,\n input_mac: "aa:bb:cc:dd:ee:01",\n input_device: "0000:00:07.1",\n netns: "/proc/1234/ns/net"\n },\n ... 
(6 more VF entries) ...\n {\n type: "VF",\n vf_index: 7,\n input_mac: "aa:bb:cc:dd:ee:07",\n input_device: "0000:00:07.7",\n netns: "/proc/1234/ns/net"\n },\n {\n type: "RPM",\n input_mac: "aa:bb:cc:dd:ee:ff",\n input_device: "rpm-0",\n netns: "/proc/1234/ns/net"\n }\n ]\n }\n\n**Summary:**\n- 8 total CNI cmdADD calls (7 VFs + 1 RPM)\n- 8 createNetworkFunction calls to VSP\n- 1 VSP map entry per pod, keyed by BridgeID\n- All NF interfaces (VFs + RPM) tracked in single entry\n\n**Related Diagrams:**\n- pod_creation_regular.puml for regular pods\n- dpunetwork_cr_creation.puml for CR setup + +@enduml + diff --git a/doc/pod_Create.pdf b/doc/pod_Create.pdf new file mode 100644 index 000000000..4839de792 Binary files /dev/null and b/doc/pod_Create.pdf differ diff --git a/doc/pod_create.puml b/doc/pod_create.puml new file mode 100644 index 000000000..ddce80892 --- /dev/null +++ b/doc/pod_create.puml @@ -0,0 +1,165 @@ +@startuml pod_creation_regular + +actor user +box "Kubernetes Control Plane" +participant k8s_api +participant nri as "network-resources-injector (NRI)\nMutating Webhook" +end box + +box "Host Node" +participant kubelet +participant device_plugin as "device_plugin" +participant cni_plugin +participant cni_server +participant host_side_manager +end box + +box "DPU Node" +participant dpu_side_manager +participant vendor_plugin +end box + +autonumber + +== Pod Creation Using Different DpuNetworks == + +note right of user: **Prerequisites:**\nDpuNetwork CR created and ready\nDevice Plugin registered\nNetwork Resources Injector (NRI) deployed (via DpuOperatorConfig)\nSee: dpunetwork_cr_creation.puml + +user -> k8s_api: Create Pod 1 with DpuNetwork annotation +activate k8s_api +note right: **Pod 1 → DpuNetwork 1**\nannotations:\n dpu.config.openshift.io/dpu-network: dpu-network-1\n\nresources:\n requests:\n openshift.io/dpunetwork-dpu-network-1: "1"\n\n**User references DpuNetwork; NRI injects Multus NAD annotation** + +== Admission (Mutating Webhook) == + 
+k8s_api -> nri: AdmissionReview (Pod CREATE)\ncheck dpu-network annotation +activate nri +nri -> k8s_api: Patch Pod annotation\nk8s.v1.cni.cncf.io/networks = default/dpunetwork-1-nad +deactivate nri + +note right of k8s_api: After mutation, the stored Pod has\nk8s.v1.cni.cncf.io/networks pointing to the NAD\nso Multus will act on it. + +k8s_api -> kubelet: Pod Scheduled to Node +activate kubelet + +kubelet -> kubelet: Check Resource Request\n"openshift.io/dpunetwork-dpu-network-1": 1 + +kubelet -> device_plugin: Allocate(AllocateRequest)\nResource: "openshift.io/dpunetwork-dpu-network-1" +activate device_plugin +note right: ContainerRequests:\n DevicesIDs: ["device-2"]\n (Kubelet selects from\n available devices in pool) + +device_plugin -> device_plugin: Lookup resource in ConfigMap\n"openshift.io/dpunetwork-dpu-network-1" + +device_plugin -> device_plugin: Validate DeviceID\n(device-2 maps to VF 2) + +device_plugin -> device_plugin: Check VF in allowed range\n(VF 2 is in [0,1,2,3,5,6,7] ✓)\n(from ConfigMap resource definition) + +device_plugin -> device_plugin: Allocate device\nSet NF-DEV env var\nNF-DEV="0000:00:07.2" +device_plugin -> kubelet: AllocateResponse\n(NF-DEV env var) +deactivate device_plugin + +kubelet -> cni_plugin: Execute dpu-cni\n(NAD config with BridgeID & IsDisruptive) +activate cni_plugin + +cni_plugin -> cni_server: POST /cni\n(NAD config) +activate cni_server + +cni_server -> host_side_manager: cniCmdAddHandler\n(PodRequest with BridgeID & IsDisruptive) +activate host_side_manager + +== Bridge Port Creation == + +host_side_manager -> dpu_side_manager: CreateBridgePort(gRPC{BridgePortID,IsDisruptive,Bridge}) +activate dpu_side_manager +note right: CreateBridgePortRequest:\n BridgePortID: "x1"\n IsDisruptive:true BridgePort:\n Name: "host0-1"\n Spec:\n MacAddress: "..."\n + +dpu_side_manager -> vendor_plugin: CreateBridgePort(OPI API) +activate vendor_plugin + +vendor_plugin -> vendor_plugin: Check IsDisruptive flag\nfrom 
CreateBridgePortRequest + +alt IsDisruptive = true + vendor_plugin -> vendor_plugin: Lookup NF in local registry\nby BridgeID (BridgePortID="x1") + + alt NF not found in registry + vendor_plugin -> dpu_side_manager: Error: NF not created yet\nPlease use non-Disruptive mode to Host + deactivate vendor_plugin + dpu_side_manager -> host_side_manager: Error Response + deactivate dpu_side_manager + host_side_manager -> cni_server: CNI Error + deactivate host_side_manager + cni_server -> cni_plugin: HTTP Error Response + deactivate cni_server + cni_plugin -> kubelet: CNI Error + deactivate cni_plugin + kubelet -> k8s_api: Update Pod Status (Failed) + deactivate kubelet + deactivate k8s_api + note right: **Error Flow:**\nNF must be created first\nusing CreateNetworkFunction\nbefore host side can connect\nin disruptive mode + else NF found in registry + vendor_plugin -> vendor_plugin: Fetch PodNetworkFunctionInfo from VSP map\nKey = BridgePortID "x1" + note right: **VSP Map Structure (Keyed by BridgeID)**\n\nVSP Maintains:\nMap\n\nKey: "x1"\nValue: PodNetworkFunctionInfo {\n pod_name: "my-nf"\n pod_namespace: "openshift-dpu-operator"\n pod_uid: "abc-123-def-456"\n bridge_id: "x1"\n is_disruptive: true\n entries: [\n {type: "VF", vf_index: 1, input_mac: "aa:bb:cc:dd:ee:01", input_device: "0000:00:07.1", netns: "/proc/1234/ns/net"},\n ...\n {type: "VF", vf_index: 7, input_mac: "aa:bb:cc:dd:ee:07", input_device: "0000:00:07.7", netns: "/proc/1234/ns/net"},\n {type: "RPM", input_mac: "aa:bb:cc:dd:ee:ff", input_device: "rpm-0", netns: "/proc/1234/ns/net"}\n ]\n} + vendor_plugin -> vendor_plugin: Determine vf_index from host VF\n(using NF-DEV mapping) + vendor_plugin -> vendor_plugin: Check VF entry exists in NF map\n(BridgeID "x1", vf_index) + alt VF entry exists in NF map + vendor_plugin -> dpu_side_manager: BridgePort Verified\n(VF already part of NF map) + deactivate vendor_plugin + + dpu_side_manager -> host_side_manager: BridgePort Response + deactivate dpu_side_manager + 
+ host_side_manager -> cni_server: CNI Result + deactivate host_side_manager + + cni_server -> cni_plugin: HTTP Response + deactivate cni_server + + cni_plugin -> kubelet: CNI Success + deactivate cni_plugin + else VF entry missing in NF map + vendor_plugin -> dpu_side_manager: Error: Device not added to NF (Disruptive) + deactivate vendor_plugin + dpu_side_manager -> host_side_manager: Error Response + deactivate dpu_side_manager + host_side_manager -> cni_server: CNI Error + deactivate host_side_manager + cni_server -> cni_plugin: HTTP Error Response + deactivate cni_server + cni_plugin -> kubelet: CNI Error + deactivate cni_plugin + kubelet -> k8s_api: Update Pod Status (Failed) + deactivate kubelet + deactivate k8s_api + note right: **Error Flow:**\nHost VF must match an entry\ncreated during NF registration + end + end +else IsDisruptive = false (Non-Disruptive Mode) + vendor_plugin -> vendor_plugin: Select/Create bridge\nbr-x1 (from bridge_id) + + vendor_plugin -> vendor_plugin: AddPortToDataPlane\n(bridge=br-x1, vfName) + + vendor_plugin -> vendor_plugin: AddFlowRuleToDataPlane\n(bridge=br-x1, ...) + + vendor_plugin -> dpu_side_manager: BridgePort Created + deactivate vendor_plugin + + dpu_side_manager -> host_side_manager: BridgePort Response + deactivate dpu_side_manager + + host_side_manager -> cni_server: CNI Result + deactivate host_side_manager + + cni_server -> cni_plugin: HTTP Response + deactivate cni_server + + cni_plugin -> kubelet: CNI Success + deactivate cni_plugin +end + +kubelet -> k8s_api: Update Pod 1 Status (Running) +deactivate kubelet +deactivate k8s_api + +note right of user: **Related Diagrams:**\n- pod_creation_nf_disruptive.puml for NF pods\n- dpunetwork_cr_creation.puml for CR setup + +@enduml +
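Reviewer sketch (not part of the patch): the diagrams in dpunetwork_cr_create.puml and dpunetwork_cr_update.puml describe two small steps that are easy to get subtly wrong: expanding `vfRanges` such as `"0-3"` into `[0,1,2,3]`, and each dpu-daemon keeping only the `config.json` entries whose `nodeSelector` matches its own `node-role`. The Python below is a minimal illustration of those two steps under stated assumptions. The function names `expand_ranges` and `resources_for_role` are hypothetical, the real daemon is not shown in this diff, and the sample config omits the `//` comment from the diagram because comments are not valid JSON.

```python
import json


def expand_ranges(ranges):
    """Expand range strings like ["0-3", "5-7"] into a sorted list of ints.

    A bare number such as "4" is treated as a single-element range
    (an assumption; the diagrams only show "a-b" forms).
    """
    ids = set()
    for item in ranges:
        start, sep, end = item.partition("-")
        lo = int(start)
        hi = int(end) if sep else lo
        if hi < lo:
            raise ValueError(f"descending range: {item!r}")
        ids.update(range(lo, hi + 1))
    return sorted(ids)


def resources_for_role(config_json, role):
    """Return the resource entries whose nodeSelector matches node-role=role."""
    config = json.loads(config_json)
    selected = []
    for entry in config.get("resources", []):
        labels = entry.get("nodeSelector", {}).get("matchLabels", {})
        if labels.get("node-role") == role:
            selected.append(entry)
    return selected


# Sample data mirroring the ConfigMap note in dpunetwork_cr_create.puml.
CONFIG_JSON = json.dumps({
    "resources": [
        {
            "resourceName": "openshift.io/dpunetwork-dpu-network-1",
            "dpuNetworkName": "dpu-network-1",
            "nodeSelector": {"matchLabels": {"node-role": "host"}},
            "vfRanges": ["0-3", "5-7"],
        },
        {
            "resourceName": "openshift.io/dpunetwork-dpu-network-1",
            "dpuNetworkName": "dpu-network-1",
            "nodeSelector": {"matchLabels": {"node-role": "dpu"}},
            "vfRanges": ["0-3", "5-7"],
            "rpmRanges": ["0-0"],
            "vethRanges": ["0-1"],
        },
    ]
})

host_entries = resources_for_role(CONFIG_JSON, "host")
dpu_entries = resources_for_role(CONFIG_JSON, "dpu")
host_vfs = expand_ranges(host_entries[0]["vfRanges"])
dpu_veths = expand_ranges(dpu_entries[0]["vethRanges"])
```

With the sample data, the host entry expands to seven VF IDs, which agrees with the capacity of 7 that the create diagram shows for the host kubelet; after the update to `"0-2", "5-7"` the same expansion yields six.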