Skip to content

Conversation

@zhiqiangf
Copy link

@zhiqiangf zhiqiangf commented Dec 5, 2025

Add comprehensive SR-IOV environment helper functions:

  • Network and policy management (Create, Remove, Wait functions)
  • VF initialization for standard and DPDK modes
  • Pod creation and verification utilities
  • Cluster stability checks (MCP, SR-IOV operator status)
  • Traffic verification helpers

All functions use eco-goinfra builders and return errors
instead of using Gomega assertions (as per framework rules).

Summary by CodeRabbit

  • Tests
    • Enhanced SR‑IOV test environment with operator/status checks, network/NAD lifecycle, policy management, VF (including DPDK) initialization, pod-based utilities, interface/PCI checks, and cleanup.
    • Added device configuration support, environment override parsing for devices, VF count handling, and a default device list (vf_num set).
    • New network option builders (VLAN, QoS, min/max tx rate, link state) and policy/MTU helpers.
    • Expanded timeouts, intervals and networking constants; removed one legacy timeout constant.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 5, 2025

📝 Walkthrough

Walkthrough

Adds SR‑IOV device configuration parsing and VF count handling, a new comprehensive SR‑IOV test environment module (network/NAD/policy/VF lifecycle, pod helpers, discovery, and cleanup), many test timing/network constants, and a small tsparams cleanup (removed DefaultTimeout/import).

Changes

Cohort / File(s) Change Summary
Configuration & defaults
tests/ocp/sriov/internal/ocpsriovconfig/config.go, tests/ocp/sriov/internal/ocpsriovconfig/default.yaml
Added DeviceConfig type and Devices, DevicesEnv, VFNum fields; added GetSriovDevices(), parseSriovDevicesEnv(), GetVFNum(); adjusted PathToDefaultOcpSriovParamsFile; populated default.yaml with vf_num and device entries.
SR‑IOV test environment
tests/ocp/sriov/internal/sriovenv/sriovenv.go
New large module implementing operator/MCP checks, SR‑IOV policy creation/removal, NAD/network creation and wait logic, VF init (netdevice/DPDK), interface discovery, pod test helpers, verification utilities, PCI/address helpers, and cleanup; plus NetworkOption builders.
Test constants
tests/ocp/sriov/internal/tsparams/consts.go
Added many exported timeouts/intervals, test MACs/IPs, DefaultTestMTU, VLAN/QoS and rate constants, vendor ID, and test resource label constants.
tsparams cleanup
tests/ocp/sriov/internal/tsparams/ocpsriovvars.go
Removed DefaultTimeout and time import; updated package-level comment for labels.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested reviewers

  • evgenLevin
  • ajaggapa
  • gkopels

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'ocp-sriov: add sriovenv helper functions' clearly and concisely describes the primary change: introducing new helper functions for SR-IOV environment management. It directly aligns with the main changeset objectives.
Docstring Coverage ✅ Passed Docstring coverage is 97.37% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@zhiqiangf zhiqiangf force-pushed the pr2-sriov-sriovenv branch 6 times, most recently from bfb7a3b to 7e0a1c5 Compare December 5, 2025 17:30
@zhiqiangf zhiqiangf marked this pull request as ready for review December 5, 2025 17:44
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/ocp/internal/ocpconfig/config.go (1)

77-84: Handle io.EOF for empty config files.

yaml.Decoder.Decode returns io.EOF when the file is empty. Since the GeneralConfig preservation logic (lines 38-55) is designed to handle empty files, this should not be treated as a fatal error.

+import "io"
+
 func readFile(ocpConfig *OcpConfig, cfgFile string) error {
 	openedCfgFile, err := os.Open(cfgFile)
 	if err != nil {
 		return err
 	}

 	defer func() {
 		_ = openedCfgFile.Close()
 	}()

 	decoder := yaml.NewDecoder(openedCfgFile)

 	err = decoder.Decode(ocpConfig)
-	if err != nil {
+	if err != nil && err != io.EOF {
 		return err
 	}

 	return nil
 }
🧹 Nitpick comments (4)
tests/ocp/sriov/internal/tsparams/sriovvars.go (2)

65-71: Default SupportsMinTxRate: true may not be accurate for all devices.

When users specify devices via SRIOV_DEVICES, defaulting SupportsMinTxRate to true could cause test failures for devices that don't support it. Consider adding an optional 5th field to the format (e.g., name:deviceid:vendor:interface:minTxRate).


85-89: Consider returning error instead of panic for invalid config.

Panicking in configuration parsing can crash the entire test suite. Returning an error would allow callers to handle this more gracefully (e.g., skip tests that require device config).

-// GetDeviceConfig returns device configuration from environment variable or defaults.
-func GetDeviceConfig() []DeviceConfig {
+// GetDeviceConfig returns device configuration from environment variable or defaults.
+// Returns error if SRIOV_DEVICES is set but contains no valid entries.
+func GetDeviceConfig() ([]DeviceConfig, error) {
 	envDevices := os.Getenv("SRIOV_DEVICES")

 	if devices := parseDeviceConfig(); len(devices) > 0 {
-		return devices
+		return devices, nil
 	}

 	if envDevices != "" {
-		panic(fmt.Sprintf(
+		return nil, fmt.Errorf(
 			"SRIOV_DEVICES is set to %q but no valid entries could be parsed; "+
-				"expected format: name:deviceid:vendor:interface", envDevices))
+				"expected format: name:deviceid:vendor:interface", envDevices)
 	}

-	return GetDefaultDeviceConfig()
+	return GetDefaultDeviceConfig(), nil
 }
tests/ocp/sriov/internal/tsparams/consts.go (1)

59-62: MAC addresses are not locally administered.

The test MAC addresses 20:04:0f:f1:88:01 and 20:04:0f:f1:88:02 are not locally administered (the second least significant bit of the first octet should be 1, e.g., 22:04:...). While this likely won't cause issues in isolated test environments, using locally administered MACs is best practice to avoid potential conflicts.

-	// TestPodClientMAC is the default MAC address for test client pods.
-	TestPodClientMAC = "20:04:0f:f1:88:01"
-	// TestPodServerMAC is the default MAC address for test server pods.
-	TestPodServerMAC = "20:04:0f:f1:88:02"
+	// TestPodClientMAC is the default MAC address for test client pods (locally administered).
+	TestPodClientMAC = "02:04:0f:f1:88:01"
+	// TestPodServerMAC is the default MAC address for test server pods (locally administered).
+	TestPodServerMAC = "02:04:0f:f1:88:02"
tests/ocp/sriov/internal/sriovenv/sriovenv.go (1)

21-21: Dot import for ocpsriovinittools.

Using dot imports (. "...") can reduce code clarity by making it unclear where symbols like APIClient and SriovOcpConfig originate. Consider using a named import for better readability, unless this is an established pattern in the codebase.

-	. "github.com/rh-ecosystem-edge/eco-gotests/tests/ocp/sriov/internal/ocpsriovinittools"
+	"github.com/rh-ecosystem-edge/eco-gotests/tests/ocp/sriov/internal/ocpsriovinittools"

Then update usages to ocpsriovinittools.APIClient, ocpsriovinittools.SriovOcpConfig, etc.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b1c772b and 7e0a1c52190b13670ccdebc28da2e89907a87109.

📒 Files selected for processing (6)
  • tests/ocp/internal/ocpconfig/config.go (2 hunks)
  • tests/ocp/sriov/internal/ocpsriovconfig/config.go (4 hunks)
  • tests/ocp/sriov/internal/ocpsriovconfig/default.yaml (0 hunks)
  • tests/ocp/sriov/internal/sriovenv/sriovenv.go (1 hunks)
  • tests/ocp/sriov/internal/tsparams/consts.go (1 hunks)
  • tests/ocp/sriov/internal/tsparams/sriovvars.go (1 hunks)
💤 Files with no reviewable changes (1)
  • tests/ocp/sriov/internal/ocpsriovconfig/default.yaml
🧰 Additional context used
🧬 Code graph analysis (1)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (1)
tests/ocp/sriov/internal/ocpsriovinittools/ocpsriovinittools.go (1)
  • SriovOcpConfig (13-13)
🔇 Additional comments (10)
tests/ocp/internal/ocpconfig/config.go (1)

38-55: Good defensive pattern for preserving embedded config during YAML decode.

The snapshot-and-restore approach correctly handles the case where yaml.v2 zeros out embedded pointer fields when decoding empty or incomplete YAML files.

tests/ocp/sriov/internal/ocpsriovconfig/config.go (3)

28-41: Well-structured decoupling of SRIOV-specific config fields.

Using separate temporary structs for YAML and environment decoding prevents side effects on embedded OcpConfig/GeneralConfig. This is a clean separation of concerns.


57-62: Good addition of runtime.Caller validation.

Checking the ok return value prevents potential issues from using an empty or invalid filename path.


122-133: Intentional non-empty check prevents clearing YAML values via empty env vars.

The conditional overrides only apply non-empty environment values, preserving YAML defaults. This is typically the desired behavior but worth noting: setting ECO_OCP_SRIOV_OPERATOR_NAMESPACE="" won't clear a YAML-configured value.

tests/ocp/sriov/internal/tsparams/sriovvars.go (2)

24-35: Hardcoded default device configurations.

These defaults provide reasonable fallbacks for common SR-IOV NICs. The SupportsMinTxRate field correctly identifies devices that lack this capability.


94-113: LGTM! Well-structured VF number retrieval with validation.

The function correctly prioritizes env vars, validates parsed values, warns on invalid input, and provides a sensible default.

tests/ocp/sriov/internal/tsparams/consts.go (1)

14-49: Well-organized timeout and interval constants.

The constants are clearly documented and provide sensible defaults for various operations. The separation between different operation types (namespace, pod, policy, cleanup) improves code clarity.

tests/ocp/sriov/internal/sriovenv/sriovenv.go (3)

167-177: Good idempotent cleanup behavior.

Returning nil when the network doesn't exist ensures cleanup operations are idempotent and don't fail unnecessarily.


215-266: LGTM! Robust node fallback pattern for VF initialization.

The function correctly iterates over worker nodes, attempting policy creation on each until one succeeds. Cleanup of failed policies and logging at V(90) provides debugging capability without noise.


628-667: Good debug pod pattern for node-level command execution.

The function correctly creates a privileged debug pod with host namespaces, uses nsenter for host access, and ensures cleanup via defer. The pod name truncation to 63 characters ensures DNS compliance.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/ocp/internal/ocpconfig/config.go (1)

38-55: Pointer preservation looks good; consider treating empty YAML (io.EOF) as non-fatal.

Capturing generalConfig and restoring it when ocpConf.GeneralConfig comes back nil is a sensible safeguard against YAML decoding wiping the embedded GeneralConfig pointer.

One remaining edge case: readFile still propagates any Decode error, including io.EOF for an empty config file. If an empty default.yaml is meant to be valid (i.e., "just use defaults/env"), NewOcpConfig will still fail and return nil even though the pointer-preservation logic would otherwise be sufficient.

You can keep the new behavior and make empty YAMLs safe with something like:

 func readFile(ocpConfig *OcpConfig, cfgFile string) error {
@@
-	err = decoder.Decode(ocpConfig)
-	if err != nil {
-		return err
-	}
+	err = decoder.Decode(ocpConfig)
+	if err != nil && err != io.EOF {
+		return err
+	}
 
 	return nil
 }

(and add io to the imports). This aligns with the intent of preserving GeneralConfig when there is no YAML content at all.

Also applies to: 77-82

♻️ Duplicate comments (1)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (1)

94-107: Still treat empty YAML (io.EOF) as non-fatal when decoding SR-IOV defaults.

The temporary sriovYAMLConfig and copy-back pattern are good, but decoder.Decode(&yamlConf) will still return io.EOF for an empty default.yaml, which currently bubbles up as an error and causes NewSriovOcpConfig to return nil. Given SR-IOV-specific fields are optional and may be provided via env, an empty YAML file should likely be acceptable.

You can align this with the earlier review suggestion by special-casing io.EOF:

 func readFile(sriovOcpConfig *SriovOcpConfig, cfgFile string) error {
@@
-	err = decoder.Decode(&yamlConf)
-	if err != nil {
-		return err
-	}
+	err = decoder.Decode(&yamlConf)
+	if err != nil && err != io.EOF {
+		return err
+	}
@@
 	// Copy decoded values to the actual config
 	sriovOcpConfig.OcpSriovOperatorNamespace = yamlConf.OcpSriovOperatorNamespace
 	sriovOcpConfig.OcpSriovTestContainer = yamlConf.OcpSriovTestContainer

(and add io to the imports). This lets you keep YAML entirely empty while still loading defaults and env-based overrides.

🧹 Nitpick comments (2)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (1)

112-134: Env-only struct cleanly isolates SR-IOV overrides; minor duplication trade-off.

Decoding into sriovEnvConfig and then conditionally copying non-empty values onto SriovOcpConfig neatly avoids clobbering embedded config fields. The string-based checks are straightforward for these keys.

Longer term, consider whether you want to centralize the env tags (they now exist on both SriovOcpConfig and sriovEnvConfig) to avoid drift if variable names change, but that’s a minor maintenance concern and not a blocker.

tests/ocp/sriov/internal/tsparams/consts.go (1)

1-71: Timeouts and test network defaults look consistent; watch out for mutating Labels.

The consolidated timeout/polling constants and the test IP/MAC/MTU defaults are well-structured and self-documenting, making SR-IOV tests easier to tune and reason about. The MAC choices as locally administered unicast addresses are appropriate for test usage.

For Labels, exposing a mutable slice is slightly risky if any test code does:

tsparams.Labels = append(tsparams.Labels, tsparams.LabelBasic)

rather than working on a copy; that could introduce implicit coupling between test files and potential data races under go test -parallel. If that’s a concern, you might prefer a helper like:

func SuiteLabels() []string {
	return []string{LabelSuite}
}

and have each test file append to the returned slice, or at least treat Labels as read-only by convention. Not a blocker, just something to keep in mind.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7e0a1c52190b13670ccdebc28da2e89907a87109 and 87103d4beb0262b6bd9128f2a8b557ba9af63faa.

📒 Files selected for processing (6)
  • tests/ocp/internal/ocpconfig/config.go (2 hunks)
  • tests/ocp/sriov/internal/ocpsriovconfig/config.go (4 hunks)
  • tests/ocp/sriov/internal/ocpsriovconfig/default.yaml (0 hunks)
  • tests/ocp/sriov/internal/sriovenv/sriovenv.go (1 hunks)
  • tests/ocp/sriov/internal/tsparams/consts.go (1 hunks)
  • tests/ocp/sriov/internal/tsparams/sriovvars.go (1 hunks)
💤 Files with no reviewable changes (1)
  • tests/ocp/sriov/internal/ocpsriovconfig/default.yaml
✅ Files skipped from review due to trivial changes (1)
  • tests/ocp/sriov/internal/tsparams/sriovvars.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/ocp/sriov/internal/sriovenv/sriovenv.go
🧰 Additional context used
🧬 Code graph analysis (1)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (1)
tests/ocp/sriov/internal/ocpsriovinittools/ocpsriovinittools.go (1)
  • SriovOcpConfig (13-13)
🔇 Additional comments (2)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (2)

1-42: Good separation of SR-IOV-specific config from embedded OcpConfig.

The package doc plus the new sriovYAMLConfig / sriovEnvConfig types clearly scope YAML/env handling to SR-IOV fields only, avoiding side effects on the embedded OcpConfig/GeneralConfig. This matches the intent of keeping general OCP config loading independent of SR-IOV defaults.


57-62: runtime.Caller guard improves robustness of config path resolution.

Checking the ok return from runtime.Caller and failing early if it’s false is a reasonable defensive improvement for locating default.yaml; the log message is sufficient for debugging in this context.

@zhiqiangf zhiqiangf force-pushed the pr2-sriov-sriovenv branch 8 times, most recently from 2e589ec to 3aadda1 Compare December 11, 2025 15:43
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (1)

201-204: Handle io.EOF for empty YAML files.

When decoding an empty YAML file, yaml.Decoder.Decode returns io.EOF. Since the function sets defaults before reading config files and SR-IOV-specific fields are optional, an empty YAML file should not cause initialization to fail.

Apply this diff to handle empty files gracefully:

+import "io"
+
 	err = decoder.Decode(&yamlConf)
-	if err != nil {
+	if err != nil && err != io.EOF {
 		return err
 	}
🧹 Nitpick comments (2)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (1)

255-306: Consider simplifying warning message format.

The function correctly parses the legacy SRIOV_DEVICES format with appropriate validation. However, line 276 uses %q which will add extra quotes around the already-quoted entry in the output.

Apply this diff for cleaner log output:

-			klog.Warningf("Skipping invalid SRIOV_DEVICES entry %q - expected format: name:deviceid:vendor:interface[:minTxRate]",
-				entry)
+			klog.Warningf("Skipping invalid SRIOV_DEVICES entry %s - expected format: name:deviceid:vendor:interface[:minTxRate]",
+				entry)
tests/ocp/sriov/internal/sriovenv/sriovenv.go (1)

460-472: Consider more robust link state checking.

The VerifyInterfaceReady function checks for "UP" in the output but doesn't verify the interface is actually in an operational state. For test environments this is likely sufficient, but a more robust check could parse the specific link state flags.

If stricter verification is needed in the future, consider parsing the state field:

// Enhanced version that checks for both UP and LOWER_UP flags
output, err := podObj.ExecCommand([]string{"ip", "link", "show", interfaceName})
if err != nil {
    return fmt.Errorf("failed to check interface %q: %w", interfaceName, err)
}

if !strings.Contains(output.String(), "UP") || !strings.Contains(output.String(), "LOWER_UP") {
    return fmt.Errorf("interface %q is not fully operational", interfaceName)
}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9041e6c9ac6070d586b5b0fa67cb605b80576c9c and 3aadda1a59b63d949ccf6eda2279a4a8026450ba.

📒 Files selected for processing (5)
  • tests/ocp/sriov/internal/ocpsriovconfig/config.go (4 hunks)
  • tests/ocp/sriov/internal/ocpsriovconfig/default.yaml (1 hunks)
  • tests/ocp/sriov/internal/sriovenv/sriovenv.go (1 hunks)
  • tests/ocp/sriov/internal/tsparams/consts.go (2 hunks)
  • tests/ocp/sriov/internal/tsparams/ocpsriovvars.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/ocp/sriov/internal/ocpsriovconfig/default.yaml
🧰 Additional context used
🧬 Code graph analysis (2)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (2)
tests/ocp/internal/ocpconfig/config.go (1)
  • OcpConfig (20-22)
tests/ocp/sriov/internal/ocpsriovinittools/ocpsriovinittools.go (1)
  • SriovOcpConfig (13-13)
tests/ocp/sriov/internal/tsparams/ocpsriovvars.go (1)
tests/ocp/sriov/internal/tsparams/consts.go (2)
  • LabelSuite (10-10)
  • LabelBasic (15-15)
🔇 Additional comments (19)
tests/ocp/sriov/internal/tsparams/ocpsriovvars.go (1)

1-17: LGTM!

The reorganization cleanly consolidates timing constants into consts.go and extends the test labels to include LabelBasic. The changes align well with the PR objectives.

tests/ocp/sriov/internal/tsparams/consts.go (3)

14-15: LGTM!

The LabelBasic constant follows the established pattern and enables filtering for basic test cases.


21-44: LGTM!

The timeout values are well-suited for SR-IOV operations. The PolicyApplicationTimeout of 35 minutes is appropriate given that SR-IOV policy changes can trigger MachineConfigPool updates and node reboots.


46-56: LGTM!

The test pod networking constants are well-chosen. The locally administered MAC addresses and private IP range are appropriate for isolated test environments, and the 9000 MTU is standard for SR-IOV performance testing.

tests/ocp/sriov/internal/ocpsriovconfig/config.go (6)

26-88: LGTM!

The DeviceConfig type and default device list provide comprehensive coverage of major SR-IOV NIC vendors. The SupportsMinTxRate field correctly identifies device capabilities, which is important for rate-limiting test scenarios.


45-74: Excellent documentation of the temporary struct pattern.

The extensive comment block clearly explains why the temporary struct approach is necessary to avoid yaml.v2 zeroing embedded pointers. This pattern prevents configuration clobbering and is well-justified.


90-139: LGTM!

The initialization sequence establishes proper configuration precedence: defaults → YAML file → environment variables → legacy environment variables. The enhanced error logging with context improves debuggability.


141-157: LGTM!

The accessor methods provide sensible fallbacks to default values when configuration is not explicitly set, following Go best practices.


226-253: LGTM!

The readEnv function correctly uses the temporary struct pattern to preserve parent config fields, consistent with the readFile implementation.


308-328: LGTM!

The parseLegacyVFNum function correctly implements fallback to the legacy SRIOV_VF_NUM environment variable only when the new ECO_OCP_SRIOV_VF_NUM is not set. The validation ensures positive values and logs warnings for invalid input without failing initialization.

tests/ocp/sriov/internal/sriovenv/sriovenv.go (9)

40-81: LGTM!

The operator and cluster stability checks properly poll for SR-IOV node state synchronization and MachineConfigPool readiness. The error handling correctly continues polling on transient errors.


87-177: LGTM!

The network creation helpers use the functional options pattern effectively, providing a clean API for SR-IOV network configuration. The defensive check in RemoveSriovNetwork (line 173) correctly treats an already-deleted network as success.


183-203: LGTM!

The RemoveSriovPolicy function correctly waits for policy deletion by polling until the policy can no longer be retrieved. The defensive check for already-deleted policies prevents spurious errors.


205-279: LGTM with documented limitation.

The VF initialization functions correctly create SR-IOV policies on the first node where the device is found. The function comments (lines 206-208, 214-216) explicitly document that this is single-node behavior suitable for testing scenarios. The interface discovery logic (lines 239-243) appropriately allows explicit interface specification for multi-port NICs.


362-454: LGTM!

The pod creation and readiness functions correctly handle SR-IOV network attachment and wait for complete readiness (both phase and container statuses). The defensive check in DeleteDpdkTestPod prevents errors when pods are already deleted.


509-638: LGTM!

The traffic verification functions correctly handle the complete lifecycle: pod creation, interface verification, connectivity testing, and cleanup. The VerifyLinkStateConfiguration function appropriately returns (false, nil) on carrier timeout (line 634), allowing tests to skip scenarios where physical connections are not established rather than failing.


644-722: LGTM!

The spoof check verification correctly uses a debug pod with host network and PID namespaces to inspect VF settings on the node. The cleanup logic (lines 690-692, 707) ensures debug pods don't accumulate. The nsenter command construction (line 714) properly enters all necessary namespaces.


728-765: LGTM!

The GetPciAddress function correctly parses the CNI network status annotation to extract PCI address information. The logic to strip the namespace prefix (lines 754-757) ensures correct matching for namespaced network names.


771-850: LGTM!

The cleanup functions implement best-effort cleanup with appropriate error handling. Errors are logged (lines 781, 789, 799, 809, 817, 828) but don't halt the cleanup process, ensuring maximum cleanup even when some operations fail. The regex pattern (line 802) correctly identifies test resources without affecting production networks.

@zhiqiangf zhiqiangf force-pushed the pr2-sriov-sriovenv branch 3 times, most recently from 101d22f to f4504af Compare December 12, 2025 18:51
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (1)

162-170: Handle io.EOF for empty YAML files.

yaml.Decoder.Decode returns io.EOF for empty files. Since SR-IOV fields have defaults via GetDefaultDevices() and GetVFNum(), an empty YAML file should not be a fatal error.

+import "io"
+
 func readFile(sriovOcpConfig *SriovOcpConfig, cfgFile string) error {
 	// ... file open code ...
 
 	err = decoder.Decode(&sriovOcpConfig)
-	if err != nil {
+	if err != nil && err != io.EOF {
 		return err
 	}
 
 	return nil
 }
tests/ocp/sriov/internal/sriovenv/sriovenv.go (1)

303-356: Document potential service disruption.

The delete-and-recreate approach causes VFs to be deallocated and reallocated, resulting in brief service disruption. Add a note to help users decide when to use this function.

 // UpdateSriovPolicyMTU updates the MTU of an existing SR-IOV policy.
 // PolicyBuilder does not have Update method, so we delete and recreate.
+// Note: This causes brief service disruption as VFs are deallocated and reallocated.
+// Consider scheduling this operation during maintenance windows.
 func UpdateSriovPolicyMTU(policyName string, mtuValue int) error {
🧹 Nitpick comments (4)
tests/ocp/sriov/internal/tsparams/consts.go (1)

50-53: Duplicate IP constants with ocpsriovvars.go.

TestPodClientIP and TestPodServerIP duplicate ClientIPv4IPAddress and ServerIPv4IPAddress from ocpsriovvars.go. Consider consolidating to avoid divergence.

Either remove the duplicates here or in ocpsriovvars.go, and use a single source of truth:

-	// TestPodClientIP is the default IP address for test client pods.
-	TestPodClientIP = "192.168.0.1/24"
-	// TestPodServerIP is the default IP address for test server pods.
-	TestPodServerIP = "192.168.0.2/24"

Then update references to use ClientIPv4IPAddress and ServerIPv4IPAddress from ocpsriovvars.go.

tests/ocp/sriov/internal/sriovenv/sriovenv.go (3)

167-177: Error handling may mask actual failures.

PullNetwork could return errors other than "not found" (e.g., network issues, permission errors). Swallowing all errors with return nil may hide actual failures.

 func RemoveSriovNetwork(name string, timeout time.Duration) error {
 	sriovOpNs := SriovOcpConfig.OcpSriovOperatorNamespace
 
 	network, err := sriov.PullNetwork(APIClient, name, sriovOpNs)
 	if err != nil {
-		return nil // Already deleted
+		// Check if it's a "not found" error vs other errors
+		if strings.Contains(err.Error(), "not found") {
+			return nil // Already deleted
+		}
+		return fmt.Errorf("failed to pull network %q: %w", name, err)
 	}
 
 	return network.DeleteAndWait(timeout)
 }

183-203: Same error handling concern as RemoveSriovNetwork.

Consider distinguishing "not found" from other errors for better debuggability.


683-722: Potential pod name collision with long node names.

When truncating to 63 characters, different node names could produce the same pod name if they share the first ~50 characters. This is unlikely in practice but could cause issues in large clusters.

Consider appending a short hash of the full node name:

 func executeOnNode(nodeName, namespace string, cmd []string) (string, error) {
-	debugPodName := "sriov-debug-" + strings.ReplaceAll(nodeName, ".", "-")
-	if len(debugPodName) > 63 {
-		debugPodName = debugPodName[:63]
-	}
+	sanitized := strings.ReplaceAll(nodeName, ".", "-")
+	if len(sanitized) > 50 {
+		// Use first 45 chars + hash suffix to ensure uniqueness
+		sanitized = sanitized[:45] + fmt.Sprintf("%x", sha256.Sum256([]byte(nodeName)))[:4]
+	}
+	debugPodName := "sriov-debug-" + sanitized
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3aadda1a59b63d949ccf6eda2279a4a8026450ba and f4504af64e28d3de907288e0a6cb1ebf233dd35e.

📒 Files selected for processing (5)
  • tests/ocp/sriov/internal/ocpsriovconfig/config.go (5 hunks)
  • tests/ocp/sriov/internal/ocpsriovconfig/default.yaml (1 hunks)
  • tests/ocp/sriov/internal/sriovenv/sriovenv.go (1 hunks)
  • tests/ocp/sriov/internal/tsparams/consts.go (2 hunks)
  • tests/ocp/sriov/internal/tsparams/ocpsriovvars.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/ocp/sriov/internal/ocpsriovconfig/default.yaml
🧰 Additional context used
🧬 Code graph analysis (2)
tests/ocp/sriov/internal/tsparams/ocpsriovvars.go (1)
tests/ocp/sriov/internal/tsparams/consts.go (1)
  • LabelSuite (10-10)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (3)
tests/cnf/ran/ptp/internal/iface/iface.go (1)
  • Name (22-22)
tests/ocp/internal/ocpconfig/config.go (1)
  • OcpConfig (20-22)
tests/ocp/sriov/internal/ocpsriovinittools/ocpsriovinittools.go (1)
  • SriovOcpConfig (13-13)
🔇 Additional comments (14)
tests/ocp/sriov/internal/tsparams/ocpsriovvars.go (1)

1-19: LGTM!

The documentation improvements clarify the intended usage of suite-level vs feature-specific labels.

tests/ocp/sriov/internal/ocpsriovconfig/config.go (4)

26-57: LGTM!

The DeviceConfig struct and default devices list provide a clean abstraction for SR-IOV device configuration with appropriate vendor-specific feature flags.


36-43: LGTM!

The new Devices and VFNum fields with proper YAML/envconfig tags extend the configuration model cleanly.


106-122: LGTM!

Accessor methods properly return defaults when configuration is not explicitly set.


181-232: LGTM!

The environment variable parsing handles validation, empty entries, and optional fields gracefully with appropriate warning logs.

tests/ocp/sriov/internal/tsparams/consts.go (1)

1-73: LGTM on the remaining constants.

The timeout constants provide good defaults for SR-IOV operations, and the test configuration values (MTU, VLAN, rate limits) are reasonable for testing purposes.

tests/ocp/sriov/internal/sriovenv/sriovenv.go (8)

39-81: LGTM!

The operator and cluster stability functions properly delegate responsibilities and include appropriate nil checks before accessing nested status fields.


87-137: LGTM!

Clean functional options pattern that provides flexible network configuration while maintaining type safety.


205-279: LGTM!

The VF initialization logic correctly iterates through worker nodes and creates a policy on the first successful node. The interface discovery fallback and cleanup behavior are appropriate.


362-454: LGTM!

Pod creation and readiness verification functions properly handle creation, timeout, and container status checks.


460-503: LGTM!

Interface verification functions use standard ip link output parsing which is appropriate for test utilities.


509-585: LGTM!

Comprehensive VF status verification with proper cleanup handling and clear error messages.


587-638: LGTM!

Returning (false, nil) on carrier timeout allows calling tests to decide whether to skip rather than fail, which is appropriate for hardware-dependent verification.


728-850: LGTM!

GetPciAddress correctly parses the CNI network-status annotation, and CleanupLeftoverResources implements appropriate best-effort cleanup with logging.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
tests/ocp/sriov/internal/sriovenv/sriovenv.go (1)

303-305: Add service disruption warning to docstring.

The delete-and-recreate approach causes VFs to be deallocated and reallocated, resulting in brief service disruption for any pods using this policy. This should be documented to help callers decide when to use this function.

Apply this diff:

 // UpdateSriovPolicyMTU updates the MTU of an existing SR-IOV policy.
 // PolicyBuilder does not have Update method, so we delete and recreate.
+// Note: This causes brief service disruption as VFs are deallocated and reallocated.
 func UpdateSriovPolicyMTU(policyName string, mtuValue int) error {
tests/ocp/sriov/internal/ocpsriovconfig/config.go (1)

199-207: Handle io.EOF for empty YAML files.

yaml.Decoder.Decode returns io.EOF when decoding an empty file. Since SR-IOV fields have defaults or can be overridden via environment variables, an empty YAML file should not be a fatal error.

+import "io"
+
 func readFile(sriovOcpConfig *SriovOcpConfig, cfgFile string) error {
 	// ...
 	decoder := yaml.NewDecoder(openedCfgFile)
 
 	err = decoder.Decode(&sriovOcpConfig)
-	if err != nil {
+	if err != nil && err != io.EOF {
 		return err
 	}
 
 	return nil
 }
🧹 Nitpick comments (2)
tests/ocp/sriov/internal/sriovenv/sriovenv.go (1)

559-568: Consider explicit parameter instead of parsing description.

The spoof check verification is triggered by parsing the description string for "spoof checking". This is fragile and could lead to unexpected behavior if the description format changes. Consider adding an explicit parameter or options struct.

Example alternative approach:

type VFStatusOptions struct {
    VerifySpoofCheck bool
    SpoofCheckState  string // "on" or "off"
}

func CheckVFStatusWithPassTraffic(networkName, interfaceName, namespace, description string,
    timeout time.Duration, opts *VFStatusOptions) error {
    // ...
    if opts != nil && opts.VerifySpoofCheck {
        if err := verifySpoofCheck(clientPod, interfaceName, opts.SpoofCheckState); err != nil {
            return fmt.Errorf("spoof check verification failed: %w", err)
        }
    }
    // ...
}
tests/ocp/sriov/internal/ocpsriovconfig/config.go (1)

112-149: Document default value for minTxRate in env format.

The supportsMinTxRate defaults to true when omitted from the environment variable format. This should be documented in the function comment to avoid confusion.

 // parseSriovDevicesEnv parses device configuration from SRIOV_DEVICES environment variable.
 // Format: "name1:deviceid1:vendor1:interface1[:minTxRate],name2:deviceid2:vendor2:interface2[:minTxRate],..."
 // Example: "e810xxv:159b:8086:ens2f0,e810c:1593:8086:ens2f2".
+// Note: minTxRate defaults to true if omitted.
 func parseSriovDevicesEnv(envDevices string) ([]DeviceConfig, error) {
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f4504af64e28d3de907288e0a6cb1ebf233dd35e and 0ddb446523c4a08113b730e29d3ca5f642cda0a0.

📒 Files selected for processing (3)
  • tests/ocp/sriov/internal/ocpsriovconfig/.default.yaml (1 hunks)
  • tests/ocp/sriov/internal/ocpsriovconfig/config.go (5 hunks)
  • tests/ocp/sriov/internal/sriovenv/sriovenv.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (2)
tests/ocp/internal/ocpconfig/config.go (1)
  • OcpConfig (20-22)
tests/ocp/sriov/internal/ocpsriovinittools/ocpsriovinittools.go (1)
  • SriovOcpConfig (13-13)
🔇 Additional comments (4)
tests/ocp/sriov/internal/sriovenv/sriovenv.go (1)

39-42: LGTM - Well-structured SR-IOV environment helpers.

The overall implementation follows good patterns:

  • Functional options for network configuration
  • Proper cleanup with defer statements
  • Best-effort cleanup with logged warnings
  • Good use of eco-goinfra builders
tests/ocp/sriov/internal/ocpsriovconfig/.default.yaml (1)

1-50: LGTM - Comprehensive default device configurations.

The configuration covers major SR-IOV NIC vendors (Intel, Broadcom, Mellanox) with appropriate device IDs and default settings. The supports_min_tx_rate field correctly reflects hardware capabilities.

tests/ocp/sriov/internal/ocpsriovconfig/config.go (2)

218-239: LGTM - Legacy VFNum handling looks correct.

The logic correctly prioritizes ECO_OCP_SRIOV_VF_NUM over the legacy SRIOV_VF_NUM, and only applies the legacy value when the new env var is not explicitly set.


24-41: LGTM - Clean configuration structure.

The DeviceConfig type and updated SriovOcpConfig fields provide a clean API for SR-IOV device configuration with proper YAML and environment variable tags.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (1)

189-207: Fix YAML decode target: currently decoding into **SriovOcpConfig (likely no-op / wrong).

readFile receives sriovOcpConfig *SriovOcpConfig but decodes into &sriovOcpConfig (Line 201), i.e., **SriovOcpConfig. This typically won’t populate the struct you expect and can leave config fields at zero values.

Also, consider treating empty YAML (io.EOF) as non-fatal (per prior review note).

 import (
+	"errors"
 	"fmt"
+	"io"
 	"log"
 	"os"
 	"path/filepath"
 	"runtime"
 	"strconv"
 	"strings"
@@
 func readFile(sriovOcpConfig *SriovOcpConfig, cfgFile string) error {
@@
 	decoder := yaml.NewDecoder(openedCfgFile)
 
-	err = decoder.Decode(&sriovOcpConfig)
-	if err != nil {
+	err = decoder.Decode(sriovOcpConfig)
+	if err != nil && !errors.Is(err, io.EOF) {
 		return err
 	}
 
 	return nil
 }
♻️ Duplicate comments (2)
tests/ocp/sriov/internal/sriovenv/sriovenv.go (1)

303-356: Document disruption + wait for readiness after MTU “update” (delete/recreate).

This delete+recreate path can briefly deallocate/reallocate VFs (service disruption) and should be called out in the docstring (requested previously). Also, after recreating, it should likely WaitForSriovPolicyReady(...) (or equivalent) so callers don’t proceed while the cluster is still reconciling.

tests/ocp/sriov/internal/ocpsriovconfig/config.go (1)

218-239: Legacy SRIOV_VF_NUM currently overrides YAML vf_num (likely unintended).

parseLegacyVFNum only gates on ECO_OCP_SRIOV_VF_NUM (Line 221-224), so if YAML sets vf_num and legacy SRIOV_VF_NUM is present, legacy will overwrite YAML. If the intent is “legacy only when VFNum is still unset”, add a VFNum guard:

 func parseLegacyVFNum(sriovOcpConfig *SriovOcpConfig) {
 	// Skip if the new env var was explicitly set
 	if os.Getenv("ECO_OCP_SRIOV_VF_NUM") != "" {
 		return
 	}
+
+	// Skip if YAML already provided a value
+	if sriovOcpConfig.VFNum > 0 {
+		return
+	}
@@
 	sriovOcpConfig.VFNum = vfNum
 }
🧹 Nitpick comments (1)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (1)

151-159: Prefer a named default VF constant (avoid magic “2”).

GetVFNum hardcodes 2 (Line 155). Consider const DefaultVFNum = 2 shared across config/parsers, so future changes don’t miss call sites.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0ddb446523c4a08113b730e29d3ca5f642cda0a0 and 5ad8751098b250e4498913893f9fa67ac278790f.

📒 Files selected for processing (4)
  • .cursorrules (1 hunks)
  • tests/ocp/sriov/internal/ocpsriovconfig/.default.yaml (1 hunks)
  • tests/ocp/sriov/internal/ocpsriovconfig/config.go (5 hunks)
  • tests/ocp/sriov/internal/sriovenv/sriovenv.go (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • .cursorrules
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/ocp/sriov/internal/ocpsriovconfig/.default.yaml
🧰 Additional context used
🧬 Code graph analysis (2)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (2)
tests/cnf/ran/ptp/internal/iface/iface.go (1)
  • Name (22-22)
tests/ocp/internal/ocpconfig/config.go (1)
  • OcpConfig (20-22)
tests/ocp/sriov/internal/sriovenv/sriovenv.go (2)
tests/internal/sriovoperator/sriovenv.go (1)
  • IsSriovDeployed (32-54)
tests/ocp/sriov/internal/tsparams/consts.go (13)
  • MCPStableInterval (38-38)
  • NADTimeout (40-40)
  • PollingInterval (28-28)
  • NamespaceTimeout (30-30)
  • PolicyApplicationTimeout (42-42)
  • PodReadyTimeout (32-32)
  • TestPodClientIP (51-51)
  • TestPodClientMAC (47-47)
  • CleanupTimeout (34-34)
  • TestPodServerIP (53-53)
  • TestPodServerMAC (49-49)
  • CarrierWaitTimeout (36-36)
  • DebugPodCleanupTimeout (44-44)

Comment on lines 112 to 149
// parseSriovDevicesEnv parses device configuration from SRIOV_DEVICES environment variable.
// Format: "name1:deviceid1:vendor1:interface1[:minTxRate],name2:deviceid2:vendor2:interface2[:minTxRate],..."
// Example: "e810xxv:159b:8086:ens2f0,e810c:1593:8086:ens2f2".
func parseSriovDevicesEnv(envDevices string) ([]DeviceConfig, error) {
var devices []DeviceConfig

entries := strings.Split(envDevices, ",")
for _, entry := range entries {
entry = strings.TrimSpace(entry)
if entry == "" {
continue
}

parts := strings.Split(entry, ":")
if len(parts) != 4 && len(parts) != 5 {
return nil, fmt.Errorf(
"invalid SRIOV_DEVICES entry %q - expected format: name:deviceid:vendor:interface[:minTxRate]",
entry)
}

supportsMinTxRate := true // default

if len(parts) == 5 {
val := strings.ToLower(strings.TrimSpace(parts[4]))
supportsMinTxRate = val == "true" || val == "1" || val == "yes"
}

devices = append(devices, DeviceConfig{
Name: strings.TrimSpace(parts[0]),
DeviceID: strings.TrimSpace(parts[1]),
Vendor: strings.TrimSpace(parts[2]),
InterfaceName: strings.TrimSpace(parts[3]),
SupportsMinTxRate: supportsMinTxRate,
})
}

return devices, nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find . -name "config.go" -type f -path "*/ocpsriovconfig/*"

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 124


🏁 Script executed:

find . -name "*.go" -path "*sriov*" | grep -i config

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 187


🏁 Script executed:

rg "SupportsMinTxRate" -B 5 -A 5

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 1858


🏁 Script executed:

rg "parseDevicesFromYAML|parseDevices|YAML" tests/

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 3991


🏁 Script executed:

git ls-files tests/ocp/sriov/internal/ocpsriovconfig/

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 176


🏁 Script executed:

rg "DeviceConfig" -B 3 -A 10 --type go | head -100

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 8335


🏁 Script executed:

rg "yaml|YAML" tests/ocp/sriov/ -A 3 -B 3 --type go | head -150

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 6156


🏁 Script executed:

# Search for Unmarshal or YAML parsing
rg "Unmarshal|yaml" tests/ocp/sriov/ --type go -B 2 -A 5

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 4197


🏁 Script executed:

# Search for DeviceConfig struct in the entire repository
rg "type DeviceConfig" --type go -A 15

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 1709


🏁 Script executed:

# Also search for files that might parse SRIOV config
rg "SRIOV_DEVICES" --type go -B 2 -A 2

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 2351


🏁 Script executed:

# Search for any Go files that contain both DeviceConfig and SupportsMinTxRate
rg "DeviceConfig|SupportsMinTxRate" --type go -l | head -10

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 533


🏁 Script executed:

# Search across all files for SupportsMinTxRate
rg "SupportsMinTxRate" --type go

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 269


🏁 Script executed:

# Also check if there's any reference to parsing devices from YAML
find . -name "*.go" -type f -exec grep -l "Devices.*yaml\|YAML.*[Dd]evices" {} \;

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 124


🏁 Script executed:

# Check if there are any YAML-related parsing patterns
git ls-files "*.go" | xargs grep -l "Devices.*yaml\|json:" 2>/dev/null | head -20

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 1171


🏁 Script executed:

# Count files and show what's in tests
git ls-files tests/ | head -30

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 1578


🏁 Script executed:

# Search for the exact comment text
rg "parseSriovDevicesEnv parses device" --type go

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 215


🏁 Script executed:

# Try to find any file with the function parseSriovDevicesEnv
git log --all --name-only --pretty="" | grep -i sriov | sort -u | head -20

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 1145


🏁 Script executed:

# Search for minTxRate in the codebase
rg "minTxRate|MinTxRate" --type go -B 3 -A 3

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 4474


🏁 Script executed:

# Check if there are any recent commits with sriov
git log --oneline --all | grep -i sriov | head -10

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 120


🏁 Script executed:

# Search for struct with Name, DeviceID, Vendor fields
rg "Name.*string|DeviceID|Vendor" --type go | grep -i "type\|struct" | head -20

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 2753


🏁 Script executed:

cat -n tests/ocp/sriov/internal/ocpsriovconfig/config.go | sed -n '1,160p'

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 5860


🏁 Script executed:

# Also check if there's any post-processing of YAML that might set a default
cat -n tests/ocp/sriov/internal/ocpsriovconfig/config.go | sed -n '160,220p'

Repository: rh-ecosystem-edge/eco-gotests

Length of output: 2135


SupportsMinTxRate defaults differ between env and YAML parsing.

parseSriovDevicesEnv defaults SupportsMinTxRate to true (line 132), but YAML omission yields Go's zero value false. Align the defaults: either change env default to false, or use *bool to distinguish "unset" from "false" and apply a single consistent default in one location.

🤖 Prompt for AI Agents
In tests/ocp/sriov/internal/ocpsriovconfig/config.go around lines 112–149,
parseSriovDevicesEnv currently defaults SupportsMinTxRate to true causing
inconsistent behavior with YAML parsing (which yields false when omitted);
change the env parser to default supportsMinTxRate to false (or alternatively
parse into a *bool and keep a single canonical default applied later) and ensure
the parsed boolean logic only sets SupportsMinTxRate when an explicit 5th field
is present so both env and YAML produce the same default behavior.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (9)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (2)

199-207: Handle io.EOF for empty YAML files.

yaml.Decoder.Decode returns io.EOF for empty files. Since SR-IOV-specific fields have defaults or can be set via environment variables, an empty YAML file should not be a fatal error.

+import "io"
+
 	err = decoder.Decode(&sriovOcpConfig)
-	if err != nil {
+	if err != nil && err != io.EOF {
 		return err
 	}

132-137: SupportsMinTxRate defaults differ between env and YAML parsing.

When parsing from environment variable, SupportsMinTxRate defaults to true (line 132), but when parsing from YAML with the field omitted, Go's zero value yields false. This inconsistency may cause unexpected behavior depending on how devices are configured.

Consider aligning the defaults by either:

  1. Defaulting to false in env parsing to match YAML behavior
  2. Using *bool to distinguish "unset" from "false" and apply a consistent default
tests/ocp/sriov/internal/sriovenv/sriovenv.go (7)

53-81: Silently swallowing poll errors can mask persistent failures.

List errors return (false, nil) which causes the poll to continue retrying. If the failure is persistent (RBAC, wrong namespace, API down), callers wait until timeout with no diagnostic signal.

At minimum, log errors to aid debugging:

 		func(ctx context.Context) (bool, error) {
 			// Check SR-IOV node states synced
 			nodeStates, err := sriov.ListNetworkNodeState(APIClient, sriovOpNs, client.ListOptions{})
-			if err != nil || len(nodeStates) == 0 {
+			if err != nil {
+				klog.V(90).Infof("ListNetworkNodeState failed: %v", err)
+				return false, nil
+			}
+			if len(nodeStates) == 0 {
 				return false, nil
 			}
 
 			// ...
 
 			// Check MCP stability
 			mcpList, err := mco.ListMCP(APIClient)
 			if err != nil {
+				klog.V(90).Infof("ListMCP failed: %v", err)
 				return false, nil
 			}

167-203: Use apierrors.IsNotFound() to distinguish resource-not-found from other errors.

RemoveSriovNetwork and RemoveSriovPolicy return nil on any pull error, which masks real problems like RBAC failures or transport errors.

Differentiate NotFound errors using k8s.io/apimachinery/pkg/api/errors.IsNotFound():

+import apierrors "k8s.io/apimachinery/pkg/api/errors"
+
 func RemoveSriovNetwork(name string, timeout time.Duration) error {
 	sriovOpNs := SriovOcpConfig.OcpSriovOperatorNamespace
 
 	network, err := sriov.PullNetwork(APIClient, name, sriovOpNs)
 	if err != nil {
-		return nil // Already deleted
+		if apierrors.IsNotFound(err) {
+			return nil // Already deleted
+		}
+		return fmt.Errorf("failed to pull network %q: %w", name, err)
 	}
 
 	return network.DeleteAndWait(timeout)
 }
 
 func RemoveSriovPolicy(name string, timeout time.Duration) error {
 	sriovOpNs := SriovOcpConfig.OcpSriovOperatorNamespace
 
 	policy, err := sriov.PullPolicy(APIClient, name, sriovOpNs)
 	if err != nil {
-		return nil // Already deleted
+		if apierrors.IsNotFound(err) {
+			return nil // Already deleted
+		}
+		return fmt.Errorf("failed to pull policy %q: %w", name, err)
 	}

224-279: Validate vfNum and interface before building pfSelector.

pfSelector := fmt.Sprintf("%s#0-%d", actualInterface, vfNum-1) (line 245) produces invalid output for vfNum <= 0 (e.g., "ens1f0#0--1"), and actualInterface can remain empty if discovery fails.

 func initVFWithDevType(name, deviceID, interfaceName, vendor, devType string, vfNum int,
 	workerNodes []*nodes.Builder) (bool, error) {
+	if vfNum <= 0 {
+		return false, fmt.Errorf("vfNum must be > 0, got %d", vfNum)
+	}
+
 	sriovOpNs := SriovOcpConfig.OcpSriovOperatorNamespace
 
 	// Cleanup existing policy
 	_ = RemoveSriovPolicy(name, tsparams.NamespaceTimeout)
 
 	for _, node := range workerNodes {
 		nodeName := node.Definition.Name
 
 		// Try to discover the actual interface name
 		actualInterface := interfaceName
 
 		// Only discover interface if not explicitly provided
 		if interfaceName == "" && vendor != "" && deviceID != "" {
 			if discovered, err := discoverInterfaceName(nodeName, vendor, deviceID); err == nil {
 				actualInterface = discovered
 			}
 		}
+
+		if actualInterface == "" {
+			klog.V(90).Infof("No interface resolved on node %q (vendor=%q deviceID=%q)", nodeName, vendor, deviceID)
+			continue
+		}
 
 		pfSelector := fmt.Sprintf("%s#0-%d", actualInterface, vfNum-1)

303-306: Document potential service disruption.

The delete-and-recreate approach causes brief service disruption as VFs are deallocated and reallocated. Add a note to help users understand when to use this function.

 // UpdateSriovPolicyMTU updates the MTU of an existing SR-IOV policy.
 // PolicyBuilder does not have Update method, so we delete and recreate.
+// Note: This causes brief service disruption as VFs are deallocated and reallocated.
 func UpdateSriovPolicyMTU(policyName string, mtuValue int) error {

404-414: Use apierrors.IsNotFound() for pod deletion.

Same issue as network/policy removal - returning nil on any pull error masks real problems.

 func DeleteDpdkTestPod(name, namespace string, timeout time.Duration) error {
 	podBuilder, err := pod.Pull(APIClient, name, namespace)
 	if err != nil {
-		return nil // Already deleted
+		if apierrors.IsNotFound(err) {
+			return nil // Already deleted
+		}
+		return fmt.Errorf("failed to pull pod %q: %w", name, err)
 	}
 
 	_, err = podBuilder.DeleteAndWait(timeout)
 
 	return err
 }

509-537: Hard-coded pod names can collide across retries or parallel runs.

Pods named "client" and "server" may conflict if tests share a namespace or rerun quickly. Consider suffixing with a timestamp/random string or allowing caller-provided names.

+import "github.com/google/uuid"
+
 func CheckVFStatusWithPassTraffic(networkName, interfaceName, namespace, description string,
 	timeout time.Duration) error {
+	suffix := uuid.New().String()[:8]
+	clientName := fmt.Sprintf("client-%s", suffix)
+	serverName := fmt.Sprintf("server-%s", suffix)
+
 	// Create client and server pods
-	clientPod, err := CreateTestPod("client", namespace, networkName,
+	clientPod, err := CreateTestPod(clientName, namespace, networkName,
 		tsparams.TestPodClientIP, tsparams.TestPodClientMAC)

771-842: Cleanup logic may be destructive in shared clusters.

Deleting any namespace prefixed with "e2e-" and policies matching device names could affect non-test resources in shared clusters.

Consider scoping cleanup by a label/annotation that your tests explicitly set, and use label selectors instead of name heuristics:

+const TestResourceLabel = "eco-gotests.sriov/test-resource"
+
 // In test setup, add label to created resources:
 // namespace.WithLabel(TestResourceLabel, "true")
 // policy.WithLabel(TestResourceLabel, "true")
 
 // In cleanup:
 namespaces, err := namespace.List(APIClient, metav1.ListOptions{
+	LabelSelector: TestResourceLabel + "=true",
 })
🧹 Nitpick comments (2)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (1)

151-159: Consider extracting the fallback value as a constant.

The fallback value 2 is hardcoded. Consider using a named constant for clarity and maintainability.

+const (
+	// DefaultVFNum is the default number of virtual functions.
+	DefaultVFNum = 2
+)
+
 // GetVFNum returns the configured number of virtual functions.
-// Returns the value from YAML/env config, or 2 as fallback if not set.
+// Returns the value from YAML/env config, or DefaultVFNum as fallback if not set.
 func (sriovOcpConfig *SriovOcpConfig) GetVFNum() int {
 	if sriovOcpConfig.VFNum <= 0 {
-		return 2 // fallback if YAML wasn't loaded properly
+		return DefaultVFNum // fallback if YAML wasn't loaded properly
 	}
 
 	return sriovOcpConfig.VFNum
 }
tests/ocp/sriov/internal/sriovenv/sriovenv.go (1)

21-21: Consider avoiding dot import for better readability.

Dot imports (. "github.com/...") can make it harder to understand where symbols like APIClient and SriovOcpConfig come from. Consider using a named import instead.

-	. "github.com/rh-ecosystem-edge/eco-gotests/tests/ocp/sriov/internal/ocpsriovinittools"
+	"github.com/rh-ecosystem-edge/eco-gotests/tests/ocp/sriov/internal/ocpsriovinittools"

Then reference as ocpsriovinittools.APIClient, etc.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5ad8751098b250e4498913893f9fa67ac278790f and 88e6a88.

📒 Files selected for processing (5)
  • .cursorrules (1 hunks)
  • .gitignore (1 hunks)
  • tests/ocp/sriov/internal/ocpsriovconfig/.default.yaml (1 hunks)
  • tests/ocp/sriov/internal/ocpsriovconfig/config.go (5 hunks)
  • tests/ocp/sriov/internal/sriovenv/sriovenv.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/ocp/sriov/internal/ocpsriovconfig/.default.yaml
  • .cursorrules
🧰 Additional context used
🧬 Code graph analysis (2)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (3)
tests/cnf/ran/ptp/internal/iface/iface.go (1)
  • Name (22-22)
tests/ocp/internal/ocpconfig/config.go (1)
  • OcpConfig (20-22)
tests/ocp/sriov/internal/ocpsriovinittools/ocpsriovinittools.go (1)
  • SriovOcpConfig (13-13)
tests/ocp/sriov/internal/sriovenv/sriovenv.go (2)
tests/internal/sriovoperator/sriovenv.go (1)
  • IsSriovDeployed (32-54)
tests/ocp/sriov/internal/tsparams/consts.go (13)
  • MCPStableInterval (38-38)
  • NADTimeout (40-40)
  • PollingInterval (28-28)
  • NamespaceTimeout (30-30)
  • PolicyApplicationTimeout (42-42)
  • PodReadyTimeout (32-32)
  • TestPodClientIP (51-51)
  • TestPodClientMAC (47-47)
  • CleanupTimeout (34-34)
  • TestPodServerIP (53-53)
  • TestPodServerMAC (49-49)
  • CarrierWaitTimeout (36-36)
  • DebugPodCleanupTimeout (44-44)
🔇 Additional comments (8)
.gitignore (1)

24-24: LGTM!

Ignoring .cursorrules is appropriate to prevent IDE-specific configuration from being committed to the repository.

tests/ocp/sriov/internal/ocpsriovconfig/config.go (4)

24-41: LGTM!

The DeviceConfig struct and SriovOcpConfig extensions are well-structured with appropriate YAML and envconfig tags.


57-62: LGTM!

Good defensive check for runtime.Caller failure. This prevents potential nil dereference issues.


87-110: LGTM!

The fallback logic from environment variable to YAML configuration is well-implemented with appropriate error handling.


218-239: LGTM!

The legacy environment variable handling is correctly implemented with proper validation and warning logging for invalid values.

tests/ocp/sriov/internal/sriovenv/sriovenv.go (3)

87-155: LGTM!

The functional options pattern for NetworkOption is well-implemented and provides a clean, extensible API for network configuration.


683-722: LGTM!

Good handling of Kubernetes name length limits and proper cleanup of existing debug pods before creation. The nsenter approach for host namespace access is appropriate.


728-765: LGTM!

The PCI address extraction from network status annotation is correctly implemented with proper handling of namespaced network names.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (8)
tests/ocp/sriov/internal/sriovenv/sriovenv.go (8)

273-276: Document potential service disruption in docstring.

The delete-and-recreate approach causes VFs to be deallocated and reallocated, briefly disrupting traffic on affected pods. Add this to the function's documentation to help users understand when to use this function.

🔎 Proposed fix
 // UpdateSriovPolicyMTU updates the MTU of an existing SR-IOV policy.
 // PolicyBuilder does not have Update method, so we delete and recreate.
+// Note: This causes brief service disruption as VFs are deallocated and reallocated.
+// Schedule this operation during maintenance windows or when temporary disruption is acceptable.
 func UpdateSriovPolicyMTU(policyName string, mtuValue int) error {

485-498: Static pod names "client" and "server" can collide across parallel runs or retries.

Consider generating unique names (e.g., with timestamp or random suffix) or accepting caller-provided names to avoid AlreadyExists errors.


559-563: Static pod name "linkstate-test" has the same collision risk.

Consider using a unique suffix as recommended for client/server pods.


713-750: Consider scoping cleanup by labels for safer operation in shared clusters.

While using the "e2e-" prefix is reasonable, and delegating to sriovoperator.RemoveAllSriovNetworks/Policies is appropriate, in shared clusters this could affect resources from other concurrent test runs. Consider adding a unique test-run label to resources and filtering by that label during cleanup for better isolation.


137-147: Use apierrors.IsNotFound() to distinguish resource-not-found from other errors.

Returning nil on any Pull error masks real problems like RBAC failures or transport errors. Only return nil for NotFound errors; propagate others.

🔎 Proposed fix
+import (
+	apierrors "k8s.io/apimachinery/pkg/api/errors"
+)

 func RemoveSriovNetwork(name string, timeout time.Duration) error {
 	sriovOpNs := SriovOcpConfig.OcpSriovOperatorNamespace

 	network, err := sriov.PullNetwork(APIClient, name, sriovOpNs)
 	if err != nil {
-		return nil // Already deleted
+		if apierrors.IsNotFound(err) {
+			return nil // Already deleted
+		}
+		return fmt.Errorf("failed to pull network %q: %w", name, err)
 	}

 	return network.DeleteAndWait(timeout)
 }

153-173: Same error handling issue: use apierrors.IsNotFound().

Apply the same fix as suggested for RemoveSriovNetwork to properly distinguish NotFound from other errors.


194-249: Validate vfNum and actualInterface before building pfSelector.

Line 215 creates an invalid range if vfNum <= 0 (e.g., "interface#0--1") or invalid selector if actualInterface is empty (e.g., "#0-2"). Add upfront validation to fail fast with clear error messages.

🔎 Proposed fix
 func initVFWithDevType(name, deviceID, interfaceName, vendor, devType string, vfNum int,
 	workerNodes []*nodes.Builder) (bool, error) {
+	if vfNum <= 0 {
+		return false, fmt.Errorf("vfNum must be > 0, got %d", vfNum)
+	}
+
 	sriovOpNs := SriovOcpConfig.OcpSriovOperatorNamespace

 	// Cleanup existing policy
 	_ = RemoveSriovPolicy(name, tsparams.NamespaceTimeout)

 	for _, node := range workerNodes {
 		nodeName := node.Definition.Name

 		// Try to discover the actual interface name
 		actualInterface := interfaceName

 		// Only discover interface if not explicitly provided
 		if interfaceName == "" && vendor != "" && deviceID != "" {
 			if discovered, err := discoverInterfaceName(nodeName, vendor, deviceID); err == nil {
 				actualInterface = discovered
 			}
 		}

+		if actualInterface == "" {
+			klog.V(90).Infof("No interface resolved on node %q (vendor=%q deviceID=%q), skipping", nodeName, vendor, deviceID)
+			continue
+		}
+
 		pfSelector := fmt.Sprintf("%s#0-%d", actualInterface, vfNum-1)

374-384: Same error handling issue: use apierrors.IsNotFound().

Apply the same fix to distinguish NotFound from other errors.

🔎 Proposed fix
 func DeleteDpdkTestPod(name, namespace string, timeout time.Duration) error {
 	podBuilder, err := pod.Pull(APIClient, name, namespace)
 	if err != nil {
-		return nil // Already deleted
+		if apierrors.IsNotFound(err) {
+			return nil // Already deleted
+		}
+		return fmt.Errorf("failed to pull pod %q: %w", name, err)
 	}

 	_, err = podBuilder.DeleteAndWait(timeout)

 	return err
 }
🧹 Nitpick comments (4)
tests/ocp/sriov/internal/ocpsriovconfig/default.yaml (1)

50-50: Add trailing newline.

YAML files should end with a newline character for better compatibility with text processing tools and version control systems.

🔎 Suggested fix
   - name: cx7
     device_id: "1021"
     vendor: "15b3"
     interface_name: "ens8f0np0"
     supports_min_tx_rate: true
+
tests/ocp/sriov/internal/sriovenv/sriovenv.go (3)

19-19: Consider avoiding dot imports for clarity.

Dot imports can make code harder to read as it's not immediately clear where APIClient and SriovOcpConfig come from. However, if this is an established pattern in the codebase for test utilities, it's acceptable.


352-372: Consider accepting IP/MAC as parameters for flexibility.

Unlike CreateTestPod, this function hardcodes IP and MAC from tsparams. This limits reuse and could cause conflicts if multiple DPDK pods need to coexist. Consider accepting these as parameters like CreateTestPod does.

🔎 Proposed signature
-func CreateDpdkTestPod(name, namespace, networkName string) (*pod.Builder, error) {
-	secNetwork := pod.StaticIPAnnotationWithMacAddress(networkName,
-		[]string{tsparams.TestPodClientIP}, tsparams.TestPodClientMAC)
+func CreateDpdkTestPod(name, namespace, networkName, ip, mac string) (*pod.Builder, error) {
+	secNetwork := pod.StaticIPAnnotationWithMacAddress(networkName, []string{ip}, mac)

127-135: Consider logging Pull errors for debugging persistent failures.

While returning (false, nil) is correct for polling, logging the error at a verbose level would help diagnose why NAD creation is taking longer than expected.

🔎 Proposed fix
 	return wait.PollUntilContextTimeout(context.Background(), tsparams.PollingInterval, timeout, true,
 		func(ctx context.Context) (bool, error) {
 			_, err := nad.Pull(APIClient, name, namespace)
-
-			return err == nil, nil
+			if err != nil {
+				klog.V(90).Infof("Waiting for NAD %s/%s: %v", namespace, name, err)
+				return false, nil
+			}
+			return true, nil
 		})
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 88e6a88 and f0dc4d9.

📒 Files selected for processing (3)
  • tests/ocp/sriov/internal/ocpsriovconfig/config.go (3 hunks)
  • tests/ocp/sriov/internal/ocpsriovconfig/default.yaml (1 hunks)
  • tests/ocp/sriov/internal/sriovenv/sriovenv.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/ocp/sriov/internal/ocpsriovconfig/config.go
🔇 Additional comments (4)
tests/ocp/sriov/internal/ocpsriovconfig/default.yaml (3)

6-8: LGTM!

The default VF count of 2 is reasonable for basic SR-IOV testing scenarios. The configuration aligns with the VFNum field mentioned in the related config.go changes.


36-45: Correct the device_id for cx6 from "101d" to "1018".

Device ID 101d maps to the MT2892 Family [ConnectX-6 Dx], not ConnectX-6. ConnectX-6 has a valid PCI Device ID of 1018. The cx6 entry (lines 36-40) has an incorrect device_id that should be changed to "1018" to match the official hardware specification.

Likely an incorrect or invalid review comment.


11-50: Review device ID configurations for potential errors.

The device IDs for Intel adapters have been verified against official specifications and are correct:

  • Intel e810xxv (159b) and e810c (1593) are correctly specified
  • Intel x710 (1572) is correct

However, there is a critical issue with the Mellanox devices: Both cx6 and cx6dx are assigned device_id "101d", which corresponds to MT2892 Family [ConnectX-6 Dx]. This is incorrect for cx6 since the same device ID cannot apply to two different adapter types. The cx6 entry likely has the wrong device_id and should be verified.

Verify that cx7 (1021) corresponds to MT2910 Family [ConnectX-7] — this is correct.

The Broadcom device IDs (bcm57414: 16d7, bcm57508: 1750) could not be verified against official public sources.

tests/ocp/sriov/internal/sriovenv/sriovenv.go (1)

670-707: LGTM!

The implementation correctly uses networkStatus.Interface (e.g., "net1", "net2") which is unique per pod, addressing the concern about pods with multiple SR-IOV interfaces using the same network.

- Add DeviceConfig struct for SR-IOV device configuration
- Add GetSriovDevices() method with ECO_OCP_SRIOV_DEVICES env var support
- Add GetVFNum() method with error handling
- Move all defaults to default.yaml
- Fix suite-level labels to exclude feature-specific labels
- Add constants for timeouts, labels, and test parameters
@zhiqiangf zhiqiangf force-pushed the pr2-sriov-sriovenv branch 2 times, most recently from fca5cf1 to 11f99bf Compare December 23, 2025 03:08
- Add SR-IOV environment helper functions using eco-goinfra
- Use existing sriovoperator functions for stability checks and cleanup
- Add network creation with NetworkOption pattern
- Add policy creation and VF initialization helpers
- Add pod verification and traffic testing utilities
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (1)

21-22: SupportsMinTxRate default inconsistency remains unresolved.

The past review comment about SupportsMinTxRate defaulting differently between env and YAML parsing is still valid. Line 22 defines defaultSupportsMinTxRate = true, and line 132 applies this default when parsing from the environment variable. However, YAML parsing yields false (Go's zero value) when the field is omitted.

This inconsistency means the same device configuration behaves differently depending on whether it's defined in YAML or via the ECO_OCP_SRIOV_DEVICES environment variable. While lines 113-114 document this difference, the documentation doesn't resolve the underlying problem.

Consider one of these solutions:

  1. Change the env default to false to match YAML behavior
  2. Use *bool in DeviceConfig to distinguish "unset" from explicit false, then apply a single consistent default in GetSriovDevices()

Also applies to: 110-149

🧹 Nitpick comments (1)
tests/ocp/sriov/internal/sriovenv/sriovenv.go (1)

397-435: Consider logging errors in poll callback to aid debugging.

WaitForPodWithLabelReady returns (false, nil) when pod.List fails (line 402), which silently swallows errors. If the failure is persistent (e.g., RBAC issues, wrong namespace, API unavailability), callers wait until timeout with no diagnostic signal.

Consider logging the error to help with debugging:

 		func(ctx context.Context) (bool, error) {
 			podList, err := pod.List(APIClient, namespace, metav1.ListOptions{LabelSelector: labelSelector})
-			if err != nil || len(podList) == 0 {
+			if err != nil {
+				klog.V(90).Infof("pod.List failed for label %q: %v", labelSelector, err)
 				return false, nil
 			}
+			if len(podList) == 0 {
+				return false, nil
+			}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 11f99bf and 16f530f.

📒 Files selected for processing (4)
  • tests/ocp/sriov/internal/ocpsriovconfig/config.go
  • tests/ocp/sriov/internal/ocpsriovconfig/default.yaml
  • tests/ocp/sriov/internal/sriovenv/sriovenv.go
  • tests/ocp/sriov/internal/tsparams/consts.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/ocp/sriov/internal/ocpsriovconfig/default.yaml
🧰 Additional context used
🧬 Code graph analysis (2)
tests/ocp/sriov/internal/ocpsriovconfig/config.go (1)
tests/ocp/internal/ocpconfig/config.go (1)
  • OcpConfig (20-22)
tests/ocp/sriov/internal/sriovenv/sriovenv.go (2)
tests/internal/sriovoperator/sriovenv.go (4)
  • IsSriovDeployed (32-54)
  • WaitForSriovAndMCPStable (80-100)
  • RemoveAllSriovNetworks (156-176)
  • RemoveAllPoliciesAndWaitForSriovAndMCPStable (180-195)
tests/internal/cluster/cluster.go (1)
  • ExecCmdWithStdout (98-180)
🔇 Additional comments (2)
tests/ocp/sriov/internal/tsparams/consts.go (1)

1-78: LGTM! Well-organized test constants.

The new constants are clearly documented and provide comprehensive test parameters for SR-IOV testing scenarios. The values are reasonable (timeouts, network configurations, VLAN/QoS settings, rate limits, and resource labeling).

tests/ocp/sriov/internal/sriovenv/sriovenv.go (1)

1-787: Comprehensive SR-IOV test environment helpers.

This file provides a well-structured and extensive set of SR-IOV test utilities that follow eco-goinfra patterns and return errors appropriately (as per PR objectives). The code properly handles:

  • Network and policy lifecycle with builder options
  • VF initialization with device discovery
  • Pod creation with unique naming to avoid collisions
  • Robust cleanup using label selectors
  • Proper NotFound error handling throughout

The implementation addresses all critical issues from past reviews and demonstrates good engineering practices.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants