METAL-1715: Enforce 1-year validity and 30-day auto-rotation for Ironic TLS certs #570

Open
mabulgu wants to merge 1 commit into openshift:main from mabulgu:story/METAL-1715

Conversation

@mabulgu
Contributor

@mabulgu mabulgu commented Mar 4, 2026

Summary

  • Change TLS certificate validity from 2 years to 1 year and rotation trigger from 180 days to 30 days before expiration
  • Add unit tests verifying the exact validity period and rotation boundary conditions

Background

Per METAL-1687, the Ironic TLS certificate must be valid for exactly 1 year and automatically rotated 30 days before expiration. The METAL-1713 spike confirmed the existing custom rotation mechanism (Option C) is already sound — only the timing constants needed adjustment.

| Parameter | Before | After |
|---|---|---|
| tlsExpiration | 2 years (730 days) | 1 year (365 days) |
| tlsRefresh | 180 days | 30 days |
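
For reference, the arithmetic behind the new values can be sketched as follows. This is only an illustration: the constant names match the ones in this PR, but their declarations here and the helper rotationAgeDays are assumptions, not the code in provisioning/baremetal_crypto.go.

```go
package main

import (
	"fmt"
	"time"
)

// Values taken from the PR description. The exact declarations in
// baremetal_crypto.go are assumed, not copied.
const (
	tlsExpiration = 365 * 24 * time.Hour // certificate lifetime: 1 year
	tlsRefresh    = 30 * 24 * time.Hour  // rotate this long before expiry
)

// rotationAgeDays is a hypothetical helper: the certificate age, in whole
// days, at which rotation becomes due.
func rotationAgeDays() int {
	return int((tlsExpiration - tlsRefresh) / (24 * time.Hour))
}

func main() {
	// Arbitrary issuance date chosen for the example.
	issued := time.Date(2026, time.March, 4, 0, 0, 0, 0, time.UTC)
	notAfter := issued.Add(tlsExpiration)
	rotateFrom := notAfter.Add(-tlsRefresh)

	fmt.Println("expires:    ", notAfter.Format("2006-01-02"))
	fmt.Println("rotate from:", rotateFrom.Format("2006-01-02"))
	fmt.Println("age at rotation (days):", rotationAgeDays())
}
```

With a 365-day lifetime and a 30-day window, rotation becomes due once the certificate is 335 days old.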

Dependencies

This PR depends on #569 (METAL-1714: Populate complete SANs in Ironic TLS certificate). Please merge that PR first, then this branch will need a rebase.

Test plan

  • TestCertificateValidityPeriod verifies generated certificates are valid for ~365 days
  • TestCertificateRotationBoundary covers 5 scenarios: rotation triggers at 29 and 30 days before expiration, no rotation at 31, 180, or 365 days
  • All existing tests pass with the new values
  • go test ./... passes
  • Tested on real cluster with accelerated expiration (short-lived cert)
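
A rough sketch of the kind of check the validity test performs, and of how a short-lived certificate can be produced for accelerated-expiration testing. The helpers selfSignedWithLifetime and validityDays are hypothetical names used for illustration; the actual tests in this PR may be structured differently.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// selfSignedWithLifetime is a hypothetical helper, not the PR's code: it
// returns a parsed self-signed certificate valid from now for `lifetime`.
func selfSignedWithLifetime(lifetime time.Duration) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	now := time.Now()
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "example.test"},
		NotBefore:    now,
		NotAfter:     now.Add(lifetime),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// validityDays issues a certificate for lifetimeDays and reports the
// NotBefore..NotAfter span it actually carries, in whole days.
func validityDays(lifetimeDays int) (int, error) {
	cert, err := selfSignedWithLifetime(time.Duration(lifetimeDays) * 24 * time.Hour)
	if err != nil {
		return 0, err
	}
	return int(cert.NotAfter.Sub(cert.NotBefore).Hours() / 24), nil
}

func main() {
	for _, d := range []int{365, 1} { // 1 day stands in for an accelerated expiry
		got, err := validityDays(d)
		fmt.Println("requested:", d, "got:", got, "err:", err)
	}
}
```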

Addresses: METAL-1715

Assisted-by: Claude 4.6 Opus High

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 4, 2026
@openshift-ci-robot
Contributor

openshift-ci-robot commented Mar 4, 2026

@mabulgu: This pull request references METAL-1715 which is a valid jira issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from elfosardo and honza March 4, 2026 13:51
@mabulgu
Contributor Author

mabulgu commented Mar 9, 2026

/retest

@coderabbitai

coderabbitai bot commented Mar 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 577763f8-02a0-4dbe-9078-4af74db6244e

📥 Commits

Reviewing files that changed from the base of the PR and between b71de7a and 1cd18e7.

📒 Files selected for processing (2)
  • provisioning/baremetal_crypto.go
  • provisioning/baremetal_crypto_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • provisioning/baremetal_crypto_test.go
  • provisioning/baremetal_crypto.go

Walkthrough

Reduced TLS certificate lifetime from 2 years to 1 year and refresh window from 180 days to 30 days; introduced time-injectable expiration check isTlsCertificateExpiredAt(...) and adjusted refresh-threshold comparison to treat certs expiring at-or-before threshold as expired. Added tests validating certificate lifetime and rotation boundaries.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Crypto implementation: provisioning/baremetal_crypto.go | Updated constants: tlsExpiration from 2 years to 1 year, tlsRefresh from 180 days to 30 days. Added isTlsCertificateExpiredAt(certificate []byte, now time.Time) (bool, error) and made isTlsCertificateExpired delegate to it. Changed refresh comparison to consider certs expiring at-or-before the threshold as expired. |
| Unit tests: provisioning/baremetal_crypto_test.go | Added generateTestCertWithLifetime(time.Duration) and two tests: TestCertificateValidityPeriod (verifies generated cert lifetime ≈ tlsExpiration) and TestCertificateRotationBoundary (table-driven checks of isTlsCertificateExpiredAt at boundary times). |
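
A minimal sketch of what a time-injectable check with this signature could look like. It assumes the certificate bytes are PEM-encoded and that "expired" means NotAfter is at or before now + tlsRefresh; the names expiredAt, expired, and dueWithDaysLeft are hypothetical and are not the functions in provisioning/baremetal_crypto.go.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

const tlsRefresh = 30 * 24 * time.Hour // value from the PR description

// expiredAt mirrors the shape of isTlsCertificateExpiredAt; the body is an
// assumption. The reference time is passed in rather than read from the clock.
func expiredAt(certificate []byte, now time.Time) (bool, error) {
	block, _ := pem.Decode(certificate)
	if block == nil {
		return false, errors.New("no PEM block found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return false, err
	}
	// At-or-before: a certificate expiring exactly at the threshold is due.
	return !cert.NotAfter.After(now.Add(tlsRefresh)), nil
}

// expired is the production-style wrapper that supplies the wall clock.
func expired(certificate []byte) (bool, error) {
	return expiredAt(certificate, time.Now())
}

// dueWithDaysLeft builds a throwaway certificate and evaluates expiredAt at a
// fixed instant `days` days before its NotAfter.
func dueWithDaysLeft(days int) (bool, error) {
	notAfter := time.Date(2027, time.March, 4, 0, 0, 0, 0, time.UTC)
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return false, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "example.test"},
		NotBefore:    notAfter.Add(-365 * 24 * time.Hour),
		NotAfter:     notAfter,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return false, err
	}
	pemBytes := pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	return expiredAt(pemBytes, notAfter.Add(-time.Duration(days)*24*time.Hour))
}

func main() {
	for _, d := range []int{29, 30, 31} {
		due, err := dueWithDaysLeft(d)
		fmt.Println("days left:", d, "rotation due:", due, "err:", err)
	}
}
```

Because the reference time is a parameter, the boundary cases can be checked against a fixed timestamp instead of the wall clock.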

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 26.92% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and specifically addresses the main changes: enforcing 1-year validity and 30-day auto-rotation for Ironic TLS certificates, which aligns with the primary objectives and code modifications. |
| Stable And Deterministic Test Names | ✅ Passed | The codebase uses standard Go testing conventions with function names like TestGenerateRandomPassword and TestCertificateValidityPeriod, not Ginkgo test patterns. The custom check for stable and deterministic Ginkgo test names is not applicable. |
| Test Structure And Quality | ✅ Passed | Tests follow good practices with single responsibilities, meaningful assertions, and proper use of fake clients, though one assertion lacks a message. |

@openshift-ci-robot
Contributor

openshift-ci-robot commented Mar 12, 2026

@mabulgu: This pull request references METAL-1715 which is a valid jira issue.

Details

In response to this:

Summary by CodeRabbit

  • Improvements
      • TLS certificate validity period updated to 1 year (from 2 years).
      • Certificate refresh cycles shortened to 30 days (from 180 days) for enhanced security.
      • Added validation to ensure certificates match configured hostnames.
      • Improved certificate rotation logic with refined lifecycle management.
  • Tests
      • Expanded test coverage for certificate generation, validation, and rotation scenarios.


@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
provisioning/baremetal_secrets_test.go (1)

397-644: Good test coverage, but missing error scenario test.

The test cases comprehensively cover various SAN construction scenarios including HyperShift mode, external IPs, API VIPs, and edge cases like empty IPs. However, no test case exercises the expectError: true path (e.g., when getServerInternalIPs fails).

Consider adding a test case that triggers the error path in buildTlsHosts to ensure proper error propagation:

🧪 Example error scenario test case
{
    name: "error-from-api-server-ips",
    info: &ProvisioningInfo{
        Namespace: "openshift-machine-api",
        ProvConfig: &metal3iov1alpha1.Provisioning{
            Spec: metal3iov1alpha1.ProvisioningSpec{
                ProvisioningNetwork: metal3iov1alpha1.ProvisioningNetworkDisabled,
            },
        },
        OSClient: fakeconfigclientset.NewSimpleClientset(), // No Infrastructure object
    },
    expectError: true,
},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@provisioning/baremetal_secrets_test.go` around lines 397 - 644, Add a test
case to exercise the error path of buildTlsHosts by creating a ProvisioningInfo
that forces getServerInternalIPs to fail (e.g.,
ProvConfig.Spec.ProvisioningNetwork =
metal3iov1alpha1.ProvisioningNetworkDisabled and OSClient set to an empty
fakeconfigclientset with no Infrastructure object) and set expectError: true;
this new case should reference buildTlsHosts and ProvisioningInfo so the test
asserts an error is returned instead of comparing SANs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: db50c48b-5ff7-4f89-9938-cc34ec1e597e

📥 Commits

Reviewing files that changed from the base of the PR and between aff20f1 and 1176936.

📒 Files selected for processing (5)
  • provisioning/baremetal_crypto.go
  • provisioning/baremetal_crypto_test.go
  • provisioning/baremetal_secrets.go
  • provisioning/baremetal_secrets_test.go
  • provisioning/utils.go

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
provisioning/baremetal_crypto_test.go (1)

305-305: Use tlsExpiration/tlsRefresh constants in tests to avoid policy drift.

Line 305 and the table cases hardcode 365/30/180-day values that are already defined in production constants. Reusing those constants will keep tests aligned if policy changes again.

Suggested refactor
-	expectedValidity := 365 * 24 * time.Hour
+	expectedValidity := tlsExpiration
@@
-			certLifetime:   29 * 24 * time.Hour,
+			certLifetime:   tlsRefresh - 24*time.Hour,
@@
-			certLifetime:   30 * 24 * time.Hour,
+			certLifetime:   tlsRefresh,
@@
-			certLifetime:   31 * 24 * time.Hour,
+			certLifetime:   tlsRefresh + 24*time.Hour,
@@
-			certLifetime:   180 * 24 * time.Hour,
+			certLifetime:   tlsRefresh + 150*24*time.Hour,
@@
-			certLifetime:   365 * 24 * time.Hour,
+			certLifetime:   tlsExpiration,

As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

Also applies to: 319-340

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@provisioning/baremetal_crypto_test.go` at line 305, The test hardcodes
duration values (e.g., expectedValidity := 365 * 24 * time.Hour and several
table-case values) which should instead reference the production constants
tlsExpiration and tlsRefresh to avoid policy drift; update expectedValidity and
the affected table cases to use tlsExpiration and tlsRefresh (or tlsRefresh/
other named constants) directly, making sure to respect their types (if they are
time.Duration use them as-is; if they are day counts multiply by time.Hour*24)
and run the test to verify types align.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@provisioning/baremetal_crypto_test.go`:
- Around line 323-326: The test relies on wall-clock timing and doesn't exercise
the exact equality boundary used by isTlsCertificateExpired (which checks
NotAfter.Before(refreshAfter)), so change the expiry check to accept an injected
reference time (add a now/time parameter to isTlsCertificateExpired or overload
it) and use that injected now in computing refreshAfter; then update the tests
"rotation-triggered-at-30-days" and the similar case around lines 349-352 in
provisioning/baremetal_crypto_test.go to pass a deterministic now value and
explicitly assert the equality case (now == tlsRefresh) and the before/after
cases to verify strict '<' behavior. Ensure you update all callers of
isTlsCertificateExpired (or provide a wrapper) so production call sites default
to time.Now() while tests supply the fixed timestamp.
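
To illustrate the boundary in question, the sketch below contrasts a strict comparison (NotAfter.Before(refreshAfter), as quoted above) with an inclusive one at the exact instant where the remaining lifetime equals tlsRefresh. The function names and the fixed timestamp are hypothetical, chosen only for the example.

```go
package main

import (
	"fmt"
	"time"
)

const tlsRefresh = 30 * 24 * time.Hour // value from the PR description

// strictCheck follows the NotAfter.Before(refreshAfter) form: a certificate
// with exactly tlsRefresh left is NOT yet considered due.
func strictCheck(notAfter, now time.Time) bool {
	return notAfter.Before(now.Add(tlsRefresh))
}

// inclusiveCheck treats "at or before the threshold" as due.
func inclusiveCheck(notAfter, now time.Time) bool {
	return !notAfter.After(now.Add(tlsRefresh))
}

// compare evaluates both checks for a certificate whose remaining lifetime is
// tlsRefresh plus offsetSeconds, using a fixed (injected) reference time.
func compare(offsetSeconds int) (strict, inclusive bool) {
	now := time.Date(2026, time.March, 19, 0, 0, 0, 0, time.UTC)
	notAfter := now.Add(tlsRefresh + time.Duration(offsetSeconds)*time.Second)
	return strictCheck(notAfter, now), inclusiveCheck(notAfter, now)
}

func main() {
	for _, off := range []int{-1, 0, 1} {
		s, i := compare(off)
		fmt.Printf("offset %+ds: strict=%v inclusive=%v\n", off, s, i)
	}
}
```

The two forms disagree only at offset 0, which is why a wall-clock test is unlikely to hit that case and a fixed timestamp is needed to assert it.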

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 647af12d-8322-45e8-b853-538d8bbb99ba

📥 Commits

Reviewing files that changed from the base of the PR and between 1176936 and b71de7a.

📒 Files selected for processing (2)
  • provisioning/baremetal_crypto.go
  • provisioning/baremetal_crypto_test.go
✅ Files skipped from review due to trivial changes (1)
  • provisioning/baremetal_crypto.go

@mabulgu
Contributor Author

mabulgu commented Mar 19, 2026

/refresh-required

@mabulgu
Contributor Author

mabulgu commented Mar 19, 2026

/test e2e-agnostic-ovn

The Ironic TLS certificate was previously valid for 2 years with
rotation triggering at 180 days before expiration. Per METAL-1687,
update these to 1-year validity and 30-day rotation window.

This uses the existing custom rotation mechanism (Option C from the
spike) which was already sound — only the timing constants needed
adjustment.

Addresses METAL-1715.

Made-with: Cursor
@mabulgu
Contributor Author

mabulgu commented Mar 23, 2026

/test e2e-metal-ipi-ovn-ipv6

@elfosardo
Contributor

/approve

@openshift-ci
Contributor

openshift-ci bot commented Mar 23, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: elfosardo, mabulgu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 23, 2026
@iurygregory
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 23, 2026
@mabulgu
Contributor Author

mabulgu commented Mar 24, 2026

Hi @jadhaj, if you could mark this as "verified later" as you did for #569, I would appreciate it.

@jadhaj

jadhaj commented Mar 30, 2026

/verified later @jadhaj

@openshift-ci-robot openshift-ci-robot added verified-later verified Signifies that the PR passed pre-merge verification criteria labels Mar 30, 2026
@openshift-ci-robot
Contributor

@jadhaj: This PR has been marked to be verified later by @jadhaj.

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 773f22f and 2 for PR HEAD 1cd18e7 in total

@openshift-ci
Contributor

openshift-ci bot commented Mar 30, 2026

@mabulgu: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/e2e-metal-ipi-serial-ipv4 | 1cd18e7 | link | true | /test e2e-metal-ipi-serial-ipv4 |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 30, 2026
@openshift-ci
Contributor

openshift-ci bot commented Mar 30, 2026

PR needs rebase.
