Add temporary diagnostics convergence validation to virtual-integration workflow #1465

Draft
Copilot wants to merge 8 commits into main from copilot/add-diagnostics-convergence-validation

Copilot AI commented Feb 18, 2026

Embeds diagnostics convergence validation as a temporary job in the existing virtual-integration.yml workflow so it can be tested before a separate workflow file is created (new workflow files are not visible in Actions until merged).

Changes

New Script: ci/diagnostics_convergence.py

Analyzes diagnostics artifacts across workflow runs to identify error patterns:

  • Persistent errors (>50% occurrence rate) vs transient errors
  • Convergence trends (improving/degrading/stable)
  • Outputs: convergence.json, convergence.md, manifest.json
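A minimal sketch of the persistent/transient split described above, not the code in ci/diagnostics_convergence.py. The function name, the input shape (one set of error keys per run), and the output dictionary layout are assumptions made for illustration; only the >50% threshold comes from this description.

```python
from collections import Counter

# From the PR description: an error is persistent above a 50% occurrence rate.
PERSISTENT_THRESHOLD = 0.5


def classify_errors(runs):
    """Split error keys into persistent and transient by occurrence rate.

    `runs` is a list with one set of error keys per workflow run.
    This input shape is hypothetical; the real script reads diagnostics JSON.
    """
    total = len(runs)
    result = {"persistent": {}, "transient": {}}
    if total == 0:
        return result

    # Count each key at most once per run so the rate is "runs affected / runs".
    counts = Counter(key for run in runs for key in set(run))

    for key, seen in counts.items():
        rate = seen / total
        bucket = "persistent" if rate > PERSISTENT_THRESHOLD else "transient"
        result[bucket][key] = {"runs": seen, "total": total, "rate": rate}
    return result
```

With this rule an error seen in exactly half of the runs counts as transient, because the comparison is strictly greater than 50%.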

New Job: diagnostics-convergence-validation

  • Dependencies: deploy-kind, deploy-on-prem, deploy-oxm-profile
  • Conditional execution: if: always() with early exit when all target jobs succeed
  • Permissions: actions: read, contents: read (minimal)
  • Workflow:
    1. Early exit check (skips if all target jobs succeeded)
    2. Lists last 20 completed workflow runs via github-script
    3. Downloads diagnostics-<job>-... artifacts from target jobs
    4. Runs convergence analyzer on extracted diagnostics JSON files
    5. Uploads convergence outputs and writes summary to $GITHUB_STEP_SUMMARY
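The early-exit check in step 1 can be sketched as a small predicate. This is an illustration under assumptions, not the job's actual implementation: the function name is hypothetical, and the result strings are the values GitHub Actions exposes through `needs.<job>.result` (success, failure, cancelled, skipped).

```python
# The three target jobs named in the PR description.
TARGET_JOBS = ("deploy-kind", "deploy-on-prem", "deploy-oxm-profile")


def should_run_convergence(job_results):
    """Return True when at least one target job did not succeed.

    `job_results` maps a job name to its result string. A job missing from
    the mapping is treated as non-success (an assumption of this sketch),
    so the analysis errs on the side of running.
    """
    return any(job_results.get(job) != "success" for job in TARGET_JOBS)


if __name__ == "__main__":
    results = {
        "deploy-kind": "success",
        "deploy-on-prem": "failure",
        "deploy-oxm-profile": "success",
    }
    if should_run_convergence(results):
        print("At least one target job did not succeed; running convergence.")
    else:
        # Skipping is reported as a normal outcome, not as a failure.
        print("All target jobs succeeded; convergence skipped.")
```

Keeping the decision in one predicate makes it easy to run the check before any artifact download, which keeps the skipped path cheap.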

Example Usage

After a failed deploy job, the convergence report shows:

## Persistent Errors (1 patterns)
### default/api-server:CrashLoopBackOff
- Occurrence Rate: 85.7% (12/14 runs)
- First Seen: 2026-02-10T08:00:00Z

## Convergence Trend
Overall Trend: DEGRADING
- Average errors (first half): 2.3
- Average errors (second half): 4.1
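One way a trend like the one in this report could be derived is to compare the mean error count of the older half of the runs with the newer half. The sketch below assumes that approach; the function name, the `tolerance` parameter, and the return shape are hypothetical and are not taken from ci/diagnostics_convergence.py.

```python
def convergence_trend(error_counts, tolerance=0.0):
    """Label the trend by comparing the older half of runs to the newer half.

    `error_counts` holds one error count per run, ordered oldest to newest.
    `tolerance` is a hypothetical dead band: differences within it are STABLE.
    """
    if len(error_counts) < 2:
        # Not enough runs to form two halves.
        return {"trend": "STABLE", "first_half_avg": None, "second_half_avg": None}

    mid = len(error_counts) // 2
    first, second = error_counts[:mid], error_counts[mid:]
    first_avg = sum(first) / len(first)
    second_avg = sum(second) / len(second)

    if second_avg > first_avg + tolerance:
        trend = "DEGRADING"  # more errors in recent runs
    elif second_avg < first_avg - tolerance:
        trend = "IMPROVING"  # fewer errors in recent runs
    else:
        trend = "STABLE"

    return {
        "trend": trend,
        "first_half_avg": first_avg,
        "second_half_avg": second_avg,
    }
```

With an odd number of runs the extra run falls into the newer half, which weights the comparison slightly toward recent behavior.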

Cleanup

Remove the diagnostics-convergence-validation job once validation is complete. The script can remain for future use.


Note: Job is clearly marked as TEMPORARY in comments and job name.

Original prompt

Create a TEMPORARY validation PR (separate from PR #1459) in repo open-edge-platform/edge-manageability-framework that embeds the diagnostics convergence logic into the existing workflow .github/workflows/virtual-integration.yml so it can be validated in Actions even when new workflow files are not visible.

Requirements:

  • Add a new job (e.g., diagnostics-convergence-validation) to .github/workflows/virtual-integration.yml.

  • The job should use if: always() so it is visible on every run, BUT it must do an early exit unless at least one of these jobs is non-success (failed/cancelled/timed_out):

    • deploy-kind
    • deploy-on-prem
    • deploy-oxm-profile
  • Use minimal permissions required for this job: actions: read, contents: read.

  • The job should:

    1. checkout the repo
    2. setup python
    3. use actions/github-script to list the last N=20 runs of the Virtual Integration workflow (no branch/event filter by default; keep it open for future)
    4. download artifacts named diagnostics-<job>-... for those three jobs (same selection rules as PR #1459, "Add automated diagnostics convergence workflow for failure pattern analysis")
    5. run the convergence analyzer script from PR #1459 (ci/diagnostics_convergence.py) if present; if not present on this branch, include a minimal inline equivalent or add the script as part of this temporary PR (prefer reusing the same script code to keep validation realistic)
    6. upload the convergence outputs (convergence.json, convergence.md, manifest.json) as an artifact
    7. write a short summary to $GITHUB_STEP_SUMMARY
  • The early-exit behavior should not mark the workflow as failed; it should simply log that convergence was skipped because the target jobs succeeded.
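Steps 4 and 7 of the job could look roughly like the sketch below. The selection rules of PR #1459 are not reproduced in this prompt, so a plain prefix match on diagnostics-<job>- is assumed here; both function names are hypothetical. The only platform behavior relied on is that GitHub Actions sets the GITHUB_STEP_SUMMARY environment variable to a file that steps can append Markdown to.

```python
import os

# The three target jobs named in the requirements.
TARGET_JOBS = ("deploy-kind", "deploy-on-prem", "deploy-oxm-profile")


def select_diagnostics_artifacts(artifact_names):
    """Keep artifact names that start with diagnostics-<job>- for a target job.

    A simple prefix match is an assumption of this sketch; the actual
    selection rules live in PR #1459.
    """
    prefixes = tuple(f"diagnostics-{job}-" for job in TARGET_JOBS)
    return [name for name in artifact_names if name.startswith(prefixes)]


def write_step_summary(markdown, summary_path=None):
    """Append Markdown to the step summary file; return True if a file was written.

    Falls back to stdout when no path is available, for example when the
    script runs outside GitHub Actions.
    """
    path = summary_path or os.environ.get("GITHUB_STEP_SUMMARY")
    if not path:
        print(markdown)
        return False
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(markdown.rstrip("\n") + "\n")
    return True
```

Opening the summary file in append mode matters because earlier steps in the same job may already have written to it.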

Notes:

  • Since this is a temporary validation PR, clearly label the job name and/or add comments in YAML indicating it should be removed after validation.
  • Ensure the new job depends on deploy-kind, deploy-on-prem, deploy-oxm-profile via needs: so it can read their results in if: conditions and so it runs after they complete.
  • Keep runtime reasonable (avoid downloading artifacts if skipping).

Deliverable:

  • Open a PR against main with these changes.
  • Keep the PR scope limited to the validation job and any required helper script additions.

This pull request was created from Copilot chat.



Copilot AI and others added 2 commits February 18, 2026 16:46
…on workflow

Co-authored-by: hwindlas <108932456+hwindlas@users.noreply.github.com>
Co-authored-by: hwindlas <108932456+hwindlas@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add diagnostics convergence validation job to workflow" to "Add temporary diagnostics convergence validation to virtual-integration workflow" Feb 18, 2026
Copilot AI requested a review from hwindlas February 18, 2026 16:51