Commit 46bfbfa

chore(specs): make CI health a hard gate in maintenance checklist (everruns#1092)
## Summary

- Rename "Nightly CI" → "CI Health" in maintenance spec and skill
- Add "CI on main is green" as a required check (was missing — only nightly/fuzz were checked)
- Mark the section as a **hard gate** — maintenance pass cannot complete while red
- Expand escalation policy to cover all workflows (CI, nightly, fuzz)
- Add concrete `gh` commands for inspecting CI on main to the maintain skill
- Add "never silently skip" instruction for the agent

## Context

The maintenance pass (PR everruns#1063, Apr 5) shipped while:

- CI on main was red for 9 days (`cargo vet` missing `fastrand:2.4.0` cert, everruns#1091)
- Fuzz had 5 failures across 3 distinct bugs (everruns#1088, everruns#1089, everruns#1090)

Root cause: the spec and skill only checked nightly/fuzz workflows, not the main CI workflow, and had no hard-gate language preventing merge while red.

## Test plan

- [x] Spec changes are documentation-only, no code impact
- [x] Verify maintain skill section 10 now covers CI on main, nightly, and fuzz
- [x] Verify escalation policy applies to all three workflow types
1 parent 76726f4 commit 46bfbfa

4 files changed: +41 -15 lines changed

.claude/commands/maintain.md

Lines changed: 23 additions & 7 deletions
@@ -74,13 +74,29 @@ Make the simplifications. Run tests after each change. The goal is less code tha

 `AGENTS.md` and `CLAUDE.md` reflect current specs, commands, tooling, and workflows.

-### 10. Nightly CI is healthy
-
-Nightly and fuzz workflows green for past week. Fuzz targets compile. Git-sourced deps resolve.
-
-Key tools: `gh run list --workflow=nightly.yml --limit 7`, `gh run list --workflow=fuzz.yml --limit 7`
-
-If failures persist >2 days, escalate per the policy in `specs/012-maintenance.md`.
+### 10. All CI is healthy (HARD GATE)
+
+**This section is a blocker.** The maintenance pass MUST NOT be marked complete
+while any of these checks are red.
+
+1. **CI on main is green** — check the latest CI run on the `main` branch. If
+any job (Audit, Test, Lint, Examples, Fuzz Compile Check) fails, fix it
+before proceeding. Common failures: `cargo vet` missing certifications,
+dependency audit advisories, clippy warnings.
+2. **Nightly workflow green** for past 7 days.
+3. **Fuzz workflow green** for past 7 days. If a fuzz target crashes, open a
+GitHub issue with the crash artifact, reproduction command, and base64 input.
+4. Fuzz targets compile. Git-sourced deps resolve.
+
+Key tools:
+- `gh run list --workflow=ci.yml --branch=main --limit 5` (CI on main)
+- `gh run list --workflow=nightly.yml --limit 7` (nightly)
+- `gh run list --workflow=fuzz.yml --limit 7` (fuzz)
+- `gh api repos/OWNER/REPO/actions/runs/RUN_ID/jobs` (inspect failed jobs)
+
+If failures persist >2 days, escalate per `specs/012-maintenance.md`.
+If the agent cannot fix a failure, it MUST open a GitHub issue and report the
+pass as blocked — never silently skip.

 ## Execution
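The hard-gate checks added to `.claude/commands/maintain.md` above could be evaluated mechanically along the following lines. This is a minimal sketch, not part of the commit: it assumes the run lists come from `gh run list ... --json status,conclusion,createdAt`, and the helper names (`parse_ts`, `latest_run_green`, `window_green`, `gate`) are hypothetical. Treating an empty run list as red is also an assumption, chosen so that a missing workflow cannot pass silently.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(ts: str) -> datetime:
    """Parse an ISO-8601 timestamp such as "2024-04-05T12:00:00Z"."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def latest_run_green(runs: list) -> bool:
    """True if the most recent completed run concluded with success."""
    completed = [r for r in runs if r.get("status") == "completed"]
    if not completed:
        return False  # assumption: no evidence of a green run counts as red
    latest = max(completed, key=lambda r: parse_ts(r["createdAt"]))
    return latest.get("conclusion") == "success"


def window_green(runs: list, now: datetime, days: int = 7) -> bool:
    """True if every completed run in the last `days` days succeeded."""
    cutoff = now - timedelta(days=days)
    recent = [
        r for r in runs
        if r.get("status") == "completed" and parse_ts(r["createdAt"]) >= cutoff
    ]
    return bool(recent) and all(r.get("conclusion") == "success" for r in recent)


def gate(ci_runs: list, nightly_runs: list, fuzz_runs: list, now: datetime) -> list:
    """Return the names of red checks; an empty list means the gate passes."""
    failures = []
    if not latest_run_green(ci_runs):
        failures.append("ci")
    if not window_green(nightly_runs, now):
        failures.append("nightly")
    if not window_green(fuzz_runs, now):
        failures.append("fuzz")
    return failures
```

A caller would load each list with `json.loads` from the corresponding `gh run list` output and report the pass as blocked whenever `gate(...)` returns a non-empty list.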

specs/012-maintenance.md

Lines changed: 13 additions & 4 deletions
@@ -111,20 +111,28 @@ dependency rot, or security gaps ship in a release.
 - Build/test commands work
 - Pre-PR checklist covers current tooling

-### Nightly CI
+### CI Health

+- **CI on main is green** — the latest CI run on `main` must pass. Any failure
+(audit, test, lint, examples) is a blocker that must be fixed before
+proceeding with the rest of the maintenance pass.
 - Nightly and fuzz workflows green for past week
 - Fuzz targets compile
 - Git-sourced dependencies still resolve

-#### Nightly Escalation Policy
+#### Escalation Policy

-Failures persisting **>2 consecutive days** are blocking:
+Failures persisting **>2 consecutive days** on any workflow (CI, nightly, fuzz)
+are blocking:
 1. Open GitHub issue with label `ci:nightly`
 2. Link failing run(s)
 3. Assign to most recent contributor in failing area
 4. If upstream dep change: pin to known-good rev, open follow-up issue

+**This section is a hard gate.** The maintenance pass MUST NOT be marked
+complete or merged while any of the above checks are red. If the agent cannot
+fix a failure, it must open a GitHub issue and report the pass as blocked.
+
 ## Deferred Items

 When a maintenance pass identifies issues too large to fix inline (e.g.
@@ -149,7 +157,8 @@ Sections dependencies, tests, examples, code quality, and nightly CI are fully
 automatable. Security, documentation, specs, simplification, and agent config
 require human or agent review.

-Nightly check enforced by `just check-nightly`, called by `just release-check`.
+CI health check enforced by `just check-nightly` (nightly + fuzz) and manual
+inspection of CI on `main` (audit, test, lint). Called by `just release-check`.

 ## Invocation
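The escalation rule in `specs/012-maintenance.md` above ("failures persisting >2 consecutive days") might be checked as sketched below. The spec does not define how the streak is measured, so this sketch assumes one reading: the current unbroken run of non-successful completed runs, measured from its oldest run to now. The function names are hypothetical, and any conclusion other than `success` (for example a cancelled run) is counted as failing, which is also an assumption.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(ts: str) -> datetime:
    """Parse an ISO-8601 timestamp such as "2024-04-05T12:00:00Z"."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def streak_start(runs: list):
    """Timestamp of the oldest run in the current failing streak, or None."""
    completed = sorted(
        (r for r in runs if r.get("status") == "completed"),
        key=lambda r: parse_ts(r["createdAt"]),
        reverse=True,  # newest first
    )
    start = None
    for run in completed:
        if run.get("conclusion") == "success":
            break  # the streak ends at the most recent green run
        start = parse_ts(run["createdAt"])
    return start


def needs_escalation(runs: list, now: datetime,
                     threshold: timedelta = timedelta(days=2)) -> bool:
    """True if the workflow has been failing for longer than `threshold`."""
    start = streak_start(runs)
    return start is not None and now - start > threshold
```

When `needs_escalation` returns true for any workflow, the numbered escalation steps in the spec apply.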

supply-chain/audits.toml

Lines changed: 5 additions & 0 deletions
@@ -6,6 +6,11 @@ who = "Mykhailo Chalyi <mike@chaliy.name>"
 criteria = "safe-to-deploy"
 version = "0.39.1"

+[[audits.fastrand]]
+who = "Mykhailo Chalyi <mike@chaliy.name>"
+criteria = "safe-to-deploy"
+version = "2.4.0"
+
 [[audits.hybrid-array]]
 who = "Mykhailo Chalyi <mike@chaliy.name>"
 criteria = "safe-to-deploy"

supply-chain/config.toml

Lines changed: 0 additions & 4 deletions
@@ -490,10 +490,6 @@ criteria = "safe-to-deploy"
 version = "0.17.0"
 criteria = "safe-to-deploy"

-[[exemptions.fastrand]]
-version = "2.4.0"
-criteria = "safe-to-deploy"
-
 [[exemptions.ff]]
 version = "0.13.1"
 criteria = "safe-to-deploy"