Problem
The shell-script-editing infra fix process library uses a single one-shot HTTP check (curl) immediately after applying a fix to a shell script in order to verify the fix worked. This snapshot verification is fundamentally flawed: it interrogates the server that was already running before the fix was applied, not the server as it will behave under the fixed code on its next scheduled execution cycle.
The process reports COMPLETED / FIXED, but the fix may be broken and will only reveal itself when the next system cycle fires (e.g., a launchd health monitor running every 5 minutes).
Root Cause
The verification step performs a single curl request immediately after patching the script. Because the server process is still alive from the pre-patch run, the HTTP check succeeds regardless of whether the patched code is correct. The check proves only that the server is still up, not that the fix is safe to run.
Key gap: there is no dry-run / static analysis of the patched script before application, no wait-and-re-check across at least one full monitor cycle, and no post-cycle survival check after the patched code has actually executed end-to-end.
Evidence
Real-world reproduction in the trip-planner project:
- The
fix-stale-sync babysitter process added a call to check_sync_freshness() on the healthy-server code path inside the health-monitor shell script.
check_sync_freshness() internally calls sync_project(), which executes rm -rf $DEV_DIR (atomic directory swap) while the server is running and serving from that directory.
- The single-
curl snapshot check ran immediately after the patch was written. The server was still alive from its pre-patch state — check passed, process reported COMPLETED.
- Five minutes later, launchd fired the health monitor.
check_sync_freshness() executed, rm -rf $DEV_DIR deleted the live server directory, and the server crashed.
- This failure loop repeated across 5 consecutive runs, each declaring success, none surviving the next launchd cycle.
Expected Behavior
- After applying a fix to a shell script, the verification step must wait for at least one full monitor/cycle period to elapse and then confirm the server is still healthy.
- Ideally, a static analysis or dry-run pass is performed on the patched script before it is applied, catching destructive operations (e.g.,
rm -rf inside a live-server code path) before they can execute.
- The process must not report
COMPLETED until the server has survived a full cycle under the new code.
Actual Behavior
- A single
curl request is issued immediately after patching.
- The request succeeds because the server is still alive from before the fix.
- The process reports
COMPLETED / FIXED.
- On the next system cycle, the patched code executes, crashes the server, and the failure is never attributed to the fix.
Repro Steps
- Set up a project with a launchd-managed health monitor shell script that runs every 5 minutes.
- Use the infra fix process to patch the script, injecting a function call on the healthy-server code path.
- Ensure the injected function contains a destructive operation that would crash the server (e.g.,
rm -rf of the server's working directory).
- Observe that the single-
curl verification immediately after patching reports success.
- Wait 5 minutes (one launchd cycle).
- Observe the server has crashed; the fix declared success is actually broken.
Environment
| Field |
Value |
| babysitter |
0.0.187 |
| Platform |
macOS darwin |
| Node |
22.18.0 |
| Scheduler |
launchd (KeepAlive=true) |
| Monitor interval |
5 minutes |
Related Issues
Proposed Solution
- Static pre-flight analysis: Before applying any patch to a shell script, parse the diff for dangerous patterns (
rm -rf, rmdir, directory reassignment) in code paths that execute while the server is live. Abort with an actionable error if found.
- Post-cycle survival check: After applying the fix, wait for at least one full monitor cycle period (or a configurable grace window) before issuing the verification check. The process must not advance to
COMPLETED until the check passes after the patched code has had the opportunity to execute.
- Multi-point verification: Replace the single
curl snapshot with a minimum of two checks — one immediately (baseline) and one after the next system cycle — and require both to pass.
- Explicit completion gate: Gate the
COMPLETED / FIXED status behind the multi-point verification result, making it impossible to report success before the post-cycle check resolves.
Problem
The shell-script-editing infra fix process library uses a single one-shot HTTP check (
curl) immediately after applying a fix to a shell script in order to verify the fix worked. This snapshot verification is fundamentally flawed: it interrogates the server that was already running before the fix was applied, not the server as it will behave under the fixed code on its next scheduled execution cycle.The process reports
COMPLETED/FIXED, but the fix may be broken and will only reveal itself when the next system cycle fires (e.g., a launchd health monitor running every 5 minutes).Root Cause
The verification step performs a single
curlrequest immediately after patching the script. Because the server process is still alive from the pre-patch run, the HTTP check succeeds regardless of whether the patched code is correct. The check proves only that the server is still up, not that the fix is safe to run.Key gap: there is no dry-run / static analysis of the patched script before application, no wait-and-re-check across at least one full monitor cycle, and no post-cycle survival check after the patched code has actually executed end-to-end.
Evidence
Real-world reproduction in the trip-planner project:
fix-stale-syncbabysitter process added a call tocheck_sync_freshness()on the healthy-server code path inside the health-monitor shell script.check_sync_freshness()internally callssync_project(), which executesrm -rf $DEV_DIR(atomic directory swap) while the server is running and serving from that directory.curlsnapshot check ran immediately after the patch was written. The server was still alive from its pre-patch state — check passed, process reportedCOMPLETED.check_sync_freshness()executed,rm -rf $DEV_DIRdeleted the live server directory, and the server crashed.Expected Behavior
rm -rfinside a live-server code path) before they can execute.COMPLETEDuntil the server has survived a full cycle under the new code.Actual Behavior
curlrequest is issued immediately after patching.COMPLETED/FIXED.Repro Steps
rm -rfof the server's working directory).curlverification immediately after patching reports success.Environment
Related Issues
Proposed Solution
rm -rf,rmdir, directory reassignment) in code paths that execute while the server is live. Abort with an actionable error if found.COMPLETEDuntil the check passes after the patched code has had the opportunity to execute.curlsnapshot with a minimum of two checks — one immediately (baseline) and one after the next system cycle — and require both to pass.COMPLETED/FIXEDstatus behind the multi-point verification result, making it impossible to report success before the post-cycle check resolves.