-
Notifications
You must be signed in to change notification settings - Fork 322
format-patch transport cannot represent merge commits, causing PR creation failures for non-linear histories #24110
Description
Problem
The current patch transport mechanism in gh-aw uses git format-patch BASE..BRANCH --stdout to serialize agent commits, and git am --3way to apply them in the safe_outputs step. This works correctly for linear commit histories but silently drops merge commits, causing patch application failures when agents create non-linear histories that include merges.
git format-patch limitation
git format-patch was designed for emailing linear patch series. When it encounters a commit graph with merge commits, it silently omits them and outputs only the non-merge commits reachable from the branch tip. This means:
- Merge conflict resolutions (which live inside merge commits) are lost
- Commits from different merged branches may have incompatible diff contexts for the same files
git amthen fails with CONFLICT errors when trying to apply patches that assume pre-merge file states
Real-world failure: Lean Squad Run #124
Run: https://github.com/dsyme/fv-squad/actions/runs/23810256669
Repository: dsyme/fv-squad
What the agent did (correctly)
The Lean Squad agent was instructed to merge open PRs before starting its own work. It:
- Started on
mainat base commitcfaa3a9 - Merged 3 open Lean Squad PR branches locally:
- PR [copilot] output field #124 (inflights) — merged cleanly
- PR Implement output field for automatic GitHub issue creation with comprehensive documentation #125 (progress) — had conflicts in
CORRESPONDENCE.mdandTARGETS.md, resolved with--theirs - PR [copilot] Compiler: Upload GITHUB_AW_OUTPUT as workflow artifact "aw_output.txt" #126 (is-up-to-date) — same conflicts, resolved similarly
- Created its feature branch
lean-squad/find-conflict-by-term-r118-23810256669 - Made its own commit (find_conflict_by_term spec + proofs)
- Called
create_pull_request
This produced a commit graph with merge commits:
cfaa3a9 (origin/main)
│
├─── merge-commit-1 (inflights PR merged cleanly) ← 2 parents
│
├─── merge-commit-2 (progress PR, conflicts resolved) ← 2 parents
│
├─── merge-commit-3 (is-up-to-date PR, resolved) ← 2 parents
│
└─── agent commit (find_conflict_by_term)
What format-patch produced
git format-patch cfaa3a9..branch-tip --stdout serialized this as 6 linear patches, dropping all 3 merge commits. The patches came from the individual PR branch commits (which were reachable through the merge parents) plus the agent's own commit.
Patches 1 and 2 both had the same title (feat(fv): prove inflightsConc_add_correct...) and both modified CORRESPONDENCE.md and TARGETS.md, but each assumed the pre-merge (cfaa3a9) state of those files — the conflict resolutions that reconciled them were gone.
What happened in safe_outputs
Applying: feat(fv): prove inflightsConc_add_correct... ← PATCH 1/6 ✓
Applying: feat(fv): prove inflightsConc_add_correct... ← PATCH 2/6 ✗
CONFLICT (content): Merge conflict in formal-verification/CORRESPONDENCE.md
CONFLICT (content): Merge conflict in formal-verification/TARGETS.md
Patch failed at 0002
The fallback (re-creating branch at original base commit cfaa3a9 and applying without --3way) also failed.
Impact
Any agent workflow that merges branches locally — which is a natural and correct thing to do when building on top of pending PRs — will hit this limitation. The agent behavior was correct; the transport mechanism cannot handle it.
Options
1. git bundle transport
git bundle can represent arbitrary history including merge commits. The safe_outputs step would use git bundle unbundle + git merge or git pull to incorporate the changes. This preserves the full DAG.
2. Direct branch push
Skip serialization entirely — have the agent push its branch directly and create the PR from the pushed ref. This avoids the format-patch/am roundtrip but changes the security model of safe_outputs (the agent would need push access).
3. Rebase before patch generation
The generate_git_patch step could detect merge commits and automatically rebase the branch onto the base ref before running format-patch. This would linearize the history (baking merge resolutions into each rebased commit) at the cost of losing the original merge structure.
4. Squash before patch generation
Similar to rebase, but produce a single commit. Simplest but loses all commit granularity.
5. Detect and warn
At minimum, generate_git_patch could detect merge commits in the range and emit a warning or error, rather than silently producing a broken patch. This would give agents actionable feedback.
6. Agent-side workaround (current)
Instruct agents to rebase or squash their work before calling create_pull_request when they have merged other branches. This pushes the complexity to the agent prompt but works with the current infrastructure.
Recommendation
Option 5 (detect and warn) should be implemented regardless — silent failures are the worst outcome. Beyond that, Option 1 (git bundle) or Option 3 (automatic rebase) seem most practical for fully solving the problem.