Skip to content

format-patch transport cannot represent merge commits, causing PR creation failures for non-linear histories #24110

@dsyme

Description

@dsyme

Problem

The current patch transport mechanism in gh-aw uses git format-patch BASE..BRANCH --stdout to serialize agent commits, and git am --3way to apply them in the safe_outputs step. This works correctly for linear commit histories but silently drops merge commits, causing patch application failures when agents create non-linear histories that include merges.

git format-patch limitation

git format-patch was designed for emailing linear patch series. When it encounters a commit graph with merge commits, it silently omits them and outputs only the non-merge commits reachable from the branch tip. This means:

  • Merge conflict resolutions (which live inside merge commits) are lost
  • Commits from different merged branches may have incompatible diff contexts for the same files
  • git am then fails with CONFLICT errors when trying to apply patches that assume pre-merge file states

Real-world failure: Lean Squad Run #124

Run: https://github.com/dsyme/fv-squad/actions/runs/23810256669
Repository: dsyme/fv-squad

What the agent did (correctly)

The Lean Squad agent was instructed to merge open PRs before starting its own work. It:

  1. Started on main at base commit cfaa3a9
  2. Merged 3 open Lean Squad PR branches locally:
  3. Created its feature branch lean-squad/find-conflict-by-term-r118-23810256669
  4. Made its own commit (find_conflict_by_term spec + proofs)
  5. Called create_pull_request

This produced a commit graph with merge commits:

cfaa3a9 (origin/main)
   │
   ├─── merge-commit-1 (inflights PR merged cleanly)      ← 2 parents
   │
   ├─── merge-commit-2 (progress PR, conflicts resolved)  ← 2 parents
   │
   ├─── merge-commit-3 (is-up-to-date PR, resolved)       ← 2 parents
   │
   └─── agent commit (find_conflict_by_term)

What format-patch produced

git format-patch cfaa3a9..branch-tip --stdout serialized this as 6 linear patches, dropping all 3 merge commits. The patches came from the individual PR branch commits (which were reachable through the merge parents) plus the agent's own commit.

Patches 1 and 2 both had the same title (feat(fv): prove inflightsConc_add_correct...) and both modified CORRESPONDENCE.md and TARGETS.md, but each assumed the pre-merge (cfaa3a9) state of those files — the conflict resolutions that reconciled them were gone.

What happened in safe_outputs

Applying: feat(fv): prove inflightsConc_add_correct...    ← PATCH 1/6 ✓
Applying: feat(fv): prove inflightsConc_add_correct...    ← PATCH 2/6 ✗
CONFLICT (content): Merge conflict in formal-verification/CORRESPONDENCE.md
CONFLICT (content): Merge conflict in formal-verification/TARGETS.md
Patch failed at 0002

The fallback (re-creating branch at original base commit cfaa3a9 and applying without --3way) also failed.

Impact

Any agent workflow that merges branches locally — which is a natural and correct thing to do when building on top of pending PRs — will hit this limitation. The agent behavior was correct; the transport mechanism cannot handle it.

Options

1. git bundle transport

git bundle can represent arbitrary history including merge commits. The safe_outputs step would use git bundle unbundle + git merge or git pull to incorporate the changes. This preserves the full DAG.

2. Direct branch push

Skip serialization entirely — have the agent push its branch directly and create the PR from the pushed ref. This avoids the format-patch/am roundtrip but changes the security model of safe_outputs (the agent would need push access).

3. Rebase before patch generation

The generate_git_patch step could detect merge commits and automatically rebase the branch onto the base ref before running format-patch. This would linearize the history (baking merge resolutions into each rebased commit) at the cost of losing the original merge structure.

4. Squash before patch generation

Similar to rebase, but produce a single commit. Simplest but loses all commit granularity.

5. Detect and warn

At minimum, generate_git_patch could detect merge commits in the range and emit a warning or error, rather than silently producing a broken patch. This would give agents actionable feedback.

6. Agent-side workaround (current)

Instruct agents to rebase or squash their work before calling create_pull_request when they have merged other branches. This pushes the complexity to the agent prompt but works with the current infrastructure.

Recommendation

Option 5 (detect and warn) should be implemented regardless — silent failures are the worst outcome. Beyond that, Option 1 (git bundle) or Option 3 (automatic rebase) seem most practical for fully solving the problem.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions