fix: preserve benchmark data across mkdocs deployments #50

thomasht86 · 2026-01-15T09:47:24Z

Summary

Update mkdocs.yml to save and restore benchmark files before deploying to gh-pages
Update CI.yml benchmark job to push benchmark image to gh-pages instead of main
Update README to reference benchmark image from gh-pages URL

Problem

MkDocs uses --force which wipes the entire gh-pages branch, deleting accumulated benchmark data
CI benchmark job tried to push to main, but was blocked by branch protection (github-actions[bot] not in bypass list)

Solution

MkDocs now preserves dev/bench/data.js, dev/bench/index.html, and benchmark.jpg from gh-pages before deploying, then restores them after copying new docs
Benchmark image is now pushed to gh-pages (unprotected) instead of main
README references the gh-pages URL for the benchmark image
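
The save-and-restore step described above can be sketched as follows. This is a minimal illustration, not the workflow's actual script: the helper names backup_bench_files and restore_bench_files are hypothetical, and only the three file paths are taken from this description.

```shell
# Hypothetical helpers; only the three paths below come from the PR description.
BENCH_FILES="dev/bench/data.js dev/bench/index.html benchmark.jpg"

# Copy each benchmark file out of a git ref (e.g. gh-pages) into backup_dir.
# Files that do not exist in the ref are skipped silently.
backup_bench_files() {
  ref="$1"
  backup_dir="$2"
  for f in $BENCH_FILES; do
    mkdir -p "$backup_dir/$(dirname "$f")"
    # git show <ref>:<path> prints the file contents as stored in that ref.
    # If the path is absent, remove the empty file the redirect created.
    git show "$ref:$f" > "$backup_dir/$f" 2>/dev/null || rm -f "$backup_dir/$f"
  done
}

# Copy whatever was backed up into target_dir (e.g. the new gh-pages tree).
restore_bench_files() {
  backup_dir="$1"
  target_dir="$2"
  for f in $BENCH_FILES; do
    if [ -f "$backup_dir/$f" ]; then
      mkdir -p "$target_dir/$(dirname "$f")"
      cp "$backup_dir/$f" "$target_dir/$f"
    fi
  done
}
```

In a workflow step this might be called as backup_bench_files gh-pages /tmp/bench-backup before the new docs are copied in, and restore_bench_files /tmp/bench-backup . afterwards.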

Test plan

Merge this PR
Verify mkdocs workflow completes successfully
Verify benchmark workflow adds new data point after mkdocs
Check https://thomasht86.github.io/httpr/dev/bench/ shows accumulated data points
Verify README benchmark image loads from gh-pages URL

🤖 Generated with Claude Code

- Update mkdocs.yml to save and restore benchmark files (data.js, index.html, benchmark.jpg) before deploying to gh-pages
- Update CI.yml benchmark job to push benchmark image to gh-pages instead of main (avoids branch protection issues)
- Update README to reference benchmark image from gh-pages URL

This ensures benchmark data accumulates over time instead of being wiped on each mkdocs deployment.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

claude · 2026-01-15T09:49:37Z

Pull Request Review

Summary

This PR addresses a critical issue where MkDocs deployment was wiping accumulated benchmark data from the gh-pages branch, and fixes the benchmark image deployment which was blocked by branch protection rules. The solution is well-thought-out and addresses the root causes effectively.

✅ Strengths

Clear Problem Identification: The PR body clearly explains both issues:
- MkDocs --force wiping benchmark data
- Branch protection blocking pushes to main
Appropriate Solution: Moving benchmark image deployment to gh-pages is the right approach since it is an unprotected branch and logically belongs with the docs.
Data Preservation Logic: The backup/restore mechanism in mkdocs.yml properly handles three files (data.js, index.html, benchmark.jpg) using git show to extract from gh-pages history.
Idempotency: Both workflows handle edge cases well with conditional checks (|| true, diff checks before commit).

🔍 Issues & Concerns

Critical Issue: Git Reference Bug in mkdocs.yml:80

The commit message references HEAD after checking out gh-pages, which means it captures the previous gh-pages commit hash rather than the source commit from main/master being deployed.

Current code (lines 79-80):

git commit -m "Deployed $(git rev-parse --short HEAD~0) with MkDocs"

Problem: After git checkout gh-pages, HEAD points to the gh-pages branch tip, not the commit being deployed.

Fix: Capture the source commit BEFORE switching branches:

# Add this before line 36 (before git fetch)
SOURCE_COMMIT=$(git rev-parse --short HEAD)

# Then on line 79, use:
git commit -m "Deployed ${SOURCE_COMMIT} with MkDocs"

Minor: Incomplete Error Handling

mkdocs.yml:59 - If mkdocs build fails silently, cp -r site/* . could copy nothing and break deployment.

Suggestion: Add validation:

if [ ! -d "site" ] || [ -z "$(ls -A site)" ]; then
  echo "Error: site/ directory is empty or missing"
  exit 1
fi
cp -r site/* .

Minor: Race Condition Risk

If a tag push ends up triggering both workflows, there is a small risk that mkdocs restores an old benchmark.jpg after the benchmark workflow has generated a new one.

Likelihood: Low, since triggers are different (main push vs tag), but worth documenting.

Mitigation: Add a comment explaining the ordering dependency, or use workflow needs if they are ever combined.

Code Quality: Redundant Checkout Depth

mkdocs.yml:14-15 - fetch-depth: 0 fetches full history but is not needed since you do git fetch origin gh-pages:gh-pages later. Consider removing unless needed for other reasons.

🔒 Security Considerations

✅ Good: Uses GitHub bot credentials consistently
✅ Good: No user input in scripts (no injection risks)
✅ Good: Force push limited to gh-pages (low risk)

⚠️ Consider: git rm -rf . on line 56 is destructive. If the subsequent cp -r site/* . fails, you are left with an empty gh-pages branch. The validation suggestion above would mitigate this.

📊 Testing & Coverage

The test plan in the PR description is appropriate and covers the key scenarios.

Suggestion: Explicitly test the first-deployment case (when gh-pages does not exist yet). The || true and || git checkout --orphan should handle it, but confirming would be valuable.
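
The first-deployment path could be exercised locally with a sketch like the one below. It assumes a hypothetical helper named checkout_pages_branch that mirrors the || true and --orphan fallbacks mentioned above; the actual commands in mkdocs.yml may differ.

```shell
# Hypothetical helper: switch to the pages branch, creating it on first deploy.
checkout_pages_branch() {
  branch="${1:-gh-pages}"
  # On the very first deployment the remote branch does not exist yet,
  # so a failing fetch is expected and tolerated.
  git fetch origin "$branch:$branch" 2>/dev/null || true
  if git show-ref --verify --quiet "refs/heads/$branch"; then
    # Normal case: the branch already exists locally after the fetch.
    git checkout -q "$branch"
  else
    # First deployment: start a branch with no parent commit.
    # Note that --orphan keeps the current files staged, which is why the
    # workflow clears the tree before copying the built site.
    git checkout -q --orphan "$branch"
  fi
}
```

Running this in a scratch repository that has no gh-pages branch, then checking that HEAD points at an unborn gh-pages branch, would confirm the fallback behaves as intended.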

⚡ Performance

mkdocs.yml: Using git show for backup is efficient for small files - good choice.

CI.yml: Using /tmp for image transfer is appropriate for CI environments.

📝 Documentation

README.md: The image URL change is correct. Consider adding alt text for accessibility:

![Benchmark Results](https://thomasht86.github.io/httpr/benchmark.jpg)

🎯 Recommendations

Fix the git reference bug (critical) - capture source commit before switching branches
Add validation for site/ directory existence before cp (safety)
Consider removing fetch-depth: 0 (minor optimization)
Document workflow ordering if both can run on tag push (risk mitigation)
Add alt text to benchmark image in README (accessibility)

✨ Verdict

Approve with requested changes. The approach is sound and solves real problems effectively. The git reference issue should be fixed, but it does not break functionality - just makes the commit message misleading. The other suggestions are nice-to-haves that improve robustness.

Great work identifying and fixing the MkDocs force-push issue! 🚀

- Fix git reference bug: capture source commit BEFORE switching to gh-pages branch (was showing gh-pages commit in message)
- Add validation for site/ directory before copying to prevent deploying empty content if mkdocs build fails

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

claude · 2026-01-15T10:25:47Z

Pull Request Review - Follow-up Analysis

I've reviewed PR #50 and notice there was already a comprehensive review provided. Looking at the current state of the code, I can see that the critical git reference bug has been fixed - the code now correctly captures SOURCE_COMMIT before switching branches (line 37 in mkdocs.yml) and uses it in the commit message (line 86).

Let me provide additional observations on the current implementation:

✅ Resolved Issues

Git Reference Bug FIXED: SOURCE_COMMIT is now captured at line 37 before branch switching
Site Directory Validation ADDED: Lines 62-65 now check if site/ exists and is non-empty before copying
Proper Error Handling: Both workflows have appropriate || true fallbacks and conditional checks

🎯 Additional Observations

1. Workflow Execution Order

The current setup has two workflows that could interact:

mkdocs.yml: Triggers on push to main/master
CI.yml benchmark: Triggers on tag creation (refs/tags/*)

Potential race condition: If a tag is pushed that points to a commit on main, both workflows trigger simultaneously. The benchmark job could write benchmark.jpg to gh-pages while mkdocs is restoring the old version.

Recommendation: Consider adding a workflow dependency or documenting the expected execution order. Since tags are typically pushed separately from branch commits, the race window is small but non-zero.

2. fetch-depth: 0 in mkdocs.yml

Line 15 sets fetch-depth: 0 (full history), but the workflow only needs:

Current commit for SOURCE_COMMIT
gh-pages branch (fetched separately at line 40)

Impact: Minor performance overhead fetching unnecessary history.

Recommendation: Remove fetch-depth: 0 unless needed for other purposes (mkdocs plugins, etc.). Default shallow clone is sufficient.

3. Benchmark Data Persistence Strategy

The current approach uses git show gh-pages:path to extract files. This is efficient but has one edge case:

If someone manually deletes these files from gh-pages, they won't be restored even if they exist in the working tree after mkdocs build.

Current behavior: Only restores from gh-pages history, never from newly built docs.

Is this correct? Probably yes - the benchmark data is generated by CI, not mkdocs. But worth confirming this is intentional.

4. Error Message Clarity

The error message on lines 63-64 says "site/ directory is empty or missing" and then exits with status 1. On its own this may not make clear in the GitHub Actions logs why the workflow failed (for example, that mkdocs build produced no output).

Minor suggestion: Add more context to help debug failures faster.
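
One way to add that context is to emit a GitHub Actions error annotation alongside the message. The sketch below is illustrative only: fail_with_context and check_site_dir are hypothetical helper names, and the message wording is an assumption.

```shell
# Hypothetical helper: print an error annotation that GitHub Actions
# surfaces in the run summary, then report failure to the caller.
fail_with_context() {
  # $1 = short title, $2 = detailed message
  echo "::error title=$1::$2"
  return 1
}

# Hypothetical helper: distinguish a missing build directory from an empty one.
check_site_dir() {
  site_dir="$1"
  if [ ! -d "$site_dir" ]; then
    fail_with_context "MkDocs output missing" \
      "$site_dir does not exist; did 'mkdocs build' fail?"
    return 1
  fi
  if [ -z "$(ls -A "$site_dir")" ]; then
    fail_with_context "MkDocs output empty" \
      "$site_dir exists but contains no files"
    return 1
  fi
}
```

A step could then run check_site_dir site || exit 1, so the log states which of the two conditions was hit.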

🔒 Security Review

✅ All git operations use proper bot credentials
✅ No external input in scripts (no injection vectors)
✅ Force push limited to unprotected gh-pages branch
✅ File operations use safe paths (/tmp/bench-backup, site/)

One consideration: git rm -rf . on line 59 removes all tracked files before the site/ validation on line 62. If validation fails after this point, gh-pages would be left in a broken state until the next successful run force-pushes over it.

Safer order would be to validate site/ exists before removing files from gh-pages. This ensures you never have an empty gh-pages branch due to a validation failure.
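
The reordering can be sketched as below. This is a simplified stand-in rather than the workflow's code: replace_tree_with_site is a hypothetical helper, and it uses find and rm in place of git rm -rf . so that it runs outside a repository.

```shell
# Hypothetical helper: replace the contents of deploy_dir with a built site,
# refusing to delete anything unless the build output looks usable.
replace_tree_with_site() {
  site_dir="$1"     # build output, e.g. site/
  deploy_dir="$2"   # checked-out gh-pages working tree

  # 1. Validate first. Nothing has been removed yet, so failing here is harmless.
  if [ ! -d "$site_dir" ] || [ -z "$(ls -A "$site_dir")" ]; then
    echo "Error: $site_dir is empty or missing; leaving $deploy_dir untouched" >&2
    return 1
  fi

  # 2. Only now run the destructive step (stand-in for git rm -rf .).
  #    The .git directory is kept so the branch history survives.
  find "$deploy_dir" -mindepth 1 -maxdepth 1 ! -name .git -exec rm -rf {} +

  # 3. Copy the new site in, including dotfiles.
  cp -r "$site_dir"/. "$deploy_dir"/
}
```

With this ordering, a failed build leaves the existing tree exactly as it was.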

📊 Testing Recommendations

Beyond the test plan in the PR description, consider testing:

First deployment (no gh-pages branch exists): Should create orphan branch
Missing benchmark files: Should deploy successfully without errors
mkdocs build failure: Should exit cleanly without corrupting gh-pages
Concurrent workflow execution: If both workflows run simultaneously on a tag

⚡ Performance Notes

Good choices:

Using git show for small file extraction (efficient)
Using /tmp for temporary storage (scratch space that does not need to outlive the job)
Conditional commits (avoids no-op pushes)
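
For reference, a conditional commit of this kind can look like the sketch below. The helper name commit_if_changed and the log message are hypothetical, and the sketch assumes a git identity has already been configured, as the workflows do with the bot credentials.

```shell
# Hypothetical helper: stage everything and commit only if something changed.
commit_if_changed() {
  msg="$1"
  git add -A
  # git diff --cached --quiet exits 0 when the index matches HEAD,
  # i.e. there is nothing new to commit.
  if git diff --cached --quiet; then
    echo "No changes to deploy"
    return 0
  fi
  git commit -q -m "$msg"
}
```

Skipping the commit when nothing changed also means the following push has nothing new to send.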

Benchmark image format: The generate_image.py saves as JPG at 80 DPI. Consider PNG for better quality (charts have sharp lines) or higher DPI for retina displays. Current approach prioritizes loading speed which is reasonable.

📝 Documentation Suggestions

Add workflow comments explaining the benchmark data preservation strategy
Document workflow ordering in CI.yml and mkdocs.yml headers
README.md: Consider adding alt text for accessibility

🎯 Final Verdict

LGTM with minor suggestions ✅

The critical issues from the previous review have been addressed. The remaining suggestions are optimizations and edge case handling that would improve robustness but aren't blockers.

Required: None

Recommended:

Move site/ validation before git rm -rf . (safety)
Consider removing fetch-depth: 0 (performance)

Nice-to-have:

Document workflow execution order
Add more detailed error messages
Consider image format/quality for benchmarks

Great work fixing the MkDocs data preservation issue! This is a solid solution to a complex problem. 🚀

thomasht86 enabled auto-merge January 15, 2026 10:24

thomasht86 merged commit 6e9ffd4 into main Jan 15, 2026
19 checks passed
