ci: cache vivisect workspaces between CI runs to speed up tests #2881
mr-tz merged 3 commits into mandiant:master
CI always starts cold because *.viv files are in .gitignore and never committed. Benchmark on mimikatz (112 viv feature tests): 8.65s with cache vs. 118.95s without, roughly a 14x speedup.

Add a GitHub Actions cache step that persists tests/data/**/*.viv between runs. The cache key includes OS, architecture, Python version, and a hash of requirements.txt, so it auto-invalidates when the vivisect version is bumped.

On a cache miss, tests fall back to normal cold execution; the cache is a pure optimization with no hard dependency.

Fixes mandiant#556
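A minimal sketch of what such a cache step could look like. It assumes the job exposes a matrix.python-version value, and the viv- key prefix is purely illustrative; the exact key layout in the merged workflow may differ, and maintainers may prefer a commit-pinned action reference over the @v4 tag shown here.

```yaml
# Hypothetical step for the tests job in .github/workflows/tests.yml.
- name: Cache vivisect workspaces
  uses: actions/cache@v4
  with:
    # Persist every vivisect workspace generated under the test data directory.
    path: tests/data/**/*.viv
    # OS + architecture + Python version + hash of requirements.txt.
    # matrix.python-version is an assumed matrix variable name.
    key: viv-${{ runner.os }}-${{ runner.arch }}-py${{ matrix.python-version }}-${{ hashFiles('**/requirements.txt') }}
```

Because the sketch sets no restore-keys, a changed requirements.txt produces a clean cache miss instead of restoring workspaces built under the previous key.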
Please add bug fixes, new features, breaking changes and anything else you think is worthwhile mentioning to the master (unreleased) section of CHANGELOG.md. If no CHANGELOG update is needed add the following to the PR description: [x] No CHANGELOG update needed
CHANGELOG updated or no update needed, thanks! 😄
Hi @mr-tz @mike-hunhoff, I tried to run the workflow on my fork to produce a cold vs. warm benchmark (run 1 = cache miss, run 2 = cache hit) before asking for review. However, the workflow only triggers on push: branches: [master] and pull_request: branches: [master], so no runs fired on my fork's branch. The local benchmark numbers in the description (8.65s warm vs. 118.95s cold) demonstrate that the mechanism works, but I understand you may want to see actual CI timing from GitHub runners. Could you advise on the best way to verify this? Should I add a temporary workflow_dispatch trigger to allow a manual run on the fork, or would you prefer to approve the workflows here on the PR and compare run 1 vs. run 2 timings directly?
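For illustration, the temporary manual trigger mentioned above could be sketched as follows. The push and pull_request entries mirror the triggers quoted in the comment; the rest of the workflow file is not shown here and is assumed to be unchanged.

```yaml
# Sketch of the workflow's trigger block with a temporary manual trigger added.
on:
  push:
    branches: [master]
  pull_request:
    branches: [master]
  # Temporary: allows starting a run by hand (for example from the Actions tab).
  workflow_dispatch:
```

GitHub documents that workflow_dispatch only triggers a run when the workflow file exists on the repository's default branch, so on a fork this change would likely need to land on the fork's default branch before a manual run is possible.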
.github/workflows/tests.yml (outdated)
pip install -r requirements.txt
pip install -e .[dev,scripts]
- name: Cache vivisect workspaces
  uses: actions/cache@v4
let's use the pinned version like we do for other actions
One minor comment, then I think we can merge and see how this behaves.
Force-pushed from aa58981 to ed34ffd
@mr-tz Done, awaiting your review.
Thanks, we'll see how this helps now.
Summary
Fixes #556

CI always starts cold because *.viv workspace files are in .gitignore and never committed. This means vivisect re-analyzes every binary on every CI run.

This PR adds an actions/cache@v4 step to the tests job that persists tests/data/**/*.viv between runs, keyed on OS + architecture + Python version + hash(requirements.txt).

How it works
- Cache hit: .viv files are restored before pytest starts → vivisect skips re-analysis
- Cache miss: tests run cold and regenerate the .viv files, cache is saved for next run
- Cache key includes hashFiles('**/requirements.txt') so it auto-invalidates when the vivisect version changes, with no risk of stale workspaces

Benchmark (local, mimikatz, 112 viv feature tests)
Cold (no .viv, same as CI today): 118.95s
Warm (.viv cached): 8.65s

CI benchmark from this PR (run 1 cold vs run 2 warm) will be posted once workflows are approved.