Summary
The GitHub fetcher clones repositories from scratch on every mctl up invocation, even when the ref hasn't changed. Implement a persistent clone cache to avoid redundant network I/O, saving 5-30s per sync for large repositories.
Context
The GitHub fetcher in internal/fetchers/github.go creates a temp directory (line ~40), clones into it, extracts files, then deletes the temp dir (line ~44 via defer). There is no reuse of previously cloned data between syncs. For large repos (e.g., envoyproxy/gateway), this means re-downloading hundreds of MB on every sync even when nothing has changed.
Key files:
internal/fetchers/github.go — Fetch() method, os.MkdirTemp() + os.RemoveAll() pattern
internal/pipeline/pipeline.go — calls fetcher.Fetch() per dependency
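For reference, the clone-and-discard flow described above can be sketched roughly as follows. This is an illustration only, not the actual contents of internal/fetchers/github.go: fetchUncached, cloneFunc, the extract callback, and the temp-dir prefix are hypothetical names, and the clone step is injected so the sketch does not depend on git or the network.

```go
package main

import (
	"fmt"
	"os"
)

// cloneFunc is a hypothetical stand-in for the fetcher's real clone step.
type cloneFunc func(repo, ref, dir string) error

// fetchUncached mirrors the pattern described in Context: create a temp
// directory, clone into it, extract files, and delete it on return.
func fetchUncached(repo, ref string, clone cloneFunc, extract func(dir string) error) error {
	tmp, err := os.MkdirTemp("", "mctl-clone-*") // prefix is an assumption
	if err != nil {
		return fmt.Errorf("create temp dir: %w", err)
	}
	// The clone is discarded here, so the next sync starts from scratch.
	defer os.RemoveAll(tmp)

	if err := clone(repo, ref, tmp); err != nil {
		return fmt.Errorf("clone %s@%s: %w", repo, ref, err)
	}
	return extract(tmp)
}
```

Because the directory is removed on every return path, nothing from one sync is available to the next, even when the ref is unchanged.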
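Acceptance Criteria
- Cloned repositories persist in a cache directory (~/.mycelium/cache/clones/<source-hash>/)
- A cache hit skips git clone entirely
- mctl up with a warm cache is measurably faster (no network I/O for cached deps)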
Technical Approach
- Define cache directory structure: ~/.mycelium/cache/clones/<sha256(repo+ref)>/
- Before cloning, check if cache dir exists and is valid (contains .git/)
- On cache hit: use cached directory directly for file extraction
- On cache miss: clone into cache dir instead of temp dir, skip RemoveAll
- Add a mctl cache clean command or automatic TTL-based cleanup
- For branch refs (not pinned commits), consider git fetch to update rather than full re-clone
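The lookup steps above could be sketched as below. This is a minimal sketch under stated assumptions, not a definitive implementation: cacheDir, isValidClone, ensureClone, and cloneFunc are hypothetical names, the clone step is injected rather than shelling out to git, and a NUL separator is added between repo and ref when hashing (a small deviation from the plain sha256(repo+ref) above) so that distinct pairs cannot concatenate to the same input.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
)

// cloneFunc is a hypothetical stand-in for the fetcher's real clone step.
type cloneFunc func(repo, ref, dir string) error

// cacheDir returns <root>/<hex sha256 of repo and ref>.
// The NUL separator is an assumption; it keeps ("a", "bc") and
// ("ab", "c") from hashing to the same key.
func cacheDir(root, repo, ref string) string {
	sum := sha256.Sum256([]byte(repo + "\x00" + ref))
	return filepath.Join(root, hex.EncodeToString(sum[:]))
}

// isValidClone reports whether dir contains a .git directory.
func isValidClone(dir string) bool {
	info, err := os.Stat(filepath.Join(dir, ".git"))
	return err == nil && info.IsDir()
}

// ensureClone returns a directory holding repo at ref and whether it was
// a cache hit. The clone callback runs only on a miss.
func ensureClone(root, repo, ref string, clone cloneFunc) (string, bool, error) {
	dir := cacheDir(root, repo, ref)
	if isValidClone(dir) {
		return dir, true, nil // cache hit: no network I/O
	}
	// Clear any partial entry left behind by an interrupted run.
	if err := os.RemoveAll(dir); err != nil {
		return "", false, fmt.Errorf("reset cache entry: %w", err)
	}
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return "", false, fmt.Errorf("create cache entry: %w", err)
	}
	if err := clone(repo, ref, dir); err != nil {
		os.RemoveAll(dir) // do not keep a half-written clone
		return "", false, fmt.Errorf("clone %s@%s: %w", repo, ref, err)
	}
	return dir, false, nil
}
```

On a hit the caller would extract files directly from the returned directory. For branch refs, a hit could instead trigger git fetch inside that directory before extraction, since the branch may have moved since the entry was created.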
Dependencies
None — standalone improvement.
Out of Scope
- Concurrent fetching of multiple deps (separate issue)
- Shallow clone optimization (could be a follow-up)