Skip to content

feat: add configurable on-cache-hit hooks to gcache#121

Open
lan17 wants to merge 7 commits intomainfrom
codex/gcache-on-cache-hit-hooks
Open

feat: add configurable on-cache-hit hooks to gcache#121
lan17 wants to merge 7 commits intomainfrom
codex/gcache-on-cache-hit-hooks

Conversation

@lan17
Copy link
Collaborator

@lan17 lan17 commented Mar 9, 2026

Summary

This PR adds a generic cache-hit hook API to gcache so callers can validate a cached value before it is returned. It introduces a global default hook on GCacheConfig, while still allowing each @cached(...) use case to override that hook or disable it explicitly.

The initial API is intentionally small: hooks can either return the cached value unchanged or evict the current layer and fall back. There is no repair or replacement path in this version.

What Changed

  • add public cache-hit hook types and export them from the package API
  • add global on_cache_hit support to GCacheConfig
  • add decorator-level override and explicit disable semantics in GCache.cached(...)
  • run hook decisions on true cache hits in both local and Redis-backed cache layers
  • make EvictAndFallback reuse the normal miss/write-back path so evicted layers are rewarmed immediately
  • skip hook execution for Redis entries that have already been invalidated by watermark
  • add hook action and error metrics
  • add tests for global inheritance, decorator override, decorator disable, local eviction fallback, remote eviction fallback, async hooks, invalid hook decisions, and invalidation-plus-hook behavior

Why

Galileo needs a framework-agnostic extension point for rejecting unsafe cached ORM values, especially detached-and-expired SQLAlchemy objects held in local cache by reference. This PR provides that extension point without making gcache aware of SQLAlchemy.

The final behavior is intentionally narrow:

  • hooks see only the cache key and layer for a usable cache hit
  • invalidated Redis entries are refreshed before any hook runs
  • eviction paths behave like normal misses so cache layers rewarm immediately
  • detached lazy-load failures remain the responsibility of the callers hook policy, not gcache itself

Testing

  • poetry run pytest tests/
  • poetry run ruff check src tests
  • poetry run mypy src
  • poetry run mypy --package tests --namespace-packages

Related Work

@codecov
Copy link

codecov bot commented Mar 9, 2026

Codecov Report

❌ Patch coverage is 96.26168% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.21%. Comparing base (eb3fa9f) to head (4ad1ad3).

Files with missing lines Patch % Lines
src/gcache/_internal/redis_cache.py 88.88% 2 Missing ⚠️
src/gcache/_internal/local_cache.py 94.44% 1 Missing ⚠️
src/gcache/gcache.py 92.30% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #121      +/-   ##
==========================================
- Coverage   97.47%   97.21%   -0.26%     
==========================================
  Files          13       14       +1     
  Lines         673      754      +81     
==========================================
+ Hits          656      733      +77     
- Misses         17       21       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a configurable cache-hit hook API to gcache, allowing callers to validate cached values on true hits and either return the cached value or evict the current layer and fall back (with metrics and tests covering the behavior).

Changes:

  • Add public cache-hit hook context/decision types (CacheCallContext, ReturnCached, EvictAndFallback) and export them from the package.
  • Add global on_cache_hit support to GCacheConfig, plus decorator-level override/disable semantics in GCache.cached(...).
  • Execute hooks on true hits in both Local and Redis cache layers, with hook action/error metrics and expanded test coverage.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
tests/test_gcache.py Adds tests for global inheritance, decorator override/disable, eviction fallback behavior, async hooks, invalid decisions, and invalidation interactions.
src/gcache/gcache.py Adds decorator parameter plumbing and resolution logic for global vs decorator-level hook behavior.
src/gcache/config.py Defines hook context/decision types and adds on_cache_hit to GCacheConfig.
src/gcache/_internal/cache_interface.py Extends the cache interface get() contract to accept on_cache_hit as a kw-only parameter.
src/gcache/_internal/wrappers.py Propagates on_cache_hit through CacheController and CacheChain.
src/gcache/_internal/local_cache.py Runs hook logic on local cache hits and supports evict+fallback.
src/gcache/_internal/redis_cache.py Runs hook logic on Redis hits, supports evict+fallback, and skips hook execution for invalidated entries.
src/gcache/_internal/noop_cache.py Updates get() signature to match the new interface and ignore on_cache_hit.
src/gcache/_internal/cache_hit.py New helper to execute hooks (sync/async), normalize decisions, and emit hook metrics.
src/gcache/_internal/metrics.py Adds Prometheus counters for hook actions and hook errors.
src/gcache/init.py Exports the new hook-related public API symbols.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

track_for_invalidation: bool = False,
default_config: GCacheKeyConfig | None = None,
serializer: Serializer | None = None,
on_cache_hit: CacheHitHook | Literal[False] | _UseGlobalOnCacheHit = _USE_GLOBAL_ON_CACHE_HIT,
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GCache.cached() is a public API, but the on_cache_hit parameter type includes the private _UseGlobalOnCacheHit sentinel. This leaks an internal implementation detail into type hints/docs and makes the signature harder for callers to understand.

Consider changing the public annotation/default to something like on_cache_hit: CacheHitHook | Literal[False] | None = None (where None means “inherit from config”) and keep any sentinel strictly internal (or use overloads) so the public type surface doesn’t mention private classes.

Suggested change
on_cache_hit: CacheHitHook | Literal[False] | _UseGlobalOnCacheHit = _USE_GLOBAL_ON_CACHE_HIT,
on_cache_hit: CacheHitHook | Literal[False] | None = _USE_GLOBAL_ON_CACHE_HIT,

Copilot uses AI. Check for mistakes.
if isinstance(decision, BypassCurrentLayer):
return await fallback()

return cached_value
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run_cache_hit_hook() only returns ReturnCached, EvictAndFallback, or BypassCurrentLayer, so the final return cached_value after handling those cases is unreachable. Leaving dead code here makes the control flow look incomplete and can hide future logic errors.

Consider removing the final branch or replacing it with an explicit assert_never(decision)/raise AssertionError to enforce exhaustiveness if new decision types are added later.

Suggested change
return cached_value
raise AssertionError(f"Unexpected cache hit decision: {decision!r}")

Copilot uses AI. Check for mistakes.
Comment on lines +195 to +196
if not isinstance(decision, ReturnCached):
return await fallback()
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After run_cache_hit_hook(), the only possible decision types are ReturnCached, EvictAndFallback, or BypassCurrentLayer. Since the first two are handled explicitly above, the if not isinstance(decision, ReturnCached): return await fallback() branch is redundant/unreachable.

Consider removing that check (or converting it into an assertion) to keep the decision handling exhaustive and easier to reason about.

Suggested change
if not isinstance(decision, ReturnCached):
return await fallback()
assert isinstance(decision, ReturnCached)

Copilot uses AI. Check for mistakes.
Comment on lines +47 to +48
def get_local_cache_for_use_case(gcache: GCache, use_case: str) -> LocalCache:
return cast(LocalCache, gcache._local_cache.wrapped)
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_local_cache_for_use_case() takes a use_case argument but never uses it (it always returns gcache._local_cache.wrapped). This is misleading for readers and makes future refactors riskier.

Either drop the unused parameter or use it to validate/access the expected per-use-case cache entry (e.g., assert the use_case cache exists before returning).

Copilot uses AI. Check for mistakes.
Comment on lines +819 to +821
await gcache.ainvalidate("Test", "123")
await asyncio.sleep(0.01)

Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The await asyncio.sleep(0.01) makes this test timing-dependent and potentially flaky (it’s implicitly relying on the invalidation watermark being strictly in the past before the subsequent fetch can be cached).

Prefer making this deterministic by avoiding a fixed sleep. For example, adjust the Redis fallback caching condition so watermark_ms == now_ms is treated as cacheable (so no wait is needed), or explicitly set a small future_buffer_ms and then wait/poll until the watermark is observed to be in the past before asserting hook call counts.

Copilot uses AI. Check for mistakes.
Comment on lines +69 to +76
if isinstance(decision, EvictAndFallback):
GCacheMetrics.HIT_HOOK_ACTION_COUNTER.labels(
key.use_case,
key.key_type,
layer.name,
"evict",
decision.reason or "none",
).inc()
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EvictAndFallback.reason is used directly as a Prometheus label value (reason=decision.reason). Since this is a public API and callers can supply arbitrary strings, it creates a real risk of unbounded label cardinality (e.g., if someone includes IDs/details in the reason), which can destabilize metrics storage.

Consider constraining reason to a small enum-like set (or validating/sanitizing it before labeling, e.g., mapping unknown/long values to a fixed bucket) to keep cardinality bounded.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants