Description
Note
Polyfill Status: The .NET runtime is adding native cache metrics in .NET 11 via dotnet/runtime#124140 (API-approved, milestone 11.0.0). This library serves as a polyfill for .NET 8/9/10 applications until .NET 11 reaches GA in November 2026. The API surface intentionally mirrors the approved runtime design (meter name, instrument names, tag schema) to ensure a smooth migration path.
Items below marked strikethrough were not approved by the .NET API review board (bartonjs review). Items marked with ✅ are implemented in the current codebase.
Introduction/Overview
The Memory Cache Metrics API with Eviction Tracking addresses critical performance issues in containerized environments where memory pressure can cause cache thrashing, leading to degraded application performance. This feature modernizes cache usage patterns by providing visibility into cache behavior and enabling proactive memory pressure handling.
The primary goal is to prevent performance degradation by tracking cache evictions, providing comprehensive metrics, and supporting both global default caches and component-specific caches in a backward-compatible manner.
Goals
- ✅ Provide eviction visibility: Track and report cache eviction counts with ~100ns acceptable overhead per operation
- ✅ Enable proactive monitoring: Support comprehensive cache metrics (hits, misses, eviction reasons, memory usage)
- ✅ Support modern deployment patterns: Design for container-friendly memory management with dynamic sizing capabilities
- ✅ Maintain backward compatibility: Ensure existing applications using `MemoryCacheStatistics` continue to work without modification
- ✅ Enable external integration: Support export to monitoring systems (Prometheus, Application Insights, OpenTelemetry)
- ✅ Provide flexible registration: Support both automatic DI-based registration and explicit component-specific cache registration
Non-Goals (Out of Scope)
- Automatic cache size adjustment based on memory pressure (future enhancement)
- Custom eviction policy implementation
- Cache data persistence or recovery mechanisms
- Real-time alerting or notification systems
- Performance optimization beyond the ~100ns overhead target
- Migration tools for existing custom monitoring solutions
- Distributed, nested, or hierarchical cache; use HybridCache
- Recommendations for cache size limits based on container memory constraints
User Stories
- ✅ As a service developer, I want to track eviction counts in my application's default memory cache so that I can identify when memory pressure is causing performance issues.
- ✅ As a library author, I want to register my component's cache with a metrics system so that service owners can monitor my library's cache behavior alongside their application caches.
- ✅ As a DevOps engineer, I want to export cache metrics to Prometheus so that I can create dashboards and alerts for cache performance in containerized environments.
- ✅ As a performance engineer, I want to distinguish between different eviction reasons (memory pressure vs expiration) so that I can identify true performance problems versus normal cache operation.
- ~~As an application architect, I want to configure cache metrics collection with different sampling rates so that I can balance monitoring granularity with performance overhead.~~ Not approved. Observable instruments have zero hot-path overhead, making sampling unnecessary.
Functional Requirements
- ✅ The system must extend `MemoryCacheStatistics` to include `TotalEvictedEntries` property without breaking existing applications. Implemented via custom `CacheStatistics` class; the BCL `MemoryCacheStatistics.TotalEvictions` property is approved for .NET 11 in dotnet/runtime#124140.
- ~~The system must provide a `MemoryCacheMetrics` service that can register and track multiple named caches through dependency injection.~~ Not approved. The .NET API review rejected the `IMemoryCacheMetrics` registry pattern as "non-intuitive" in favor of native metrics on `MemoryCache` via `IMeterFactory`.
- ✅ The system must support eviction tracking with configurable overhead allowing sampling rates from real-time to periodic (5-30 seconds). Implemented via observable instruments (polled, zero hot-path overhead).
- ✅ The system must distinguish between eviction reasons including memory pressure, expiration, and manual removal. Implemented: `EvictionReason.Removed` and `EvictionReason.Replaced` are excluded from eviction counts.
- ~~The system must prevent duplicate cache registration by maintaining weak references to registered cache instances.~~ Not approved. Part of the rejected `IMemoryCacheMetrics` registry pattern.
- ~~The system must handle naming conflicts by either throwing exceptions for duplicates or using automatic resolution strategies.~~ Not approved. Part of the rejected `IMemoryCacheMetrics` registry pattern.
- ✅ The system must integrate with OpenTelemetry/`IMeterFactory` to enable export to external monitoring systems. Implemented with 4 observable instruments: `cache.requests`, `cache.evictions`, `cache.entries`, `cache.estimated_size`.
- ✅ The system must provide extension methods for easy service registration in ASP.NET Core applications. Implemented: `AddNamedMeteredMemoryCache` and `DecorateMemoryCacheWithMetrics`.
- ~~The system must implement circuit breaker functionality to reduce metrics collection if overhead exceeds configurable thresholds.~~ Not approved. Observable instruments are polled (not pushed), so hot-path overhead is already near-zero; circuit breakers are unnecessary with this architecture.
- ✅ The system must support opt-in statistics tracking. ~~Automatic discovery of DI-registered caches~~ is not approved (part of the rejected registry pattern). Opt-in tracking implemented via `MeteredMemoryCacheOptions` and DI extension methods.
- ✅ The system must provide comprehensive metrics including hit/miss ratios, cache size, item lifecycle data, and operation latency. Implemented: hits, misses, evictions, entry count, estimated size, and calculated hit ratio via `CacheStatistics`.
- ~~The system must use weak references to prevent memory leaks when clients forget to unregister caches.~~ Not approved. Part of the rejected `IMemoryCacheMetrics` registry pattern.
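The polled-instrument approach named in these requirements can be sketched with nothing but `System.Diagnostics.Metrics`. This is an illustration under stated assumptions, not the library's implementation: the meter name `Example.Cache` and the tag key `cache.request.type` are placeholders, and the names approved in dotnet/runtime#124140 may differ. Only the instrument name `cache.requests` is taken from the list above.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;
using System.Threading;

// Hot-path state: plain counters updated with Interlocked, no instrument calls.
long hits = 0, misses = 0;

// "Example.Cache" is a placeholder meter name (assumption for this sketch).
using var meter = new Meter("Example.Cache");

// The callback runs only when a listener or exporter polls the instrument,
// so cache reads and writes never pay for metric recording.
meter.CreateObservableCounter<long>(
    "cache.requests",
    () => new[]
    {
        // "cache.request.type" is a hypothetical tag key used for illustration.
        new Measurement<long>(Interlocked.Read(ref hits),
            new KeyValuePair<string, object?>("cache.request.type", "hit")),
        new Measurement<long>(Interlocked.Read(ref misses),
            new KeyValuePair<string, object?>("cache.request.type", "miss")),
    });

// A MeterListener stands in for the OpenTelemetry SDK's periodic collection.
var observed = new Dictionary<string, long>();
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "Example.Cache")
        l.EnableMeasurementEvents(instrument);
};
listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
{
    foreach (var tag in tags)
        observed[$"{instrument.Name}:{tag.Value}"] = value;
});
listener.Start();

// Simulated cache traffic: two hits and one miss.
Interlocked.Increment(ref hits);
Interlocked.Increment(ref hits);
Interlocked.Increment(ref misses);

// Poll once, as an exporter would on its collection interval.
listener.RecordObservableInstruments();
Console.WriteLine(
    $"hits={observed["cache.requests:hit"]}, misses={observed["cache.requests:miss"]}");
```

Because the counters are read only at collection time, the per-operation cost is a single `Interlocked.Increment`, which is why sampling and circuit breakers add little in this design.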
Design Considerations
- ✅ API Surface: Custom `CacheStatistics` class with `TotalEvictions` (polyfills the approved BCL `MemoryCacheStatistics.TotalEvictions` property)
- ~~Registration Pattern: Use explicit registration for component-specific caches with potential future automatic discovery~~ Rejected by .NET API review
- ✅ Configuration Tiers: Provide no-config defaults, simple predefined profiles, and advanced fine-grained control
- ~~Memory Safety: Implement weak reference patterns to prevent memory leaks from unregistered caches~~ Part of rejected registry pattern
- ✅ Performance: Design for minimal overhead with observable instruments (zero hot-path allocation)
Technical Considerations
- ✅ Integration with existing DI container: Leverage `IMeterFactory` for OpenTelemetry compatibility
- ✅ Thread safety: Ensure metrics collection is thread-safe for high-concurrency scenarios (implemented via `Interlocked` atomics)
- ~~Weak reference management: Implement proper cleanup of disposed cache references~~ Part of rejected registry pattern
- ~~Sampling strategies: Support configurable sampling rates to balance accuracy with performance~~ Not approved; unnecessary with observable instruments
- ✅ Export mechanisms: Design pluggable exporters for different monitoring backends (implemented via standard OpenTelemetry pipeline)
Success Metrics
- ✅ Performance overhead: Maintain <100ns per cache operation when metrics are enabled — Validated via BenchmarkDotNet suites (CacheBenchmarks, MetricsOverheadBenchmarks, ContentionBenchmarks)
- ✅ Adoption rate: Achieve integration in existing applications without requiring code changes (for basic scenarios)
- ✅ Diagnostic value: Enable identification of cache thrashing patterns that were previously invisible
- Container efficiency: Reduce memory-related performance issues in containerized deployments by 20% — Not yet measured
- ✅ Monitoring integration: Support export to at least 3 major monitoring platforms (Prometheus, Application Insights, Datadog). Supported via the standard OpenTelemetry exporter pipeline.
Open Questions
- Automatic cache sizing documentation: (future enhancement) Provide recommendations for cache configuration based on container memory constraints
- ~~Metric retention: How long should in-memory metrics be retained before aggregation/export, and should this be configurable?~~ Resolved: metrics use observable instruments polled by the OTel SDK; retention is controlled by the configured exporter, not by this library.
- ~~Performance testing scope: What specific performance benchmarks should be established to validate the <100ns overhead target across different cache usage patterns?~~ Resolved: three benchmark suites are implemented: `CacheBenchmarks` (operation overhead), `MetricsOverheadBenchmarks` (instrumentation cost), and `ContentionBenchmarks` (concurrent contention).
- Migration documentation: What level of detail is needed in migration guides for applications migrating from this polyfill to native .NET 11 `MemoryCache` metrics?
Parent Tasks for Memory Cache Metrics API with Eviction Tracking
1. ✅ Extend MemoryCacheStatistics with Eviction Tracking
- Enhance Microsoft's `MemoryCacheStatistics` to include `TotalEvictedEntries` property while maintaining backward compatibility. Implemented via custom `CacheStatistics` class in `MeteredMemoryCache`.
Sub-tasks:
- Create `CacheStatistics` class with `TotalEvictions` property (polyfills BCL `MemoryCacheStatistics.TotalEvictions`)
- Implement backward-compatible statistics via `MeteredMemoryCache.GetCurrentStatistics()`
- Add thread-safe eviction counting mechanism (via `Interlocked` atomics)
- Create mapping between `EvictionReason` and eviction categories (excludes `Removed`/`Replaced`)
- Add validation and error handling for statistics collection
- Write comprehensive unit tests for statistics extensions
- Add integration tests with existing cache implementations
- Create performance benchmarks to validate <100ns overhead target
- Update API documentation and usage examples
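The eviction-counting sub-tasks above can be illustrated with a post-eviction callback. This is a hedged sketch, not the code in `MeteredMemoryCache`: it assumes the `Microsoft.Extensions.Caching.Memory` package is referenced, and the `OnEvicted` function and `evictions` counter are hypothetical names. The callback is also invoked directly with each reason so the filtering is visible without waiting for the cache to evict anything.

```csharp
using System;
using System.Threading;
using Microsoft.Extensions.Caching.Memory;

// Eviction total, updated atomically because callbacks can run on any thread.
long evictions = 0;

// Count an eviction unless the caller removed or replaced the entry itself.
void OnEvicted(object key, object? value, EvictionReason reason, object? state)
{
    if (reason is EvictionReason.Removed or EvictionReason.Replaced)
        return;
    Interlocked.Increment(ref evictions);
}

// Registration on a real cache entry: the cache invokes OnEvicted later,
// when the entry leaves the cache for any reason.
using var cache = new MemoryCache(new MemoryCacheOptions());
var entryOptions = new MemoryCacheEntryOptions()
    .RegisterPostEvictionCallback(OnEvicted);
cache.Set("key", "value", entryOptions);

// Direct calls to show which reasons are counted.
OnEvicted("a", null, EvictionReason.Capacity, null); // counted
OnEvicted("b", null, EvictionReason.Expired, null);  // counted
OnEvicted("c", null, EvictionReason.Removed, null);  // ignored
OnEvicted("d", null, EvictionReason.Replaced, null); // ignored

Console.WriteLine($"evictions={Interlocked.Read(ref evictions)}");
```

Excluding `Removed` and `Replaced` keeps the counter focused on entries the cache dropped on its own, which is the signal that separates memory pressure and expiration from ordinary application writes.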
2. Create MemoryCacheMetrics Service Infrastructure
- ~~Develop a centralized service for registering and tracking multiple named caches with weak reference management and conflict resolution~~ Not approved by .NET API review. The registry pattern (`IMemoryCacheMetrics`) was rejected as "non-intuitive." The approved approach uses native `IMeterFactory` on `MemoryCache` directly.
Sub-tasks:
- ~~Design `IMemoryCacheMetrics` interface with registration and tracking methods~~
- ~~Implement `MemoryCacheMetrics` service with weak reference management~~
- ~~Create cache registration system with naming conflict resolution~~
- ~~Implement automatic cleanup of disposed cache references~~
- ~~Add thread-safe concurrent access patterns for multi-cache scenarios~~
- ~~Create cache discovery mechanism for DI-registered caches~~
- ~~Implement metrics aggregation across multiple named caches~~
- ~~Add cache lifecycle management (registration, tracking, cleanup)~~
- ~~Write comprehensive unit tests for service functionality~~
- ~~Create integration tests for multi-cache scenarios~~
- ~~Add service registration extensions for dependency injection~~
- ~~Document service usage patterns and best practices~~
3. Implement Circuit Breaker and Sampling Mechanisms
- ~~Add configurable overhead protection with circuit breaker functionality and sampling rates to maintain the ~100ns performance target~~ — Not approved. Observable instruments are polled (not pushed), so hot-path overhead is already near-zero. Circuit breakers and sampling are unnecessary with this architecture.
Sub-tasks:
- ~~Design circuit breaker threshold configuration~~
- ~~Implement sampling rate controls~~
- ~~Add overhead monitoring and adaptive throttling~~
- ~~Write tests for circuit breaker behavior~~
- ~~Document sampling configuration options~~
4. Add Advanced Cache Management Features
- ~~Implement component-specific cache registration, automatic discovery of DI-registered caches, and enhanced export mechanisms~~ Not approved. Component-specific registration and automatic discovery are part of the rejected `IMemoryCacheMetrics` registry pattern.
Sub-tasks:
- ~~Implement component-specific cache registration~~
- ~~Add automatic discovery of DI-registered caches~~
- ~~Create enhanced export mechanisms for cache metrics~~
- ~~Write integration tests for advanced scenarios~~
- ~~Document advanced cache management patterns~~
5. Integrate OpenTelemetry and External Monitoring
- Enhance existing OpenTelemetry integration with support for multiple monitoring backends (Prometheus, Application Insights) and comprehensive metrics export
Sub-tasks:
- Enhance existing OpenTelemetry integration with new metrics
- Create Prometheus metrics exporter with proper label handling
- Implement Application Insights integration with custom metrics
- Add support for Grafana dashboard configuration
- Create pluggable exporter architecture for extensibility (via standard OTel pipeline)
- Create configuration system for multiple export destinations
- Write integration tests for all supported monitoring backends
- Create example configurations for popular monitoring setups
- Document monitoring setup and troubleshooting guides