Description
Problem Statement
accuracy_validator.py's ReferenceImplementations.calculate_var_precise (lines 161-179) computes VaR by selecting the single order statistic at index np.ceil(confidence * n) - 1. The production VaR in risk_metrics.py uses np.percentile(losses, confidence * 100), which linearly interpolates between adjacent order statistics. The two conventions yield different results for the same data and confidence level.
This means the "precise" reference will flag the production VaR as incorrect even when it is working as designed, or conversely, will pass an incorrect implementation that happens to match the ceil convention.
v1.0 Impact
High. The accuracy validator is designed to detect errors in the production code. A VaR reference that uses a different mathematical convention than the production code will produce systematic validation errors or false passes, undermining confidence in the validation framework.
Affected Code
ergodic_insurance/accuracy_validator.py, lines 161-179:
```python
@staticmethod
def calculate_var_precise(losses: np.ndarray, confidence: float) -> float:
    sorted_losses = np.sort(losses.astype(np.float64))
    index = int(np.ceil(confidence * len(sorted_losses))) - 1  # ceil-based
    index = max(0, min(index, len(sorted_losses) - 1))
    return float(sorted_losses[index])
```
Production code in risk_metrics.py:
```python
def var(self, confidence: float = 0.95) -> float:
    return float(np.percentile(self.losses, confidence * 100))  # linear interpolation
```
Current Behavior
For 100 samples, confidence = 0.95:
- Reference: index = ceil(0.95 * 100) - 1 = 94. Returns sorted_losses[94] (the 95th order statistic, no interpolation).
- Production: np.percentile(losses, 95). For 100 samples, returns interpolation between sorted_losses[94] and sorted_losses[95].
For 101 samples, confidence = 0.95:
- Reference: index = ceil(0.95 * 101) - 1 = ceil(95.95) - 1 = 95. Returns sorted_losses[95].
- Production: np.percentile(losses, 95). Returns a linearly interpolated value near sorted_losses[95].
The discrepancy is small but systematic, and can cause validation failures that are false positives.
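The 100-sample case above can be reproduced with a short script (the seeded lognormal data are arbitrary and only illustrative; the indexing is what matters):

```python
import numpy as np

# Reproduce the 100-sample / 95% confidence case.
rng = np.random.default_rng(42)
losses = np.sort(rng.lognormal(mean=10.0, sigma=1.0, size=100))

confidence = 0.95

# Reference convention: ceil-based order statistic, no interpolation.
index = int(np.ceil(confidence * len(losses))) - 1
assert index == 94
var_reference = float(losses[index])

# Production convention: linear interpolation between adjacent
# order statistics (losses[94] and losses[95] for n = 100).
var_production = float(np.percentile(losses, confidence * 100))
assert losses[94] < var_production < losses[95]
assert var_production != var_reference  # small, systematic discrepancy
```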
Expected Behavior
The reference implementation should use the same VaR definition as the production code, or the validation tolerance should explicitly account for the methodological difference.
```python
@staticmethod
def calculate_var_precise(losses: np.ndarray, confidence: float) -> float:
    # Match production code's np.percentile with linear interpolation
    sorted_losses = np.sort(losses.astype(np.float64))
    return float(np.percentile(sorted_losses, confidence * 100))
```
Alternative Solutions
- Match the production code's np.percentile convention in the reference (recommended)
- Use higher-precision interpolation (e.g., Harrell-Davis quantile estimator) as the true reference
- Document the difference and adjust validation tolerances accordingly
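As a side note, both conventions are selectable through np.percentile's method parameter (a sketch, assuming NumPy >= 1.22, which added that argument): "linear" is the production default (Hyndman & Fan Type 7), while "inverted_cdf" (Type 1) returns exactly the ceil-based order statistic used by the current reference.

```python
import numpy as np

# Arbitrary seeded data; n = 101 matches the second worked case above.
rng = np.random.default_rng(0)
losses = np.sort(rng.exponential(scale=1_000.0, size=101))
confidence = 0.95

# Production default: linear interpolation (Hyndman & Fan Type 7).
linear = float(np.percentile(losses, confidence * 100))

# Type 1 quantile: equals sorted_losses[ceil(confidence * n) - 1].
ceil_style = float(np.percentile(losses, confidence * 100, method="inverted_cdf"))

index = int(np.ceil(confidence * len(losses))) - 1  # 95 for n = 101
assert ceil_style == losses[index]
```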
Recommended Approach
Option 1: align the reference with the production convention. The purpose of the reference is to validate the production code, so they must agree on the mathematical definition.
Acceptance Criteria
- Reference VaR and production VaR agree to machine precision for the same data
- Validation passes for correctly implemented production VaR
- Unit test with various sample sizes (10, 100, 1000) and confidence levels (0.9, 0.95, 0.99)
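The third criterion could be covered by a parametrized check along these lines (a sketch: calculate_var_precise below is the proposed fixed reference, and production_var is a stand-in for the var method that lives on a class in risk_metrics.py):

```python
import numpy as np

def calculate_var_precise(losses: np.ndarray, confidence: float) -> float:
    # Proposed fix: same convention as production np.percentile.
    sorted_losses = np.sort(np.asarray(losses, dtype=np.float64))
    return float(np.percentile(sorted_losses, confidence * 100))

def production_var(losses: np.ndarray, confidence: float = 0.95) -> float:
    # Stand-in for the production implementation in risk_metrics.py.
    return float(np.percentile(losses, confidence * 100))

rng = np.random.default_rng(123)
for n in (10, 100, 1000):
    losses = rng.lognormal(mean=10.0, sigma=1.5, size=n)
    for confidence in (0.9, 0.95, 0.99):
        ref = calculate_var_precise(losses, confidence)
        prod = production_var(losses, confidence)
        # Acceptance criterion 1: agreement to machine precision.
        assert ref == prod, (n, confidence)
```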
References
- Hyndman, R.J. & Fan, Y. (1996). 'Sample quantiles in statistical packages.' American Statistician, 50(4), 361-365. (Defines 9 quantile estimation methods; NumPy uses Type 7 by default.)
- NumPy documentation: np.percentile uses linear interpolation (method "linear", Hyndman-Fan Type 7) by default.