Description
Problem Statement
accuracy_validator.py's ReferenceImplementations.calculate_var_precise (lines 161-179) computes VaR by selecting the single order statistic at index np.ceil(confidence * n) - 1. The production VaR in risk_metrics.py uses np.percentile(losses, confidence * 100), which linearly interpolates between adjacent order statistics. The two conventions yield different results for the same data and confidence level.
This means the "precise" reference will flag the production VaR as incorrect even when it is working as designed, or conversely, will pass an incorrect implementation that happens to match the ceil convention.
v1.0 Impact
High. The accuracy validator is designed to detect errors in the production code. A VaR reference that uses a different mathematical convention than the production code will produce systematic validation errors or false passes, undermining confidence in the validation framework.
Affected Code
ergodic_insurance/accuracy_validator.py, lines 161-179:
```python
@staticmethod
def calculate_var_precise(losses: np.ndarray, confidence: float) -> float:
    sorted_losses = np.sort(losses.astype(np.float64))
    index = int(np.ceil(confidence * len(sorted_losses))) - 1  # ceil-based
    index = max(0, min(index, len(sorted_losses) - 1))
    return float(sorted_losses[index])
```
Production code in risk_metrics.py:
```python
def var(self, confidence: float = 0.95) -> float:
    return float(np.percentile(self.losses, confidence * 100))  # linear interpolation
```
Current Behavior
For 100 samples, confidence = 0.95:
- Reference: index = ceil(0.95 * 100) - 1 = 94. Returns sorted_losses[94] (the 95th order statistic, no interpolation).
- Production: np.percentile(losses, 95). For 100 samples, returns interpolation between sorted_losses[94] and sorted_losses[95].
For 101 samples, confidence = 0.95:
- Reference: index = ceil(0.95 * 101) - 1 = ceil(95.95) - 1 = 95. Returns sorted_losses[95].
- Production: np.percentile(losses, 95). Returns a linearly interpolated value near sorted_losses[95].
The discrepancy is small but systematic, and can cause validation failures that are false positives.
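The 100-sample case above can be reproduced with a short script (the seeded lognormal data are arbitrary and only illustrative; the indexing is what matters):

```python
import numpy as np

# Reproduce the 100-sample / 95% confidence case.
rng = np.random.default_rng(42)
losses = np.sort(rng.lognormal(mean=10.0, sigma=1.0, size=100))

confidence = 0.95

# Reference convention: ceil-based order statistic, no interpolation.
index = int(np.ceil(confidence * len(losses))) - 1
assert index == 94
var_reference = float(losses[index])

# Production convention: linear interpolation between adjacent
# order statistics (losses[94] and losses[95] for n = 100).
var_production = float(np.percentile(losses, confidence * 100))
assert losses[94] < var_production < losses[95]
assert var_production != var_reference  # small, systematic discrepancy
```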
Expected Behavior
The reference implementation should use the same VaR definition as the production code, or the validation tolerance should explicitly account for the methodological difference.
```python
@staticmethod
def calculate_var_precise(losses: np.ndarray, confidence: float) -> float:
    # Match production code's np.percentile with linear interpolation
    sorted_losses = np.sort(losses.astype(np.float64))
    return float(np.percentile(sorted_losses, confidence * 100))
```
Alternative Solutions
- Match the production code's np.percentile convention in the reference (recommended)
- Use higher-precision interpolation (e.g., Harrell-Davis quantile estimator) as the true reference
- Document the difference and adjust validation tolerances accordingly
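As a side note, both conventions are selectable through np.percentile's method parameter (a sketch, assuming NumPy >= 1.22, which added that argument): "linear" is the production default (Hyndman & Fan Type 7), while "inverted_cdf" (Type 1) returns exactly the ceil-based order statistic used by the current reference.

```python
import numpy as np

# Arbitrary seeded data; n = 101 matches the second worked case above.
rng = np.random.default_rng(0)
losses = np.sort(rng.exponential(scale=1_000.0, size=101))
confidence = 0.95

# Production default: linear interpolation (Hyndman & Fan Type 7).
linear = float(np.percentile(losses, confidence * 100))

# Type 1 quantile: equals sorted_losses[ceil(confidence * n) - 1].
ceil_style = float(np.percentile(losses, confidence * 100, method="inverted_cdf"))

index = int(np.ceil(confidence * len(losses))) - 1  # 95 for n = 101
assert ceil_style == losses[index]
```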
Recommended Approach
Option 1: align the reference with the production convention. The purpose of the reference is to validate the production code, so they must agree on the mathematical definition.
Acceptance Criteria
- Reference VaR and production VaR agree to machine precision for the same data
- Validation passes for correctly implemented production VaR
- Unit test with various sample sizes (10, 100, 1000) and confidence levels (0.9, 0.95, 0.99)
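The third criterion could be covered by a parametrized check along these lines (a sketch: calculate_var_precise below is the proposed fixed reference, and production_var is a stand-in for the var method that lives on a class in risk_metrics.py):

```python
import numpy as np

def calculate_var_precise(losses: np.ndarray, confidence: float) -> float:
    # Proposed fix: same convention as production np.percentile.
    sorted_losses = np.sort(np.asarray(losses, dtype=np.float64))
    return float(np.percentile(sorted_losses, confidence * 100))

def production_var(losses: np.ndarray, confidence: float = 0.95) -> float:
    # Stand-in for the production implementation in risk_metrics.py.
    return float(np.percentile(losses, confidence * 100))

rng = np.random.default_rng(123)
for n in (10, 100, 1000):
    losses = rng.lognormal(mean=10.0, sigma=1.5, size=n)
    for confidence in (0.9, 0.95, 0.99):
        ref = calculate_var_precise(losses, confidence)
        prod = production_var(losses, confidence)
        # Acceptance criterion 1: agreement to machine precision.
        assert ref == prod, (n, confidence)
```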
References
- Hyndman, R.J. & Fan, Y. (1996). 'Sample quantiles in statistical packages.' American Statistician, 50(4), 361-365. (Defines 9 quantile estimation methods; NumPy uses Type 7 by default.)
- NumPy documentation: np.percentile uses linear interpolation (method "linear", Hyndman-Fan Type 7) by default.