fix: business_optimizer._simulate_roe silently caps n_simulations at 100 regardless of caller's request #1509

@AlexFiliakov

Description

Problem Statement

business_optimizer.py _simulate_roe (line 995) uses min(n_simulations, 100) in the simulation loop, silently ignoring any n_simulations value greater than 100. This means all optimization decisions are based on at most 100 Monte Carlo samples, which produces high-variance ROE estimates that can mislead the optimizer.

Note: #1093 documents this as a performance issue. This issue documents the mathematical correctness impact: 100 samples is insufficient for reliable ROE estimation when the optimizer makes retention/limit decisions based on small differences.

v1.0 Impact

High. The optimizer's objective function (ROE) has high variance due to the small sample size. With N=100, the standard error of the mean ROE is approximately std_ROE / 10. For typical ROE noise levels (std ~ 5-15%), the SE is 0.5-1.5 percentage points, which is larger than the ROE differences between candidate insurance strategies. The optimizer is therefore ranking strategies on estimates dominated by sampling noise, i.e. making effectively random decisions.
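The standard-error claim above can be checked directly. This is an illustrative calculation only, assuming 10% ROE noise (the middle of the 5-15% range stated above):

```python
import numpy as np

# SE of the sample-mean ROE: std / sqrt(N), assuming std_ROE = 10%.
roe_std = 0.10
for n in (100, 10_000):
    se = roe_std / np.sqrt(n)
    print(f"N={n:>6}: SE of mean ROE = {se:.4f}")
```

At N=100 the SE is a full percentage point; raising N to 10,000 tightens it tenfold, to 0.1 percentage points, which is small enough to resolve typical strategy differences.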

Affected Code

ergodic_insurance/business_optimizer.py, line 995:

for _ in range(min(n_simulations, 100)):  # Limit for performance
    # ... ROE simulation

The method signature accepts n_simulations: int = 100:

def _simulate_roe(
    self,
    coverage_limit: float,
    deductible: float,
    premium_rate: float,
    time_horizon: int,
    n_simulations: int = 100,  # Default 100, but cap prevents larger values
) -> float:

Current Behavior

_simulate_roe(n_simulations=10000) still runs only 100 simulations. The caller gets a high-variance estimate without any warning that their requested sample size was ignored.

Expected Behavior

Either:

  1. Remove the cap and run the requested number of simulations, OR
  2. If a cap is needed for performance, issue a warning when the requested count is reduced, OR
  3. Use a vectorized (non-loop) implementation that makes N=10000 affordable

Alternative Solutions

  1. Remove the min(n_simulations, 100) cap (simple, may be slow for large N)
  2. Vectorize the simulation loop using NumPy operations (removes the need for a cap)
  3. Use common random numbers (CRN) to reduce variance at the same sample size
  4. Keep the cap but warn the user via warnings.warn()

Recommended Approach

Option 2: vectorize the simulation. The loop body is simple arithmetic on scalars; converting to vectorized NumPy operations would allow N=10000 with negligible performance impact. This eliminates both the correctness issue and the performance concern.

Acceptance Criteria

  • _simulate_roe(n_simulations=N) actually runs N simulations (or warns if capped)
  • Optimizer convergence improves with larger N
  • Unit test verifying that n_simulations parameter is respected
