Description
Problem Statement
In business_optimizer.py, _simulate_roe (line 995) uses min(n_simulations, 100) as the bound of the simulation loop, silently ignoring any n_simulations value greater than 100. All optimization decisions are therefore based on at most 100 Monte Carlo samples, which produces high-variance ROE estimates that can mislead the optimizer.
Note: #1093 documents this as a performance issue. This issue documents the mathematical correctness impact: 100 samples is insufficient for reliable ROE estimation when the optimizer makes retention/limit decisions based on small differences.
v1.0 Impact
High. The optimizer's objective function (ROE) is estimated with high variance because of the small sample size. With N=100, the standard error of the mean ROE is std_ROE / sqrt(100) = std_ROE / 10. For typical ROE noise levels (std ~ 5-15%), the SE is 0.5-1.5 percentage points, which can exceed the differences between candidate insurance strategies. When it does, the optimizer's ranking of those candidates is driven by sampling noise rather than by real differences in ROE.
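The arithmetic above can be checked with a few lines of Python. This is only an illustration of SE = sigma / sqrt(N); the helper name standard_error is hypothetical and is not part of business_optimizer.py.

```python
import math


def standard_error(std_roe: float, n_simulations: int) -> float:
    """Standard error of a Monte Carlo mean: sigma / sqrt(N)."""
    return std_roe / math.sqrt(n_simulations)


# Compare the capped sample size (100) with a larger one (10,000)
# for the ROE noise range quoted above (std of 5% and 15%).
for std_roe in (0.05, 0.15):
    for n in (100, 10_000):
        se = standard_error(std_roe, n)
        print(f"std={std_roe:.0%}  N={n:>6}  SE={se * 100:.2f} percentage points")
```

Going from N=100 to N=10,000 divides the standard error by 10, so the 0.5-1.5 point range shrinks to 0.05-0.15 points.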
Affected Code
ergodic_insurance/business_optimizer.py, line 995:
for _ in range(min(n_simulations, 100)):  # Limit for performance
    # ... ROE simulation
The method signature accepts n_simulations: int = 100:
def _simulate_roe(
    self,
    coverage_limit: float,
    deductible: float,
    premium_rate: float,
    time_horizon: int,
    n_simulations: int = 100,  # Default 100, but cap prevents larger values
) -> float:
Current Behavior
_simulate_roe(n_simulations=10000) still runs only 100 simulations. The caller gets a high-variance estimate without any warning that their requested sample size was ignored.
Expected Behavior
Either:
- Remove the cap and run the requested number of simulations, OR
- If a cap is needed for performance, issue a warning when the requested count is reduced, OR
- Use a vectorized (non-loop) implementation that makes N=10000 affordable
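If the cap is kept, the second option could look roughly like the sketch below. The names MAX_SIMULATIONS and effective_simulation_count are hypothetical, and the choice of RuntimeWarning is an assumption; the project may prefer a different warning category or its logging setup.

```python
import warnings

MAX_SIMULATIONS = 100  # hypothetical constant standing in for the hard-coded cap


def effective_simulation_count(n_simulations: int, cap: int = MAX_SIMULATIONS) -> int:
    """Return the number of simulations that will actually run.

    Emits a warning instead of silently reducing the requested count.
    """
    if n_simulations > cap:
        warnings.warn(
            f"n_simulations={n_simulations} exceeds the cap of {cap}; "
            f"running {cap} simulations instead.",
            RuntimeWarning,
            stacklevel=2,  # point the warning at the caller, not this helper
        )
        return cap
    return n_simulations
```

The loop would then iterate over range(effective_simulation_count(n_simulations)), so a caller asking for 10000 samples at least learns that only 100 were used.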
Alternative Solutions
- Remove the min(n_simulations, 100) cap (simple, may be slow for large N)
- Vectorize the simulation loop using NumPy operations (removes the need for a cap)
- Use common random numbers (CRN) to reduce variance at the same sample size
- Keep the cap but warn the user via warnings.warn()
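To make the CRN alternative concrete, here is a minimal sketch that compares two strategies on the same random loss draws. Everything in it is an assumption for illustration: the toy ROE formula, the exponential loss model, and the EQUITY and OPERATING_INCOME values are hypothetical and do not reflect the actual loop body of _simulate_roe.

```python
import random

EQUITY = 10.0            # hypothetical equity
OPERATING_INCOME = 1.5   # hypothetical pre-loss income


def toy_roe(loss: float, deductible: float, premium: float) -> float:
    """Toy ROE: the company retains losses up to the deductible."""
    retained = min(loss, deductible)
    return (OPERATING_INCOME - premium - retained) / EQUITY


def roe_difference(strategy_a, strategy_b, n_simulations, seed,
                   common_random_numbers=True):
    """Estimate mean ROE(a) - ROE(b); each strategy is (deductible, premium).

    With CRN both strategies see the same loss draws, so most of the
    sampling noise cancels in the difference.
    """
    rng_a = random.Random(seed)
    rng_b = random.Random(seed if common_random_numbers else seed + 1_000_003)
    total = 0.0
    for _ in range(n_simulations):
        loss_a = rng_a.expovariate(1.0)
        loss_b = rng_b.expovariate(1.0)
        total += toy_roe(loss_a, *strategy_a) - toy_roe(loss_b, *strategy_b)
    return total / n_simulations
```

In this toy setting, repeating roe_difference over many seeds gives a much tighter spread with common_random_numbers=True than with independent streams, which is the effect the optimizer needs when candidates differ only slightly.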
Recommended Approach
Option 2: vectorize the simulation. The loop body is simple arithmetic on scalars; converting to vectorized NumPy operations would allow N=10000 with negligible performance impact. This eliminates both the correctness issue and the performance concern.
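A rough sketch of what the vectorized form might look like follows. The real loop body is not shown in this issue, so the model here (a base ROE plus additive Gaussian noise) and the names simulate_roe_vectorized, base_roe, and noise_std are assumptions. The point is only that all N draws come from one NumPy call instead of a Python loop.

```python
import numpy as np


def simulate_roe_vectorized(base_roe, noise_std, n_simulations=100, seed=None):
    """Return (mean ROE, standard error) from n_simulations draws.

    Hypothetical model: ROE = base_roe + Gaussian noise. No cap is applied;
    the requested sample size is used as-is.
    """
    rng = np.random.default_rng(seed)
    # One array of N noise draws replaces N loop iterations.
    noise = rng.normal(loc=0.0, scale=noise_std, size=n_simulations)
    roe_samples = base_roe + noise
    mean_roe = float(roe_samples.mean())
    std_err = float(roe_samples.std(ddof=1) / np.sqrt(n_simulations))
    return mean_roe, std_err


mean_roe, std_err = simulate_roe_vectorized(0.12, 0.10, n_simulations=10_000, seed=0)
print(f"mean ROE = {mean_roe:.4f}, SE = {std_err:.4f}")
```

Returning the standard error alongside the mean is an optional design choice; it lets the optimizer judge whether a difference between two candidates is larger than the sampling noise.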
Acceptance Criteria
- _simulate_roe(n_simulations=N) actually runs N simulations (or warns if capped)
- Optimizer convergence improves with larger N
- Unit test verifying that n_simulations parameter is respected
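One possible shape for the unit test is sketched below. ToyOptimizer and CountingSampler are hypothetical stand-ins; the real test would need to count draws inside the actual optimizer class, for example by patching its random source, and that mechanism is not specified in this issue.

```python
import unittest


class CountingSampler:
    """Callable that returns a fixed ROE and counts how often it is called."""

    def __init__(self):
        self.calls = 0

    def __call__(self):
        self.calls += 1
        return 0.1


class ToyOptimizer:
    """Hypothetical stand-in for the real optimizer class."""

    def __init__(self, draw_roe):
        self._draw_roe = draw_roe

    def _simulate_roe(self, n_simulations=100):
        # No cap: one draw per requested simulation.
        samples = [self._draw_roe() for _ in range(n_simulations)]
        return sum(samples) / len(samples)


class TestNSimulationsRespected(unittest.TestCase):
    def test_requested_count_is_used(self):
        for n in (50, 100, 10_000):
            sampler = CountingSampler()
            ToyOptimizer(sampler)._simulate_roe(n_simulations=n)
            self.assertEqual(sampler.calls, n)
```

Against the current code, the equivalent test with n=10_000 would fail because only 100 draws are made.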
References
- Monte Carlo standard error: SE = sigma / sqrt(N). For N=100, SE = sigma/10.
- For optimizer convergence with noisy objectives, see Spall (2003), Introduction to Stochastic Search and Optimization, Chapter 14.
- Related: #1093, "perf: business_optimizer._simulate_roe Python for-loop over 100 simulations" (performance concern about Python loop); #1240, "fix: business_optimizer._simulate_roe applies multiplicative Gaussian noise incorrectly producing negative ROE from noise alone" (multiplicative noise model)