Extreme Quantile Regression Neural Networks for insurance pricing.
Your EVT model gives you the 1-in-200 claim for the portfolio. EQRN gives you the 1-in-200 claim for the Kensington flat vs the Somerset farmhouse. That difference is your reinsurance margin.
The standard approach to extreme severity modelling — fit a GPD to all claims above a threshold, read off the 99.5th percentile — pools everything together. It gives you one shape parameter and one scale parameter for the whole book. If your TPBI claims have a heavier tail for younger injured parties and lighter for older ones, the pooled model averages those tails away. Your per-segment VaR is wrong and your XL pricing is wrong.
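To make the pooling problem concrete, here is a toy simulation (illustrative only, not part of this library): two segments share a scale parameter but differ in tail shape, and a single pooled GPD fit compromises between them.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(0)
# Two segments, same scale, different tail shapes
heavy = genpareto.rvs(c=0.4, scale=50_000, size=2_000, random_state=rng)  # xi = 0.4
light = genpareto.rvs(c=0.1, scale=50_000, size=2_000, random_state=rng)  # xi = 0.1

# Pooled fit: one (xi, sigma) for the whole book
xi_pool, _, sigma_pool = genpareto.fit(np.concatenate([heavy, light]), floc=0)

q_pool = genpareto.ppf(0.995, c=xi_pool, scale=sigma_pool)  # pooled 1-in-200
q_heavy = genpareto.ppf(0.995, c=0.4, scale=50_000)         # true heavy-segment 1-in-200
```

The pooled 99.5th percentile typically sits well below the true heavy-segment value, which is exactly the error that propagates into per-segment VaR and XL pricing.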
The solution is covariate-dependent GPD parameters: xi(x) and sigma(x) as functions of risk characteristics, not pooled scalars. This is what EQRN does.
EQRN (Pasche & Engelke 2024, Annals of Applied Statistics) is the first method to estimate covariate-dependent GPD parameters using a neural network. This library is the first Python implementation.
- `EQRNModel` — two-step fitting: LightGBM intermediate quantile + GPD neural network
- `EQRNDiagnostics` — QQ plot, threshold stability, calibration, xi scatter
- Out-of-fold intermediate quantile estimation (prevents leakage into the GPD step)
- Orthogonal GPD reparameterisation for stable gradient training
- `predict_quantile` — conditional VaR at any extreme level (0.99, 0.995, ...)
- `predict_tvar` — conditional TVaR / expected shortfall
- `predict_exceedance_prob` — P(claim > threshold | risk profile)
- `predict_xl_layer` — expected loss in a per-risk XL layer (attachment, limit)
```bash
pip install insurance-eqrn
```

PyTorch is required. For a CPU-only install:

```bash
pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install insurance-eqrn
```

```python
import numpy as np
from insurance_eqrn import EQRNModel, EQRNDiagnostics

# X: covariate matrix (e.g. risk characteristics)
# y: claim severity values (above basic threshold)
model = EQRNModel(
    tau_0=0.85,          # intermediate quantile level
    hidden_sizes=(32, 16, 8),
    n_epochs=300,
    shape_fixed=False,   # covariate-dependent xi
    seed=42,
)
model.fit(X_train, y_train, X_val=X_val, y_val=y_val)

# Per-segment 99.5th percentile severity
var_995 = model.predict_quantile(X_test, q=0.995)

# TVaR for reinsurance pricing
tvar_99 = model.predict_tvar(X_test, q=0.99)

# XL layer: £500k xs £500k
xl_loss = model.predict_xl_layer(X_test, attachment=500_000, limit=500_000)

# Fitted GPD parameters per observation
params = model.predict_params(X_test)
# DataFrame with columns: xi, sigma, nu, threshold
```

### Step 1: Intermediate quantile (LightGBM, out-of-fold)
Fits a quantile regression at level tau_0 (default 0.8) using K-fold cross-validation. Out-of-fold predictions are mandatory here. If you use in-sample predictions, the GPD network in Step 2 sees artificially clean thresholds and learns the wrong exceedance set.
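A minimal sketch of the out-of-fold mechanics (illustrative, not the library's code; it uses scikit-learn's `GradientBoostingRegressor` with a quantile loss as a stand-in for LightGBM):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = np.exp(1.0 + 0.5 * X[:, 0] + rng.gumbel(size=500))  # skewed severities

# Each observation's threshold comes from folds that never saw it,
# so the exceedance set passed to the GPD step is leakage-free.
qreg = GradientBoostingRegressor(loss="quantile", alpha=0.8, random_state=0)
thresholds = cross_val_predict(qreg, X, y, cv=5)

exceedances = y > thresholds  # the GPD network trains on y - thresholds here
```

By construction roughly 20% of observations exceed their out-of-fold threshold at `alpha=0.8`.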
### Step 2: GPD neural network on exceedances
Identifies observations above their predicted threshold (~20% of training data at tau_0=0.8). Trains a feedforward network mapping (X, Q_hat(tau_0)) → (nu(x), xi(x)) using the orthogonal GPD deviance loss.
The orthogonal parameterisation (nu = sigma * (xi + 1)) makes the Fisher information matrix diagonal, which stabilises Adam training substantially compared to the direct (sigma, xi) parameterisation.
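The loss being minimised is the GPD deviance (negative log-likelihood) written in the orthogonal parameters. A numpy sketch for xi > 0 (illustrative; `gpd_deviance` is not a library function):

```python
import numpy as np

def gpd_deviance(z, nu, xi):
    """Negative GPD log-likelihood of exceedances z > 0 in the
    orthogonal parameterisation nu = sigma * (1 + xi). Valid for xi > 0."""
    sigma = nu / (1.0 + xi)
    return np.sum(np.log(sigma) + (1.0 + 1.0 / xi) * np.log1p(xi * z / sigma))

# Simulated exceedances from GPD(sigma=100, xi=0.3) via inverse transform
rng = np.random.default_rng(0)
z = 100.0 / 0.3 * (rng.uniform(size=5_000) ** -0.3 - 1.0)

nll_true = gpd_deviance(z, nu=130.0, xi=0.3)   # truth: nu = 100 * (1 + 0.3)
nll_wrong = gpd_deviance(z, nu=130.0, xi=0.8)  # mis-specified shape
```

The deviance at the true parameters is lower than at a mis-specified shape, which is what gradient descent on this loss exploits.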
### Prediction
For a new observation x at target level tau > tau_0:
```
Q_x(tau) = Q_hat_x(tau_0) + sigma(x)/xi(x) * [((1-tau_0)/(1-tau))^xi(x) - 1]
```
At xi ≈ 0 (exponential limit), this is Q_hat + sigma * log((1-tau_0)/(1-tau)).
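The same extrapolation formula, including the exponential-limit branch, in a few lines of numpy (illustrative; `extrapolate_quantile` is not a library function):

```python
import numpy as np

def extrapolate_quantile(q_tau0, sigma, xi, tau, tau_0=0.8, eps=1e-6):
    """GPD quantile extrapolation from the intermediate level tau_0
    to a more extreme level tau > tau_0."""
    ratio = (1.0 - tau_0) / (1.0 - tau)
    exp_branch = q_tau0 + sigma * np.log(ratio)  # xi ≈ 0 (exponential) limit
    safe_xi = np.where(np.abs(xi) < eps, 1.0, xi)  # avoid division by zero
    gpd_branch = q_tau0 + sigma / safe_xi * (ratio ** xi - 1.0)
    return np.where(np.abs(xi) < eps, exp_branch, gpd_branch)

# One risk: intermediate 80% quantile of 100k, sigma = 40k, xi = 0.3
q995 = extrapolate_quantile(100_000.0, 40_000.0, 0.3, tau=0.995)
```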
| Parameter | Default | Description |
|---|---|---|
| tau_0 | 0.8 | Intermediate quantile level. Increase for smaller datasets |
| hidden_sizes | (32, 16, 8) | Network hidden layer widths |
| n_epochs | 500 | Maximum training epochs |
| patience | 50 | Early stopping patience |
| shape_fixed | False | If True, xi is a scalar. Start here before fitting full model |
| l2_pen | 1e-4 | L2 weight decay |
| shape_penalty | 0 | Penalty on variance of xi(x) — smooths the shape surface |
| p_drop | 0 | Dropout probability. Try 0.1–0.2 for small datasets |
| n_folds | 5 | K-fold folds for OOF intermediate quantile |
| seed | None | Random seed |
```python
from insurance_eqrn import EQRNDiagnostics

diag = EQRNDiagnostics(model)

# GPD QQ plot — should track the diagonal if the tail model is correct
diag.qq_plot(X_test, y_test)

# Predicted vs empirical coverage at each quantile level
diag.calibration_plot(X_test, y_test, levels=[0.9, 0.95, 0.99, 0.995])

# Mean residual life plot — linearity onset shows where the GPD approximation holds
diag.mean_residual_life_plot(y_train)

# Threshold stability — fit shape_fixed models at each tau_0, look for a plateau
diag.threshold_stability_plot(X_train, y_train)

# Summary table: predicted vs empirical exceedance rates
diag.summary_table(X_test, y_test)
```

### Motor TPBI (Third-Party Bodily Injury)
Young injured parties have longer annuity streams and heavier tails. EQRN lets you model xi(x) as a function of injured party age, claim type, solicitor involvement. Output: P(claim > £500k | risk profile) per policy.
### Property large loss
Commercial property fire severity varies by construction class, sum insured, sprinkler status. EQRN provides 1-in-200 loss conditional on risk characteristics — input to CAT reinsurance models.
### Per-risk XL pricing

```python
# Price a layer: £1M xs £500k, conditional on risk
xl = model.predict_xl_layer(X_test, attachment=500_000, limit=1_000_000)
```

### Solvency II SCR
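A layer expected loss of this kind can be written as the integral of the conditional survival function over the layer. A numerical sketch under an assumed GPD tail (illustrative; these function names are not the library's API):

```python
import numpy as np

def gpd_survival(y, u, p_u, sigma, xi):
    """P(Y > y) for y >= u, given P(Y > u) = p_u and GPD(sigma, xi) excesses."""
    return p_u * (1.0 + xi * (y - u) / sigma) ** (-1.0 / xi)

def layer_expected_loss(attachment, limit, u, p_u, sigma, xi, n=10_000):
    """E[min(max(Y - attachment, 0), limit)]: integral of the survival
    function over [attachment, attachment + limit], trapezoidal rule."""
    ys = np.linspace(attachment, attachment + limit, n)
    s = gpd_survival(ys, u, p_u, sigma, xi)
    dy = ys[1] - ys[0]
    return (s.sum() - 0.5 * (s[0] + s[-1])) * dy

# £500k xs £500k; tail fitted above u = £100k with P(Y > u) = 0.05
el = layer_expected_loss(500_000, 500_000, u=100_000, p_u=0.05,
                         sigma=80_000, xi=0.25)
```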
EQRN provides per-segment 99.5th percentile severity, which is the correct input for simulation-based SCR calculations on heterogeneous portfolios. Segment-level conditional VaR is more conservative than pooled EVT for high-risk segments and more accurate for low-risk segments.
- Frequency modelling: EQRN models severity above a threshold; frequency is a separate model.
- Attritional claims: claims below the `tau_0` threshold are not modelled by EQRN.
- Small books (`n_exceedances` < 200): at minimum set `shape_fixed=True`. Below ~100 exceedances, fall back to marginal EVT.
- No covariates: use `insurance-evt` directly.
No formal benchmark against a fixed public dataset yet. The relevant comparison is with marginal EVT (a single pooled GPD) on the same data. Pasche & Engelke (2024) show that EQRN produces better-calibrated extreme quantiles than marginal EVT when covariate effects on the tail are present (e.g., younger injured parties have heavier tails in TPBI). On simulated data with a known covariate-dependent shape parameter xi(x), EQRN with `shape_fixed=False` recovers the true xi(x) surface; a pooled GPD produces a single xi that averages across the variation.

The practical question is always whether your book has enough heterogeneity in tail shape to justify the extra complexity. Use `diag.threshold_stability_plot()` and compare calibration plots for `shape_fixed=True` vs `shape_fixed=False` — if the covariate-dependent model doesn't improve calibration, use the simpler marginal EVT approach. Below 200 tail observations, the covariate-dependent model will overfit regardless of regularisation.
Pasche, O.C. & Engelke, S. (2024). "Neural networks for extreme quantile regression with an application to forecasting of flood risk." Annals of Applied Statistics, 18(4), 2818–2839. DOI:10.1214/24-AOAS1907.
R reference implementation: opasche/EQRN (CRAN, March 2025).