Hi
I am a new user of PCNtoolkit and am currently fitting normative models with BLR + B-spline + WarpSinArcsinh using default parameters. I observed that the numerical scale of the response variables strongly affects MSLL but has almost no effect on the other evaluation metrics.
The data are diffusion metrics (FA, MD, AD, RD), each fitted separately. Their typical orders of magnitude are:
FA: ~10⁻¹
AD: ~10⁻³
MD and RD: ~10⁻⁴
When fitting the models using the original data, the SMSE was slightly below 1, but MSLL was very large. Specifically:
For FA, MSLL was around 1–3, depending on the standardization method (scaler="minmax" or "standardize").
For AD, MD, and RD, MSLL ranged from about 7 to above 9, also depending on the chosen scaling method.
When I linearly scaled all IDPs to roughly 10⁰ (e.g., 0.00017 → 1.7) and refitted the models:
MSLL dropped significantly to around −1 to −2 (for minmax)
RMSE increased proportionally
Other metrics remained nearly unchanged
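As a reference point for the pattern above, here is a minimal sketch of how these metrics respond to a linear rescaling when the predictive distribution is a plain Gaussian. It is not PCNtoolkit code and does not reproduce the BLR + B-spline + WarpSinArcsinh pipeline: the synthetic data, the straight-line fit, and the helper names `gaussian_nll`, `fit_and_score`, and `c` are all assumptions made for illustration. MSLL is taken here as the mean difference between the model's negative log-likelihood and that of a trivial Gaussian built from the training mean and variance.

```python
import numpy as np

def gaussian_nll(y, mean, var):
    """Per-sample negative log-likelihood of y under N(mean, var)."""
    return 0.5 * np.log(2 * np.pi * var) + 0.5 * (y - mean) ** 2 / var

def fit_and_score(x_train, y_train, x_test, y_test):
    """Fit a straight line (a stand-in for the real model) and score it."""
    coeffs = np.polyfit(x_train, y_train, deg=1)
    # Use the training residual variance as a constant predictive variance.
    pred_var = np.var(y_train - np.polyval(coeffs, x_train))
    pred_mean = np.polyval(coeffs, x_test)

    mse = np.mean((y_test - pred_mean) ** 2)
    nll_model = gaussian_nll(y_test, pred_mean, pred_var)
    # Trivial baseline: Gaussian with the training mean and variance.
    nll_trivial = gaussian_nll(y_test, y_train.mean(), y_train.var())
    return {
        "rmse": np.sqrt(mse),
        "smse": mse / np.var(y_test),
        "nll": np.mean(nll_model),
        "msll": np.mean(nll_model - nll_trivial),
    }

# Synthetic response on the order of 1e-4, loosely mimicking MD/RD magnitudes.
rng = np.random.default_rng(0)
x_train = rng.uniform(0, 1, 200)
x_test = rng.uniform(0, 1, 100)
y_train = 1.7e-4 + 5e-5 * x_train + rng.normal(0, 1e-5, x_train.size)
y_test = 1.7e-4 + 5e-5 * x_test + rng.normal(0, 1e-5, x_test.size)

c = 1e4  # linear rescaling factor, e.g. 0.00017 -> 1.7
orig = fit_and_score(x_train, y_train, x_test, y_test)
scaled = fit_and_score(x_train, c * y_train, x_test, c * y_test)

for name in orig:
    print(f"{name:>5}: original={orig[name]: .6g}  scaled={scaled[name]: .6g}")
```

In this toy setting RMSE grows by the factor `c`, SMSE is unchanged, and the raw negative log-likelihood shifts by `log(c)`, but that shift appears in both the model term and the baseline term, so it cancels in MSLL. If that reasoning carries over, a large change in MSLL after rescaling would suggest either that the two terms are not evaluated in the same space or that the fitted predictive distribution itself differs between the two runs.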
Based on these observations, I would like to ask:
Is this scale-sensitivity of MSLL expected behavior or a potential issue?
Can the models fitted with the original and scaled data be considered the same model?
Are the results obtained from the scaled data still valid and reasonable?
Environment: Python 3.11.14, PCNtoolkit v1.1.2, WSL Ubuntu 20.04
Thank you very much for your time and assistance. I look forward to your guidance.
Best regards
Below is a comparison of the evaluation metrics between scaled and unscaled data (left: scaled, right: unscaled, minmax).
