🚀 Comparison of Old vs. New CM & PGM Pipeline Outputs
Description:
This issue tracks the comparison of old vs. new CM (Country-Month) and PGM (PRIO-Grid-Month) forecasts to identify discrepancies, misalignments, or significant changes. The comparison should be done month by month across the two pipelines, and researchers should note what the results imply for future output drift detection.
Tasks & Milestones:
- Ensure all participants can reliably download the latest predictions via API
- Check in with Jim before the comparison, ensuring reconciled PGM forecasts are properly accounted for
- Meeting scheduled for Feb 13 with Angelica, Borbála, Sonja, Simon if possible
- Compare CM & PGM forecasts from the old pipeline to those from the new pipeline on a month-by-month basis
- Start with the ensemble models. If the old and new ensembles are very similar, further analysis may not be needed; if they differ noticeably, repeat the comparison for each individual model
- Use appropriate tests to quantify differences (see suggestions below)
- Document key differences and potential reasons for variations
- Reflect on what worked well for future output drift detection development
- Summarize findings and create a follow-up plan for necessary adjustments if needed
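As a starting point for the month-by-month comparison, here is a minimal, dependency-free sketch. It assumes each pipeline's forecasts have already been downloaded and reshaped into a dict keyed by `(month_id, unit_id)`, where `unit_id` is a country or grid-cell identifier. Both the layout and the function names are hypothetical and not part of either pipeline.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns None when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def compare_by_month(old, new):
    """Compare two forecast sets month by month.

    old, new: dict {(month_id, unit_id): forecast_value} (assumed layout).
    Returns {month_id: {"n_matched", "n_unmatched", "mse", "mae", "pearson"}}.
    """
    pairs = defaultdict(list)     # month -> [(old_value, new_value), ...]
    unmatched = defaultdict(int)  # month -> keys present in only one pipeline
    for key in old.keys() | new.keys():
        month = key[0]
        if key in old and key in new:
            pairs[month].append((old[key], new[key]))
        else:
            unmatched[month] += 1

    report = {}
    for month in sorted(set(pairs) | set(unmatched)):
        xs = [p[0] for p in pairs[month]]
        ys = [p[1] for p in pairs[month]]
        n = len(xs)
        report[month] = {
            "n_matched": n,
            "n_unmatched": unmatched[month],
            "mse": sum((x - y) ** 2 for x, y in zip(xs, ys)) / n if n else None,
            "mae": sum(abs(x - y) for x, y in zip(xs, ys)) / n if n else None,
            "pearson": pearson(xs, ys) if n > 1 else None,
        }
    return report
```

Reporting `n_unmatched` alongside the error metrics is deliberate: a unit or month that exists in only one pipeline is a misalignment in its own right and would otherwise silently drop out of the averages.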
Suggested Comparison Methods:
The choice of method is left to researcher expertise and creativity, but the following are suggested:
- Statistical Tests for Similarity & Divergence:
  - Correlation tests: Pearson, Spearman, Kendall
  - Divergence measures: Jensen-Shannon divergence, Kullback–Leibler divergence, Wasserstein distance
  - Cosine similarity
  - Error metrics: MSE, MAE, MSLE
- Qualitative Sanity Checks:
  - Mapping forecast outputs
  - Mapping test outputs
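To make two of the suggested divergence measures concrete, here is a small standard-library sketch of the Jensen-Shannon divergence (on histograms with shared bins) and the 1-Wasserstein distance (between equal-size samples). The function names and the default bin count are illustrative assumptions. In practice, SciPy's `scipy.stats.wasserstein_distance` and `scipy.spatial.distance.jensenshannon` are likely more convenient; note that the latter returns the square root of the divergence.

```python
from math import log2


def shared_histograms(xs, ys, bins=10):
    """Normalised histograms of xs and ys over the same equal-width bins."""
    lo = min(min(xs), min(ys))
    hi = max(max(xs), max(ys))
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def hist(values):
        counts = [0] * bins
        for v in values:
            # Clamp so the maximum value falls into the last bin.
            counts[min(int((v - lo) / width), bins - 1)] += 1
        return [c / len(values) for c in counts]

    return hist(xs), hist(ys)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 for identical, 1 for disjoint."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a == 0 contribute nothing; b > 0 whenever a > 0.
        return sum(x * log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size 1-D samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch only handles equal-size samples")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)
```

The Jensen-Shannon divergence is symmetric and bounded, which makes it easier to compare across months than the raw Kullback–Leibler divergence. The Wasserstein distance stays in the units of the forecast itself, so a value of 0.1 reads directly as an average shift of 0.1 in the forecast scale.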
Deliverables:
- A small documented summary of discrepancies & changes
- Recommendations for handling potential misalignments
- Considerations for future output drift detection development
Deadline:
📅 Feb 24