|
1 | 1 | # Changelog |
2 | 2 |
|
3 | | -## Version 0.1.6a0 (2024-12-11) |
| 3 | +## Version 0.1.6a3 (2025-12-14) |
4 | 4 |
|
5 | | -**Alpha Release: CAP Curves, Styled Display & Enhanced Metrics** 📊 |
6 | | - |
7 | | -This alpha release introduces powerful visualization and display capabilities for model performance analysis and WOE interpretation. |
| 5 | +**Alpha Release: CAP Curves, Styled Display, MSD Feature Selection & Enhanced Metrics** 📊 |
8 | 6 |
|
9 | 7 | ### ✨ New Features |
10 | | - |
11 | | -#### 1. CAP Curve Visualization (`plot_performance`) |
12 | | -- **Unified CAP Curves**: Single `plot_performance()` function for both binary (PD) and continuous (LGD) targets |
13 | | -- **Multiple Model Support**: Plot and compare multiple models on the same chart |
14 | | - - Pass list of predictions: `y_pred=[model1, model2, model3]` |
15 | | - - Custom labels and colors: `labels=['Model A', 'Model B']`, `colors=['#69db7c', '#55d3ed']` |
16 | | -- **Flexible Layout**: Accept external `matplotlib.Axes` for custom subplot grids |
17 | | -- **Crystal Ball Line**: Perfect ranking baseline (blue) with "Crystal Ball" legend |
18 | | -- **Professional Styling**: Arial font, consistent linewidths, dotted grid, clean aesthetics |
19 | | -- **Weighted Gini Support**: Pass `weights` parameter for EAD-weighted calculations (LGD models) |
20 | | -- **Returns**: `(fig, ax, gini)` tuple where `gini` is single value or list for multiple models |
21 | | - |
22 | | -#### 2. Weighted Somers' D (`fast_somersd`) |
23 | | -- **Numba-Optimized Weighted Implementation**: New `_somers_yx_weighted()` function |
24 | | - - O(n²) weighted concordant/discordant pair calculation |
25 | | - - Fully integrated into `somersd_yx()` function |
26 | | - - No external dependencies (removed sklearn.roc_auc_score fallback) |
27 | | -- **Unified API**: `somersd_yx(y, x, weights=None)` handles both weighted and unweighted cases |
28 | | -- **Regulatory Compliance**: Supports EAD-weighted Gini for Basel/regulatory LGD models |
29 | | -- **Performance**: Numba JIT compilation for efficient weighted calculations |
30 | | - |
31 | | -#### 3. Rich HTML Display (`fastwoe.display`) |
32 | | -- **Styled DataFrames**: Beautiful HTML tables for Jupyter notebooks |
33 | | - - Clean baseline foundation design (light mode) |
34 | | - - Inter font family, subtle gradients, alternating rows |
35 | | - - Gradient highlighting for numeric columns (high/medium/low values) |
36 | | - - Significance badges for statistical tests |
37 | | -- **Decorator-Based Styling**: Clean, reusable code patterns |
38 | | - - `@iv_styled`: Automatic IV analysis styling |
39 | | - - `@styled(title, subtitle, highlight_cols, precision)`: Custom DataFrame styling |
40 | | - - Functions return styled output seamlessly |
41 | | -- **Pre-configured Functions**: |
42 | | - - `style_iv_analysis(df)`: IV analysis with feature importance highlighting |
43 | | - - `style_woe_mapping(df, feature_name)`: WOE transformations with category details |
44 | | - - `StyledDataFrame(df, ...)`: Direct wrapper for any DataFrame |
45 | | -- **Professional Design**: Based on baseline foundation design system |
46 | | - - Consistent light mode colors (#FCFCFC, #F0F0F0, #E8E8E8) |
47 | | - - 16px border radius for badges |
48 | | - - Smooth transitions with cubic-bezier(0.32, 0.72, 0, 1) |
49 | | - - Clean typography hierarchy |
50 | | - |
51 | | -#### 4. WOE Visualization (`visualize_woe`) |
52 | | -- **Dual Display Modes**: |
53 | | - - `mode="probability"`: Show default probability deltas |
54 | | - - `mode="log_odds"`: Show log-odds (WOE values) |
55 | | -- **Horizontal Bar Charts**: Clear visualization of WOE impact per category |
56 | | -- **Color Coding**: Positive (risk-increasing) vs negative (risk-decreasing) categories |
57 | | -- **Baseline Reference**: Shows prior probability/log-odds as reference point |
| 8 | +- **CAP Curve Visualization** (`plot_performance`): Unified function for binary (PD) and continuous (LGD) targets with multi-model support |
| 9 | +- **Weighted Somers' D**: Numba-optimized weighted implementation for EAD-weighted Gini calculations |
| 10 | +- **Rich HTML Display** (`fastwoe.display`): Styled DataFrames for Jupyter with decorator-based styling (`@iv_styled`, `@styled`) |
| 11 | +- **WOE Visualization** (`visualize_woe`): Horizontal bar charts showing WOE impact per category |
| 12 | +- **Marginal Somers' D Feature Selection** (`marginal_somersd_selection`): Residual-based forward selection using rank correlation; works with both binary and continuous targets |
| 13 | +- **Somers' D Shapley Values** (`somersd_shapley`): Shapley value decomposition for feature contribution analysis |
58 | 14 |
|
59 | 15 | ### 🔧 API Changes |
| 16 | +- New: `plot_performance()`, `visualize_woe()`, `StyledDataFrame()`, `style_iv_analysis()`, `style_woe_mapping()` |
| 17 | +- New: `marginal_somersd_selection()` in `fastwoe.screening` (module renamed from `fastwoe.modeling`) |
| 18 | +- New: `somersd_shapley()` for Shapley value decomposition |
| 19 | +- Enhanced: `somersd_yx(y, x, weights=None)` now supports weighted calculations |
| 20 | +- Changed: `somersd_clustered_matrix()` now binary-only (raises `ValueError` for non-binary labels) |
60 | 21 |
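To illustrate what the `weights` argument of `somersd_yx(y, x, weights=None)` is meant to capture, here is a minimal pure-Python sketch of a weighted Somers' D. It is not the library's Numba implementation: the function name `weighted_somers_d` is hypothetical, and weighting each pair by the product `w_i * w_j` is an assumption about the definition rather than something the changelog states.

```python
def weighted_somers_d(y, x, weights=None):
    """Hypothetical O(n^2) sketch of a weighted Somers' D of x with respect to y.

    Assumption: each pair (i, j) carries weight w_i * w_j, and pairs tied on y
    are excluded from both numerator and denominator.
    """
    n = len(y)
    w = [1.0] * n if weights is None else [float(v) for v in weights]

    numerator = 0.0    # weighted concordant minus weighted discordant pairs
    denominator = 0.0  # total weight of pairs that are not tied on y

    for i in range(n):
        for j in range(i + 1, n):
            if y[i] == y[j]:
                continue  # pair is tied on the target, so it is skipped
            pair_weight = w[i] * w[j]
            denominator += pair_weight
            direction = (x[i] - x[j]) * (y[i] - y[j])
            if direction > 0:
                numerator += pair_weight   # concordant pair
            elif direction < 0:
                numerator -= pair_weight   # discordant pair
            # pairs tied on x contribute nothing to the numerator

    return numerator / denominator if denominator > 0 else float("nan")


# Toy data: binary target, model scores, and EAD-like weights (all made up).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
ead = [1, 1, 1, 3]

unweighted = weighted_somers_d(y_true, scores)
weighted = weighted_somers_d(y_true, scores, weights=ead)
```

With a binary target and unit weights, this quantity coincides with the Gini coefficient, `2 * AUC - 1`, when ties in the score count as half. The weighted variant shifts influence toward pairs that involve large exposures.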
|
61 | | -#### New Functions |
62 | | -- `plot_performance(y_true, y_pred, weights=None, ax=None, labels=None, colors=None, figsize=(6,5), dpi=100, show_plot=True)` |
63 | | -- `visualize_woe(woe_encoder, feature_name, mode='probability', figsize=(10, None), color_positive='#F783AC', color_negative='#A4D8FF', show_plot=True)` |
64 | | -- `styled(title, subtitle, highlight_cols, precision)` - Decorator |
65 | | -- `iv_styled` - Decorator for IV analysis |
66 | | -- `style_iv_analysis(df)` - Function-based styling |
67 | | -- `style_woe_mapping(df, feature_name)` - Function-based styling |
68 | | -- `StyledDataFrame(df, title, subtitle, highlight_cols, precision)` - Direct wrapper |
69 | | - |
70 | | -#### Enhanced Functions |
71 | | -- `somersd_yx(y, x, weights=None)`: Now accepts optional `weights` parameter for weighted Somers' D calculation |
72 | | - |
73 | | -#### Exports |
74 | | -Updated `fastwoe/__init__.py` to export: |
75 | | -- `plot_performance`, `visualize_woe` from `metrics` |
76 | | -- `StyledDataFrame`, `style_iv_analysis`, `style_woe_mapping`, `styled`, `iv_styled` from `display` |
77 | | - |
78 | | -### 📊 Examples & Documentation |
79 | | - |
80 | | -#### New Notebooks |
81 | | -- **`examples/fastwoe_cap_curve.ipynb`**: Comprehensive CAP curve demonstrations |
82 | | - - Single model CAP curves (binary PD) |
83 | | - - Multiple model comparison with custom colors |
84 | | - - EAD-weighted Gini for LGD models |
85 | | - - Side-by-side unweighted vs weighted comparisons |
86 | | - - Continuous target (LGD) examples |
87 | | - |
88 | | -- **`examples/fastwoe_styled_display.ipynb`**: Rich HTML display demonstrations |
89 | | - - Decorator-based styling patterns |
90 | | - - IV analysis with `@iv_styled` |
91 | | - - Custom styled tables with `@styled` |
92 | | - - Feature importance rankings |
93 | | - - Model comparison tables |
94 | | - - Risk segmentation analysis |
95 | | - |
96 | | -#### New Documentation |
97 | | -- **`WEIGHTED_SOMERSD_SUMMARY.md`**: Mathematical foundation and implementation details for weighted Somers' D |
98 | | - |
99 | | -### 🐛 Bug Fixes |
100 | | -- **Gini Calculation**: Corrected relationship between Somers' D and Gini (removed incorrect `2 *` multiplier for binary targets) |
101 | | -- **Weighted Gini**: Removed sklearn.roc_auc_score dependency, implemented direct Numba-optimized weighted calculation |
102 | | -- **Plot Layout**: Fixed `plot_performance` to generate single plot (removed unwanted second subplot) |
103 | | -- **Return Values**: Corrected return signature to `(fig, ax, gini)` for single axis |
104 | | -- **Perfect Line**: Fixed continuous target perfect line to sort by true target values (not straight line to (1,1)) |
105 | | - |
106 | | -### 🎨 Styling & Design |
107 | | -- **Consistent CAP/Power Curve Styling**: |
108 | | - - Arial font family for all text |
109 | | - - Font size 12 for axis labels, 14 for titles, 10 for legend |
110 | | - - Specific ticks: `np.arange(0, 1.1, 0.1)` for both axes |
111 | | - - "Fraction of population" (x-axis), "Fraction of target" (y-axis) |
112 | | - - "Crystal Ball" legend for perfect line (dodgerblue) |
113 | | - - Black dotted random line |
114 | | - - Default colors: `["#69db7c", "#55d3ed", "#ffa94d", "#c430c1", "#ff6b6b", "#4dabf7"]` |
115 | | - - Default figsize: `(6, 5)` |
116 | | - |
117 | | -- **HTML Table Styling**: |
118 | | - - Light mode only (no dark mode mixing) |
119 | | - - Inter font family |
120 | | - - Subtle backgrounds and borders |
121 | | - - Gradient highlighting with `!important` for proper rendering |
122 | | - - 16px border radius for badges |
123 | | - - Smooth hover transitions |
124 | | - |
125 | | -### 🔧 Configuration |
126 | | -- **Moved Sourcery Config**: Migrated `.sourcery.yaml` to `pyproject.toml` under `[tool.sourcery]` section |
| 22 | +### 📊 Documentation |
| 23 | +- Added: `docs/marginal_somersd_guide.md` - Comprehensive guide with algorithm flowchart and variance decomposition diagrams |
| 24 | +- Added: `examples/msd_feature_selection.ipynb` - Example notebook demonstrating MSD feature selection |
127 | 25 |
|
128 | 26 | ### 📦 Dependencies |
129 | | -- **Added**: `loguru>=0.7.0` for enhanced logging in tests |
130 | | -- **Added**: `matplotlib>=3.5.0` (already in examples dependencies) |
131 | | - |
132 | | -### 🧪 Testing |
133 | | -- **Enhanced `test_fast_somersd.py`**: |
134 | | - - Added `test_weighted_somersd()` to verify weighted implementation |
135 | | - - Integrated `loguru` with `RichHandler` for better test output |
136 | | - - All tests passing ✅ |
137 | | - |
138 | | -### 🚀 Installation |
139 | | -This alpha version can be installed directly from the Git branch: |
140 | | - |
141 | | -```bash |
142 | | -# Install from alpha branch |
143 | | -uv add "fastwoe @ git+https://github.com/xRiskLab/fastwoe.git@alpha-0.1.6a0" |
144 | | -``` |
145 | | - |
146 | | -### ⚠️ Breaking Changes |
147 | | -None - all changes are additive and backward compatible. |
148 | | - |
149 | | -### 📝 Notes |
150 | | -- This is an alpha release for testing new visualization and display features |
151 | | -- Feedback welcome on styling, API design, and functionality |
152 | | -- Stable release (0.1.6) will follow after testing period |
| 27 | +- Added: `loguru>=0.7.0`, `matplotlib>=3.5.0` |
153 | 28 |
|
154 | | -## Version 0.1.5 (2024-12-09) |
| 29 | +## Version 0.1.5 (2025-12-09) |
155 | 30 |
|
156 | 31 | **Performance Fix & Code Cleanup**: Eliminated DataFrame fragmentation warning and removed debug statements |
157 | 32 |
|
|