Commit f329678

Merge PR #4: Interval scorecard functionality
Integrates the interval scorecard feature that simplifies complex XGBoost tree rules into industry-standard intervals. This feature provides significant rule reduction while maintaining model accuracy.
2 parents 4c6bd80 + 8226717 commit f329678

5 files changed: +1661 -2 lines changed

README.md

Lines changed: 50 additions & 1 deletion
@@ -142,6 +142,32 @@ sql_query = scorecard_constructor.generate_sql_query(table_name='my_table')
 print(sql_query)
 ```

+### Interval Scorecards 📊
+
+Convert complex tree-based scorecards into simplified interval-based rules. This feature requires `max_depth=1` models and follows industry standard practices (Siddiqi, 2017):
+
+```python
+# After creating a standard scorecard with points (see above)
+
+# Build interval scorecard - simplifies complex rules into intervals
+interval_scorecard = scorecard_constructor.construct_scorecard_by_intervals(add_stats=True)
+
+print(f"Rule reduction: {len(xgb_scorecard_with_points)} → {len(interval_scorecard)} rules")
+print("\nInterval format:")
+print(interval_scorecard[['Feature', 'Bin', 'Points', 'WOE']].head())
+
+# Add Points at Even Odds/Points to Double the Odds (PEO/PDO)
+peo_pdo_scorecard = scorecard_constructor.create_points_peo_pdo(peo=600, pdo=50)
+print("\nPEO/PDO Points:")
+print(peo_pdo_scorecard[['Feature', 'Bin', 'Points_PEO_PDO']].head())
+```
+
+**Key Benefits:**
+- **Simplified Rules**: Transform complex tree conditions into simple intervals like `[70.8, 80.5)`
+- **Rule Reduction**: Typically 60-80% fewer rules while maintaining accuracy
+- **Industry Standard**: Follows credit scoring best practices
+- **Interpretable**: Easy to understand and implement in production systems
+
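Reviewer note on the Interval Scorecards hunk above: the `max_depth=1` requirement is what makes the interval form possible, because every depth-1 tree is a single threshold test on one feature. The sketch below is not xbooster's implementation; `collapse_stumps` and the stump tuples are hypothetical, and it assumes the XGBoost convention that rows with `x < threshold` take the left leaf. It shows how several stumps on the same feature could collapse into `[a, b)` intervals whose points are the sum of the matching leaves.

```python
import math


def collapse_stumps(stumps):
    """Merge depth-1 splits on one feature into [a, b) intervals.

    Each stump is (threshold, left_points, right_points): values with
    x < threshold take left_points, all other values take right_points.
    This is an illustrative helper, not part of xbooster.
    """
    thresholds = sorted({t for t, _, _ in stumps})
    edges = [-math.inf] + thresholds + [math.inf]
    intervals = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # No threshold lies strictly inside [lo, hi), so every x in the
        # interval lands on the same side of every stump. A stump sends
        # the interval left exactly when its threshold is above lo.
        points = sum(left if lo < t else right for t, left, right in stumps)
        intervals.append((lo, hi, points))
    return intervals


# Hypothetical stumps on a single feature (two share a threshold).
stumps = [(70.8, 10, 25), (80.5, 5, 12), (70.8, 3, 8)]
intervals = collapse_stumps(stumps)
print(f"{2 * len(stumps)} leaf rules -> {len(intervals)} intervals")
for lo, hi, points in intervals:
    print(f"[{lo}, {hi}): {points}")
```

Here three stumps (six leaf rules) collapse into three intervals, which is the kind of rule reduction the README text describes.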
 ### XGBoost Preprocessing

 For handling categorical features in XGBoost, you can use the `DataPreprocessor`:
@@ -403,6 +429,23 @@ A class for generating a scorecard from a trained XGBoost model. The methodology
    - **Returns**:
      - `str`: The final SQL query for deploying the scorecard.

+8. `construct_scorecard_by_intervals(add_stats=True) -> pd.DataFrame`:
+   - Constructs a scorecard grouped by intervals of the type [a, b). Requires max_depth=1 models.
+   - **Parameters**:
+     - `add_stats` (bool, optional): Whether to include WOE, IV, and count statistics. Default is True.
+   - **Returns**:
+     - `pd.DataFrame`: The interval-based scorecard.
+
+9. `create_points_peo_pdo(peo: int, pdo: int, precision_points: int = 0, scorecard: pd.DataFrame = None) -> pd.DataFrame`:
+   - Creates Points at Even Odds/Points to Double the Odds (PEO/PDO) on interval scorecards.
+   - **Parameters**:
+     - `peo` (int): Points at Even Odds.
+     - `pdo` (int): Points to Double the Odds.
+     - `precision_points` (int, optional): Decimal precision for points. Default is 0.
+     - `scorecard` (pd.DataFrame, optional): Specific scorecard to use. Default uses interval scorecard.
+   - **Returns**:
+     - `pd.DataFrame`: Scorecard with PEO/PDO points.
+
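Reviewer note on `create_points_peo_pdo`: the diff does not show how the scaling is computed. Under the usual scorecard convention, score = PEO + (PDO / ln 2) · ln(odds), so a score of `peo` corresponds to even odds and each additional `pdo` points doubles the odds. The sketch below illustrates only that relationship; `score_from_odds` is a hypothetical helper, the odds are assumed to be good:bad, and how xbooster maps this onto per-bin `Points_PEO_PDO` values is not stated in the source.

```python
import math


def score_from_odds(odds, peo=600, pdo=50, precision_points=0):
    """Map good:bad odds to a score under the PEO/PDO convention.

    Hypothetical helper for illustration only; the parameter names
    mirror the documented method but the body is an assumption.
    """
    if odds <= 0:
        raise ValueError("odds must be positive")
    factor = pdo / math.log(2)  # points added per unit of ln(odds)
    # ln(1) == 0, so even odds (1:1) map exactly to `peo`.
    return round(peo + factor * math.log(odds), precision_points)


for odds in (0.5, 1, 2, 4):
    print(odds, score_from_odds(odds))
```

With `peo=600` and `pdo=50`, each doubling of the odds in the loop moves the score up by 50 points, and halving moves it down by the same amount.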
 ### `xbooster.explainer` - XGBoost Scorecard Explainer

 This module provides functionalities for explaining XGBoost scorecards, including methods to extract split information, build interaction splits, visualize tree structures, plot feature importances, and more.
@@ -483,10 +526,16 @@ Contributions are welcome! For bug reports or feature requests, please open an i
 For code contributions, please open a pull request.

 ## Version
-Current version: 0.2.5
+Current version: 0.2.6

 ## Changelog

+### [0.2.6] - 2025-08-30
+- Added interval scorecard functionality for XGBoost models with `max_depth=1`
+- New methods: `construct_scorecard_by_intervals()` and `create_points_peo_pdo()`
+- Simplifies complex tree rules into interpretable intervals following industry standards (Siddiqi, 2017)
+- Typically achieves 60-80% rule reduction while maintaining accuracy
+
 ### [0.2.5] - 2025-04-19
 - Minor changes in `catboost_wrapper.py` and `cb_constructor.py` to improve the scorecard generation.

0 commit comments
