- CatBoost Naming Standardization: Replaced `LeafValue` with `XAddEvidence` throughout CatBoost implementation
- Standardized naming to match XGBoost and LightGBM implementations
- Updated all CatBoost-related files: `catboost_scorecard.py`, `catboost_wrapper.py`, `cb_constructor.py`
- Updated all tests and documentation
- README Documentation: Corrected CatBoost depth requirement statement
- Changed from "Only supports depth=1" to "depth=1 is recommended for better interpretability"
- Code actually supports any tree depth (as long as trees are complete binary)
- Updated code examples to use `XAddEvidence` instead of `LeafValue`
- PyPI Publish Workflow: Added automated PyPI publishing workflow (`.github/workflows/publish.yml`)
- Supports both release events and manual workflow dispatch
- Uses trusted publishing (OpenID Connect) for secure PyPI uploads
- Automatically uploads distribution files to GitHub releases
- All 106 tests passing
- Version updated from 0.2.7rc2 to 0.2.7 (stable release)
- LightGBM support is now stable (previously release candidate)
- Complete Implementation: Added missing `create_points()` and `predict_score()` methods
- Previous rc1 release had incomplete squash merge
- Now includes all critical bug fixes and implementations
- LightGBM Scorecard Support: Complete implementation of `LGBScorecardConstructor`
- Implemented `create_points()` with proper base_score normalization
- Implemented `predict_score()` and `predict_scores()` for scorecard predictions
- Added `use_base_score` parameter for flexible base score handling
- Full parity with XGBoost scorecard functionality
- Critical Bug Fix: Corrected leaf ID mapping in `extract_leaf_weights()`
- Changed from `cumcount()` to extracting actual leaf ID from node_index string
- Fixes 55% Gini loss (0.40 → 0.90) in scorecard predictions
- Ensures correct mapping between LightGBM's absolute leaf IDs and relative indices
- Base Score Normalization: Proper handling of LightGBM's base score
- Subtract base_score from Tree 0 leaves to balance tree contributions
- Add logit(base_score) during scaling to distribute across all trees
- Prevents first tree from getting disproportional weight
- Simplified Score Types: Only `XAddEvidence` supported for LightGBM
- Removed WOE support (ill-defined for LightGBM's sklearn API)
- Cleaner, more maintainable implementation
- Enhanced Documentation: Updated docstrings and examples
- Added comprehensive LightGBM getting-started notebook
- Explained base_score handling differences from XGBoost
- All 106 tests passing (9 LightGBM-specific tests)
- Scorecard Gini: 0.9020 vs Model Gini: 0.9021 (near-perfect preservation)
- Proper handling of LightGBM's sklearn API vs internal booster API
- Related to PR #8
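
To illustrate the leaf ID fix listed above: a minimal sketch, assuming a frame shaped like the output of LightGBM's `Booster.trees_to_dataframe()`, where leaf rows carry a `node_index` such as `0-L2`. The toy frame and the column name `leaf_by_position` are illustrative and not taken from the xBooster source; the sketch only shows why a positional counter can disagree with the leaf ID embedded in the string.

```python
import pandas as pd

# Toy stand-in for Booster.trees_to_dataframe(); values are made up.
# Tree 0 lists its leaves in traversal order: L0, L2, L1.
tree_df = pd.DataFrame(
    {
        "tree_index": [0, 0, 0, 0, 0, 1, 1, 1],
        "node_index": ["0-S0", "0-S1", "0-L0", "0-L2", "0-L1", "1-S0", "1-L0", "1-L1"],
        "value": [None, None, 0.10, -0.30, 0.25, None, 0.05, -0.07],
    }
)

# Keep leaf rows only (split nodes are named "<tree>-S<k>").
leaves = tree_df[tree_df["node_index"].str.contains("-L")].copy()

# Positional numbering: simply counts leaf rows within each tree.
leaves["leaf_by_position"] = leaves.groupby("tree_index").cumcount()

# Actual leaf ID: the integer that follows "-L" in node_index.
leaves["leaf_id"] = (
    leaves["node_index"].str.extract(r"-L(\d+)$", expand=False).astype(int)
)

# Leaf weights keyed by (tree, actual leaf ID).
weights = leaves.set_index(["tree_index", "leaf_id"])["value"]

print(leaves[["node_index", "leaf_by_position", "leaf_id", "value"]])
```

In tree 0 the leaves appear as `L0`, `L2`, `L1`, so a positional counter would swap the weights of leaves 1 and 2, while the parsed ID lines up with the leaf indices that LightGBM reports for predictions.
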
- LightGBM Support (Alpha): Initial implementation of `LGBScorecardConstructor` for LightGBM models
- Implemented `extract_leaf_weights()` method for parsing LightGBM tree structure
- Implemented `get_leafs()` method for leaf indices and margins prediction
- Added comprehensive test suite with 9 tests (all passing)
- Created example demonstrating implemented functionality
- Status: Alpha release - 2 of 5 core methods implemented
- Community: Reference implementation for issue #7 (@RektPunk)
- Development Workflow: Consolidated shell scripts into enhanced Makefile
- Removed `run-ci-checks.sh` and `run-act.sh` scripts
- Enhanced Makefile with colored output and Docker checks
- Added new targets: `make ci-check`, `make act-local`, `make test-quick`
- Single entry point for all development tasks
- `make ci-check` runs fast local checks (no Docker)
- `make act-*` commands use Docker for GitHub Actions simulation
- CI/CD: Added missing `lightgbm>=4.0.0,<5.0.0` dependency
- Type Checking: Added LightGBM type stubs and configured ignore rules
- GitHub Actions: Fixed act configuration by removing unsupported flags
- LightGBM constructor follows same pattern as XGBoost/CatBoost implementations
- Column naming unified to `XAddEvidence` for consistency across all constructors
- Test suite expanded from 95 to 104 tests
- All tests passing in local and CI environments

```bash
# Install latest alpha release
pip install git+https://github.com/xRiskLab/xBooster.git@v0.2.7a2
```

- Code Optimization: Simplified `get_leafs()` and `construct_scorecard()` methods in `XGBScorecardConstructor` (PR #6)
- Removed special-case branching for first iteration
- Precomputes full leaf index matrix once instead of repeated predictions
- Eliminates redundant DataFrame concatenations
- Net reduction of 40 lines of code while maintaining identical functionality
- Comprehensive Regression Tests: Added 13 new tests to verify code refactoring produces identical outputs
- Build System Improvements: Modernized hatchling configuration
- Simplified version management in `__init__.py`
- Removed setuptools legacy configuration
- Added explicit egg-info exclusion
- Code Quality Tools: Added `prek` and `ty` type checker configurations
- Type Stubs Directory: Created `typings/` directory for custom type definitions
- Improved `.gitignore` to properly exclude build artifacts and egg-info files
- All 95 tests passing with no regressions
- Leaf indices now stored as float32 (XGBoost's default) but represent whole numbers
- Float precision differences negligible (< 1e-6)
- Performance maintained across all operations
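
The float32 note above rests on a general property of IEEE 754 single precision: every integer up to 2^24 (16,777,216) has an exact float32 representation, so whole-number leaf indices in that range survive the cast unchanged. A small NumPy check, using made-up index values, as a sketch:

```python
import numpy as np

# Hypothetical leaf indices, including the largest exactly representable run.
leaf_ids = np.array([0, 1, 7, 1023, 2**24], dtype=np.int64)

# Cast to float32 and back again.
as_float = leaf_ids.astype(np.float32)
round_trip = as_float.astype(np.int64)

# True when no index changed during the round trip.
exact = np.array_equal(leaf_ids, round_trip)

# Just past 2**24 the spacing between float32 values becomes 2,
# so an odd integer such as 2**24 + 1 can no longer be stored exactly.
past_limit = int(np.float32(2**24 + 1))

print(exact, past_limit == 2**24 + 1)
```

Leaf indices in a boosted tree stay far below this limit, which is consistent with the changelog's statement that the values still represent whole numbers.
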
- XGBoost Compatibility: Extended dependency range from `>=2.0.0,<3.0.0` to `>=2.0.0,<4.0.0`
- Test Precision: Updated `test_extract_model_param` to handle XGBoost 3.0.5 precision differences
- CI/CD Enhancement: Added comprehensive XGBoost version matrix testing (2.1.4, 3.0.5, latest)
- New Test Suite: Added `test_xgboost_compatibility.py` with 8 comprehensive compatibility tests
- Enhanced Workflows: Updated GitHub Actions to test across multiple XGBoost versions
- Better Error Handling: Improved Pylint configuration for virtual environment compatibility
- ✅ Verified compatibility with XGBoost 3.0.5
- ✅ Backward compatible with XGBoost 2.x versions
- ✅ All existing functionality remains unchanged
- ✅ No breaking changes for existing users
- Fixed precision differences in `base_score` parameter extraction between XGBoost versions
- Enhanced CI pipeline to catch compatibility issues early
- Improved development environment setup with better Pylint integration
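
The changelog does not say how the test was adjusted for the `base_score` precision differences. A tolerance-based comparison is one common way to make such a check robust across versions; the sketch below assumes, purely for illustration, that the extracted value has passed through float32 at some point. The variable names and the tolerance are hypothetical.

```python
import math

import numpy as np

expected = 0.3

# Assumption for illustration: the extracted parameter went through float32,
# which shifts it slightly away from the float64 literal.
extracted = float(np.float32(expected))

# An exact comparison fails on the tiny representation difference...
exact_match = extracted == expected

# ...while a relative tolerance accepts it.
close_enough = math.isclose(extracted, expected, rel_tol=1e-6)

print(exact_match, close_enough)
```
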
- Added interval scorecard functionality for XGBoost models with `max_depth=1`
- New methods: `construct_scorecard_by_intervals()` and `create_points_peo_pdo()`
- Simplifies complex tree rules into interpretable intervals following industry standards (Siddiqi, 2017)
- Typically achieves 60-80% rule reduction while maintaining accuracy
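
As a rough illustration of how depth-1 rules can collapse into intervals (a sketch under simplifying assumptions, not the logic of `construct_scorecard_by_intervals()`): each stump is represented here as a hypothetical `(feature, threshold, left_points, right_points)` tuple, a value goes left when it is strictly below the threshold, and missing values are ignored.

```python
import math
from collections import defaultdict


def collapse_stumps(stumps):
    """Merge depth-1 rules per feature into non-overlapping intervals.

    stumps: iterable of (feature, threshold, left_points, right_points),
    a hypothetical representation used only for this sketch.
    Returns {feature: [(low, high, points), ...]} with intervals [low, high).
    """
    by_feature = defaultdict(list)
    for feature, threshold, left, right in stumps:
        by_feature[feature].append((threshold, left, right))

    scorecard = {}
    for feature, splits in by_feature.items():
        # Unique thresholds become interval edges.
        edges = [-math.inf] + sorted({t for t, _, _ in splits}) + [math.inf]
        rows = []
        for low, high in zip(edges[:-1], edges[1:]):
            # Every value in [low, high) sits on the same side of each
            # threshold, so comparing the lower edge is sufficient:
            # low < t means the whole interval is left of t.
            points = sum(left if low < t else right for t, left, right in splits)
            rows.append((low, high, points))
        scorecard[feature] = rows
    return scorecard


# Three stumps on one feature: six leaf rules in total.
stumps = [("age", 30, 10, 20), ("age", 50, 5, 15), ("age", 30, 1, 2)]
intervals = collapse_stumps(stumps)
for low, high, points in intervals["age"]:
    print(f"age in [{low}, {high}): {points} points")
```

Here six leaf rules on `age` reduce to three intervals, because stumps that share a threshold or a feature contribute to the same bins.
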
- Minor changes in `catboost_wrapper.py` and `cb_constructor.py` to improve the scorecard generation.
- Changed the build distribution in pyproject.toml.
- Added support for CatBoost classification models and switched to `uv` for packaging.
- Python version requirement updated to 3.10-3.11.
- Updates in `explainer.py` module to improve kwargs handling and minor changes.
- Updates of dependencies
- Added tree visualization class (`explainer.py`)
- Updated the local explanation algorithm for models with a depth > 1 (`explainer.py`)
- Added a categorical preprocessor (`_utils.py`)
- Initial release