Commit ff0d944

Release v0.1.4.post2: Monotonic Constraints Support
- Add comprehensive monotonic constraints support across all binning methods
  - Tree method: Native scikit-learn monotonic constraints
  - KBins/FAISS methods: Isotonic regression post-processing
- Add extensive test coverage for all constraint scenarios
- Update documentation and examples
- Fix FAISS API compatibility and type checking issues
- Add comprehensive example (examples/fastwoe_monotonic.py)
1 parent b5bdca3 commit ff0d944

17 files changed

Lines changed: 1431 additions & 212 deletions
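
The commit message says the KBins and FAISS methods enforce constraints by isotonic regression post-processing. The library's own `_apply_isotonic_constraints` body is not shown on this page, so the following is only a minimal sketch of the general technique, assuming per-bin WOE values and bin centers are already available. The function name `enforce_monotonic_woe` and its arguments are hypothetical, not part of FastWoe.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression


def enforce_monotonic_woe(bin_centers, woe, constraint, weights=None):
    """Hypothetical helper: adjust per-bin WOE values to be monotonic.

    constraint follows the commit's convention:
    1 = increasing, -1 = decreasing, 0 = leave values unchanged.
    """
    if constraint not in (-1, 0, 1):
        raise ValueError(f"constraint must be -1, 0 or 1, got {constraint!r}")
    x = np.asarray(bin_centers, dtype=float)
    y = np.asarray(woe, dtype=float)
    if constraint == 0:
        return y.copy()
    # Isotonic regression finds the closest (least-squares) monotonic sequence.
    iso = IsotonicRegression(increasing=(constraint == 1))
    return iso.fit(x, y, sample_weight=weights).predict(x)


centers = [20.0, 30.0, 40.0, 50.0]
raw_woe = [-0.8, -0.1, -0.3, 0.6]  # the middle pair violates "increasing"
adjusted = enforce_monotonic_woe(centers, raw_woe, constraint=1)
# With equal weights, the out-of-order pair (-0.1, -0.3) is pooled to its mean, -0.2.
print(adjusted)
```

Passing bin counts as `weights` would make sparsely populated bins move more than well-populated ones; whether FastWoe does this is not stated in the commit.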

.github/workflows/typecheck.yml

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ jobs:
       - name: Verify installation
         run: |
           uv run python -c "import fastwoe; print('FastWoe installed')"
-          uv run python -c "import pyrefly; print('pyrefly installed')"
+          uv run python -c "import ty; print('ty installed')"

       - name: Run CI checks (format, lint, typecheck)
         run: |

CHANGELOG.md

Lines changed: 70 additions & 0 deletions
@@ -1,5 +1,75 @@
 # Changelog

+## Version 0.1.4.post2 (Current)
+
+**Monotonic Constraints Support**: Complete implementation across all binning methods
+
+- **New Features**:
+  - **Monotonic Constraints**: Added comprehensive monotonic constraints support for credit scoring compliance
+    - **Tree Method**: Native scikit-learn monotonic constraints (`monotonic_cst` parameter)
+    - **KBins Method**: Isotonic regression post-processing to enforce constraints
+    - **FAISS KMeans Method**: Isotonic regression post-processing to enforce constraints
+    - **Constraint Values**: `1` (increasing), `-1` (decreasing), `0` (no constraint)
+    - **Validation**: Comprehensive input validation with clear error messages
+    - **Binning Info**: Monotonic constraints stored in `binning_info_` and displayed in summaries
+  - **Isotonic Regression Integration**: Added `_apply_isotonic_constraints` method
+    - Uses scikit-learn's `IsotonicRegression` for KBins and FAISS methods
+    - Enforces monotonic patterns on WOE values after initial binning
+    - Handles bin center extraction and constraint application
+  - **Comprehensive Testing**: Added extensive test coverage for monotonic constraints
+    - Tests for all binning methods (Tree, KBins, FAISS)
+    - Tests for multiclass and continuous targets
+    - Tests for edge cases and validation
+    - Tests for backward compatibility
+  - **Enhanced Documentation**: Updated README and examples
+    - Added detailed monotonic constraints section with examples
+    - Updated API reference with `monotonic_cst` parameter
+    - Created comprehensive example (`examples/fastwoe_monotonic.py`)
+
+- **API Changes**:
+  - **FastWoe**: Added `monotonic_cst` parameter to constructor
+    - Type: `dict[str, int]` mapping feature names to constraint values
+    - Default: `None` (no constraints applied)
+    - Validation: Ensures valid constraint values (-1, 0, 1) and feature names
+  - **Binning Summary**: Added `monotonic_constraint` column to `get_binning_summary()`
+  - **Binning Info**: Added `monotonic_constraint` field to `binning_info_`
+
+- **Bug Fixes**:
+  - Fixed bare `except:` clause in isotonic constraints implementation
+  - Updated test for unsupported methods warning (now tests invalid method instead)
+  - Fixed FAISS API compatibility issue with `index.search` method
+
+- **Examples**:
+  - **New**: `examples/fastwoe_monotonic.py` - Comprehensive monotonic constraints demonstration
+    - Shows all binning methods with constraints
+    - Compares KBins strategies (uniform, quantile, kmeans)
+    - Analyzes monotonic patterns and performance
+    - Provides readable table outputs instead of plots
+
+- **Technical Details**:
+  - **Tree Method**: Uses scikit-learn's native `monotonic_cst` parameter
+  - **KBins/FAISS Methods**: Applies isotonic regression after WOE calculation
+  - **Performance**: Constraints may slightly affect performance but ensure business logic compliance
+  - **Compatibility**: Fully backward compatible - existing code works unchanged
+
+## Version 0.1.4.post1
+
+**Bug Fix Release**: Fixed pyrefly type checking comments appearing in output
+
+- **Bug Fixes**:
+  - Fixed `# pyrefly: ignore` comments being printed in `WeightOfEvidence.summary()` output
+  - Migrated from `pyrefly` to `ty` type checker
+  - Updated all type checking comments to use standard `# type: ignore[error-code]` format
+  - Fixed f-string formatting issues in summary method
+
+- **Infrastructure**:
+  - **Type Checking Migration**: Complete migration from `pyrefly` to `ty`
+    - Moved `ty.toml` configuration into `pyproject.toml`
+    - Updated GitHub Actions workflow to use `ty`
+    - Updated Makefile targets for `ty` commands
+    - Updated documentation for new type checking setup
+  - **Dependencies**: Updated dev dependencies to use `ty>=0.0.1a21`
+
 ## Version 0.1.4 (Current)

 **Multiclass Support & Enhanced Tree Binning**: Major feature additions and API improvements

Makefile

Lines changed: 5 additions & 35 deletions
@@ -23,18 +23,8 @@ format-check: ## Check code formatting
 	uv run ruff format --check fastwoe

 typecheck: ## Run type checking (lenient mode)
-	@echo "Running pyrefly type checking..."
-	@uv run pyrefly check fastwoe/fastwoe.py fastwoe/interpret_fastwoe.py \
-		--ignore missing-attribute \
-		--ignore bad-argument-type \
-		--ignore unsupported-operation \
-		--ignore not-iterable \
-		--ignore no-matching-overload \
-		--ignore missing-argument \
-		--ignore bad-return \
-		--ignore bad-assignment \
-		--ignore missing-module-attribute \
-		--summary=none || { \
+	@echo "Running ty type checking..."
+	@uv run ty check fastwoe/fastwoe.py fastwoe/interpret_fastwoe.py || { \
 		echo "⚠️ Type checking completed with some remaining errors."; \
 		echo " Remaining errors are expected for pandas/numpy/faiss usage."; \
 		echo "✅ Development mode: Treating expected type errors as success"; \
@@ -43,19 +33,9 @@ typecheck: ## Run type checking (lenient mode)
 	@echo "✅ Type checking passed"

 typecheck-strict: ## Run type checking (strict mode)
-	@echo "Running pyrefly type checking (strict mode)..."
+	@echo "Running ty type checking (strict mode)..."
 	@echo "🔒 Strict mode enabled"
-	@uv run pyrefly check fastwoe/fastwoe.py fastwoe/interpret_fastwoe.py \
-		--ignore missing-attribute \
-		--ignore bad-argument-type \
-		--ignore unsupported-operation \
-		--ignore not-iterable \
-		--ignore no-matching-overload \
-		--ignore missing-argument \
-		--ignore bad-return \
-		--ignore bad-assignment \
-		--ignore missing-module-attribute \
-		--summary=full || { \
+	@uv run ty check fastwoe/fastwoe.py fastwoe/interpret_fastwoe.py || { \
 		echo "⚠️ Strict type checking found issues (expected for pandas/numpy code)"; \
 		exit 1; \
 	}
@@ -73,17 +53,7 @@ check-all: format-check lint typecheck ## Run all checks (format, lint, typeche
 ci-check: format-check lint ## Run CI checks (without strict type checking)
 	@echo "🔍 Running type checking in CI mode..."
 	@echo "This mode ignores expected pandas/numpy/faiss type complexity"
-	@uv run pyrefly check fastwoe/fastwoe.py fastwoe/interpret_fastwoe.py \
-		--ignore missing-attribute \
-		--ignore bad-argument-type \
-		--ignore unsupported-operation \
-		--ignore not-iterable \
-		--ignore no-matching-overload \
-		--ignore missing-argument \
-		--ignore bad-return \
-		--ignore bad-assignment \
-		--ignore missing-module-attribute \
-		--summary=none || { \
+	@uv run ty check fastwoe/fastwoe.py fastwoe/interpret_fastwoe.py || { \
 		echo "⚠️ Type checking completed with some remaining errors."; \
 		echo " Remaining errors are expected for pandas/numpy/faiss usage."; \
 		echo "🔧 CI mode: Treating expected type errors as success"; \

README.md

Lines changed: 83 additions & 2 deletions
@@ -20,6 +20,7 @@ FastWoe is a Python library for efficient **Weight of Evidence (WOE)** encoding
 - **IV Standard Errors**: Statistical significance testing for Information Value with confidence intervals
 - **Cardinality Control**: Built-in preprocessing to handle high-cardinality categorical features
 - **Intelligent Numerical Binning**: Support for traditional binning, decision tree-based binning, and FAISS KMeans clustering
+- **Monotonic Constraints**: Enforce business logic constraints for credit scoring and regulatory compliance
 - **Binning Summaries**: Feature-level binning statistics including Gini score and Information Value (IV)
 - **Compatible with scikit-learn**: Follows scikit-learn's preprocessing transformer interface
 - **Uncertainty Quantification**: Combines Alan Turing's factor principle with Maximum Likelihood theory (see [paper](docs/woe_st_errors.md))
@@ -379,6 +380,85 @@ pipeline = Pipeline([
 pipeline.fit(data[['category', 'high_card_cat']], data['target'])
 ```

+## 🎯 Monotonic Constraints for Credit Scoring
+
+FastWoe supports **monotonic constraints** for numerical features, ensuring that WOE values follow business logic requirements. This is particularly important for credit scoring and regulatory compliance.
+
+### When to Use Monotonic Constraints
+
+- **Credit Scoring**: Higher income should lead to lower risk
+- **Age-based Risk**: Higher age might lead to higher risk (depending on context)
+- **Credit Score**: Higher credit scores should lead to lower risk
+- **Regulatory Compliance**: When business rules require monotonic relationships
+
+### Example Usage
+
+```python
+import pandas as pd
+import numpy as np
+from fastwoe import FastWoe
+
+# Create sample credit scoring data
+np.random.seed(42)
+n_samples = 1000
+
+# Income: higher income -> lower risk (decreasing constraint)
+income = np.random.lognormal(mean=10, sigma=0.5, size=n_samples)
+income_risk = 1 / (1 + np.exp((income - np.median(income)) / 20))
+
+# Age: higher age -> higher risk (increasing constraint)
+age = np.random.normal(35, 12, n_samples)
+age_risk = 1 / (1 + np.exp(-(age - 35) / 8))
+
+# Credit score: higher score -> lower risk (decreasing constraint)
+credit_score = np.random.normal(650, 100, n_samples)
+credit_score = np.clip(credit_score, 300, 850)
+credit_risk = 1 / (1 + np.exp((credit_score - 650) / 50))
+
+# Combine risks
+combined_risk = (income_risk + age_risk + credit_risk) / 3
+y = (combined_risk > 0.5).astype(int)
+
+X = pd.DataFrame({
+    'income': income,
+    'age': age,
+    'credit_score': credit_score
+})
+
+# Apply monotonic constraints
+woe_encoder = FastWoe(
+    binning_method="tree",
+    monotonic_cst={
+        "income": -1, # Decreasing: higher income -> lower risk
+        "age": 1, # Increasing: higher age -> higher risk
+        "credit_score": -1 # Decreasing: higher score -> lower risk
+    },
+    numerical_threshold=10
+)
+
+woe_encoder.fit(X, y)
+
+# Check that constraints were applied
+summary = woe_encoder.get_binning_summary()
+print(summary[['feature', 'monotonic_constraint']])
+```
+
+### Constraint Values
+
+- `1`: Increasing constraint (higher values → higher risk)
+- `-1`: Decreasing constraint (higher values → lower risk)
+- `0`: No constraint (default)
+
+### Important Notes
+
+- **Tree method**: Uses native scikit-learn monotonic constraints
+- **KBins & FAISS methods**: Uses isotonic regression to enforce constraints
+- Constraints ensure WOE values follow the specified monotonic pattern
+- Performance may be slightly different but more interpretable
+- Essential for regulatory compliance in credit scoring
+
+For a complete example, see [examples/monotonic_constraints_example.py](examples/monotonic_constraints_example.py).
+
 ## 📋 API Reference

 ### FastWoe Class
@@ -391,6 +471,7 @@ pipeline.fit(data[['category', 'high_card_cat']], data['target'])
 - `tree_estimator` (estimator): Custom tree estimator for binning (when binning_method="tree")
 - `tree_kwargs` (dict): Parameters for tree estimator
 - `faiss_kwargs` (dict): Parameters for FAISS KMeans (when binning_method="faiss_kmeans")
+- `monotonic_cst` (dict): Monotonic constraints for numerical features. Maps feature names to constraint values: 1 (increasing), -1 (decreasing), 0 (no constraint). Supported with all binning methods: tree (native), kbins/faiss_kmeans (isotonic regression).

 #### Key Methods
 - `fit(X, y)`: Fit the WOE encoder
@@ -565,10 +646,10 @@ act --container-architecture linux/amd64 -j type-check -W .github/workflows/type
 See [Local Testing with Act](docs/dev/act-local-testing.md) for comprehensive documentation.

 **Type Checking Notes:**
-- FastWoe uses [pyrefly](https://pyrefly.org/) for type checking via `scripts/typecheck.py`
+- FastWoe uses [ty](https://github.com/astral-sh/ty) for type checking via `make typecheck`
 - Many type errors are expected due to pandas/numpy dynamic typing
 - CI mode treats expected pandas/numpy type issues as success
-- Use `PYREFLY_STRICT=true` to fail on any type errors
+- Use `make typecheck-strict` to fail on any type errors

 ### Building the Package
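
The README and changelog describe `monotonic_cst` as a `dict[str, int]` whose values must be -1, 0 or 1 and whose keys must be known feature names. The validation code itself is not part of the hunks shown here, so the following is a hedged sketch of what such a check could look like. `validate_monotonic_cst` is a hypothetical name, and the choice to fill unlisted features with 0 is an assumption based on the documented default.

```python
def validate_monotonic_cst(monotonic_cst, feature_names):
    """Hypothetical validator: return a {feature: constraint} map for all features."""
    if monotonic_cst is None:
        return {}  # documented default: no constraints applied
    if not isinstance(monotonic_cst, dict):
        raise TypeError("monotonic_cst must be a dict mapping feature names to -1, 0 or 1")

    unknown = sorted(set(monotonic_cst) - set(feature_names))
    if unknown:
        raise ValueError(f"monotonic_cst refers to unknown features: {unknown}")

    invalid = {k: v for k, v in monotonic_cst.items() if v not in (-1, 0, 1)}
    if invalid:
        raise ValueError(f"constraint values must be -1, 0 or 1, got: {invalid}")

    # Assumption: features without an entry are treated as unconstrained (0).
    return {name: int(monotonic_cst.get(name, 0)) for name in feature_names}


features = ["income", "age", "credit_score"]
print(validate_monotonic_cst({"income": -1, "age": 1}, features))
```

Failing early with the offending keys or values in the message is one way to provide the "clear error messages" the changelog mentions.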

docs/dev/README.md

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ This directory contains documentation for developers and contributors working on

 - **[act-local-testing.md](act-local-testing.md)** - Guide for testing GitHub Actions workflows locally using act
 - **[ci-type-checking.md](ci-type-checking.md)** - CI type checking setup and configuration
-- **[type-checking.md](type-checking.md)** - Type checking setup and pyrefly configuration
+- **[type-checking.md](type-checking.md)** - Type checking setup and ty configuration

 ## User Documentation

docs/dev/act-local-testing.md

Lines changed: 2 additions & 2 deletions
@@ -168,7 +168,7 @@ act --container-architecture linux/amd64 -W .github/workflows/typecheck.yml --dr
 # - Python 3.11 setup
 # - uv installation
 # - Dependencies installation
-# - FastWoe & pyrefly verification
+# - FastWoe & ty verification
 # - CI checks (format, lint, typecheck)
 # - Strict type checking
 ```
@@ -302,7 +302,7 @@ act --container-architecture linux/amd64 -W .github/workflows/compatibility.yml
 Your project includes these workflows:

 - **`ci.yml`**: Main CI pipeline with tests, linting, and type checking
-- **`typecheck.yml`**: Dedicated type checking with pyrefly integration
+- **`typecheck.yml`**: Dedicated type checking with ty integration
 - **`compatibility.yml`**: Python version compatibility testing
 - **`release.yml`**: Release automation and publishing

docs/dev/ci-type-checking.md

Lines changed: 4 additions & 4 deletions
@@ -24,7 +24,7 @@ CI=true uv run type-check
 - Build continues

 **What it does**:
-- Runs pyrefly with comprehensive ignore flags
+- Runs ty with comprehensive ignore flags
 - Detects CI environment automatically
 - Treats expected data science library type issues as success
 - Only fails on genuine type errors in your code
@@ -46,7 +46,7 @@ PYREFLY_STRICT=true uv run type-check

 The CI mode will only fail if:

-1. **Script execution fails**: pyrefly not installed, import errors, etc.
+1. **Script execution fails**: ty not installed, import errors, etc.
 2. **New genuine type errors**: Errors in your code logic (not pandas/numpy)
 3. **Missing dependencies**: Required packages not available

@@ -85,8 +85,8 @@ env:

 ### Making CI More Lenient
 The current configuration is already lenient. If you need even more leniency, you can:
-1. Add more ignore flags in `fastwoe/scripts/type_check.py`
-2. Exclude more files in the pyrefly configuration
+1. Add more ignore flags in the ty configuration in `pyproject.toml`
+2. Exclude more files in the ty configuration

 ## Summary
