Skip to content

Commit bf96e91

Browse files
committed
Release v0.1.5: stable release
1 parent f424b06 commit bf96e91

6 files changed

Lines changed: 829 additions & 568 deletions

File tree

.github/workflows/release.yml

Lines changed: 59 additions & 50 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ on:
77
workflow_dispatch:
88
inputs:
99
version:
10-
description: 'Version to release (e.g., 0.1.0)'
10+
description: 'Version to release (e.g., 0.1.5)'
1111
required: true
1212
type: string
1313

@@ -65,52 +65,61 @@ jobs:
6565
name: dist
6666
path: dist/
6767

68-
# publish:
69-
# needs: build
70-
# runs-on: ubuntu-latest
71-
# permissions:
72-
# id-token: write # IMPORTANT: this permission is mandatory for trusted publishing
73-
# steps:
74-
# - name: Download build artifacts
75-
# uses: actions/download-artifact@v4
76-
# with:
77-
# name: dist
78-
# path: dist/
79-
#
80-
# - name: Publish to PyPI
81-
# uses: pypa/gh-action-pypi-publish@release/v1
82-
# # Uncomment when ready to publish to PyPI
83-
# # with:
84-
# # password: ${{ secrets.PYPI_API_TOKEN }}
85-
# # Or use trusted publishing (recommended):
86-
# # with:
87-
# # repository-url: https://upload.pypi.org/legacy/
88-
89-
# create-release:
90-
# needs: build
91-
# runs-on: ubuntu-latest
92-
# permissions:
93-
# contents: write
94-
# steps:
95-
# - uses: actions/checkout@v4
96-
# with:
97-
# fetch-depth: 0
98-
99-
# - name: Create GitHub Release
100-
# uses: ncipollo/release-action@v1
101-
# with:
102-
# tag: ${{ github.ref_name }}
103-
# name: Release ${{ github.ref_name }}
104-
# body: |
105-
# ## Changes
106-
107-
# - See [CHANGELOG.md](CHANGELOG.md) for detailed changes
108-
109-
# ## Installation
110-
111-
# ```bash
112-
# pip install fastwoe==${{ github.ref_name }}
113-
# ```
114-
# draft: false
115-
# prerelease: false
116-
# token: ${{ secrets.GITHUB_TOKEN }}
68+
publish:
69+
needs: build
70+
runs-on: ubuntu-latest
71+
steps:
72+
- name: Download build artifacts
73+
uses: actions/download-artifact@v4
74+
with:
75+
name: dist
76+
path: dist/
77+
78+
- name: Publish to PyPI
79+
uses: pypa/gh-action-pypi-publish@release/v1
80+
with:
81+
password: ${{ secrets.PYPI_API_TOKEN }}
82+
83+
create-release:
84+
needs: build
85+
runs-on: ubuntu-latest
86+
permissions:
87+
contents: write
88+
steps:
89+
- uses: actions/checkout@v4
90+
with:
91+
fetch-depth: 0
92+
93+
- name: Extract version from tag
94+
id: get_version
95+
run: |
96+
if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
97+
echo "version=${{ github.event.inputs.version }}" >> $GITHUB_OUTPUT
98+
echo "tag=v${{ github.event.inputs.version }}" >> $GITHUB_OUTPUT
99+
else
100+
echo "version=${GITHUB_REF#refs/tags/v}" >> $GITHUB_OUTPUT
101+
echo "tag=${GITHUB_REF#refs/tags/}" >> $GITHUB_OUTPUT
102+
fi
103+
104+
- name: Create GitHub Release
105+
uses: ncipollo/release-action@v1
106+
with:
107+
tag: ${{ steps.get_version.outputs.tag }}
108+
name: Release ${{ steps.get_version.outputs.tag }}
109+
body: |
110+
## Changes
111+
112+
See [CHANGELOG.md](https://github.com/xRiskLab/fastwoe/blob/main/CHANGELOG.md) for detailed changes.
113+
114+
## Installation
115+
116+
```bash
117+
pip install fastwoe==${{ steps.get_version.outputs.version }}
118+
```
119+
120+
## What's New in v${{ steps.get_version.outputs.version }}
121+
122+
Check the [CHANGELOG](https://github.com/xRiskLab/fastwoe/blob/main/CHANGELOG.md#version-${{ steps.get_version.outputs.version }}) for full details.
123+
draft: false
124+
prerelease: false
125+
token: ${{ secrets.GITHUB_TOKEN }}

CHANGELOG.md

Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,29 @@
11
# Changelog
22

3+
## Version 0.1.5 (2024-12-09)
4+
5+
**Performance Fix & Code Cleanup**: Eliminated DataFrame fragmentation warning and removed debug statements
6+
7+
- **Bug Fixes**:
8+
- **DataFrame Fragmentation Warning**: Fixed `PerformanceWarning: DataFrame is highly fragmented` in `transform()` method
9+
- Root cause: Iteratively adding columns to DataFrame with `woe_df[col] = woe_values` caused memory fragmentation
10+
- Solution: Collect all WOE columns in a dictionary first, then create DataFrame in one operation
11+
- Performance improvement: Eliminates repeated memory reallocation during transform
12+
- User impact: No more annoying performance warnings when transforming data
13+
- **Debug Print Statement**: Removed leftover debug `print("FAISS is available:", faiss)` statement in FAISS KMeans binning
14+
- Cleaned up console output when using `binning_method='faiss_kmeans'`
15+
- **Code Quality**: Improved transform method efficiency following pandas best practices
16+
17+
- **Technical Details**:
18+
- Changed from: `for col in columns: woe_df[col] = values` (causes fragmentation)
19+
- Changed to: `woe_columns = {col: values for col in columns}; woe_df = pd.DataFrame(woe_columns)` (single allocation)
20+
- Follows pandas recommendation to use `pd.concat(axis=1)` or dict-based DataFrame construction
21+
22+
- **Testing**:
23+
- All 102 tests passing successfully ✅
24+
- Verified no fragmentation warnings with multi-feature datasets
25+
- Backward compatible: transform output unchanged
26+
327
## Version 0.1.5rc1 (2025-10-26)
428

529
**Clean API Refactoring & Pythonic Input Handling**: Release candidate with major UX improvements

examples/fastwoe_multiclass.ipynb

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -10,9 +10,10 @@
1010
"Author: https://www.github.com/xRiskLab\n",
1111
"\n",
1212
"This notebook demonstrates how to use FastWoe with a multiclass target in a row-level format. The target has three classes:\n",
13+
"\n",
1314
"- `0`: No Default\n",
1415
"- `1`: UTP Default\n",
15-
"- `2`: DPD Default"
16+
"- `2`: DPD Default\n"
1617
]
1718
},
1819
{
@@ -400,7 +401,7 @@
400401
],
401402
"metadata": {
402403
"kernelspec": {
403-
"display_name": "fastwoe",
404+
"display_name": ".venv (3.11.8)",
404405
"language": "python",
405406
"name": "python3"
406407
},

fastwoe/__init__.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,7 @@
1313
from .fastwoe import FastWoe, WoePreprocessor
1414
from .interpret_fastwoe import WeightOfEvidence
1515

16-
__version__ = "0.1.5rc1"
16+
__version__ = "0.1.5"
1717
__author__ = "xRiskLab"
1818
__email__ = "contact@xrisklab.ai"
1919

fastwoe/fastwoe.py

Lines changed: 7 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1009,6 +1009,8 @@ def transform(self, X: Union[pd.DataFrame, np.ndarray, pd.Series]) -> pd.DataFra
10091009
if mask_missing.any():
10101010
X_processed.loc[mask_missing, col] = "Missing"
10111011

1012+
# Collect all WOE columns first to avoid DataFrame fragmentation
1013+
woe_columns = {}
10121014
for col in X_processed.columns:
10131015
if self.is_multiclass_target:
10141016
return self._transform_multiclass(X_processed)
@@ -1027,7 +1029,11 @@ def transform(self, X: Union[pd.DataFrame, np.ndarray, pd.Series]) -> pd.DataFra
10271029
# Handle unseen categories - use default WOE or prior
10281030
woe_values.append(0.0) # or np.log(self.odds_prior_) for prior
10291031

1030-
woe_df[col] = woe_values
1032+
woe_columns[col] = woe_values
1033+
1034+
# Create DataFrame from dict to avoid fragmentation warning
1035+
if woe_columns:
1036+
woe_df = pd.DataFrame(woe_columns, index=X.index)
10311037

10321038
return woe_df
10331039

@@ -1820,8 +1826,6 @@ def _bin_with_faiss_kmeans(
18201826
"""Apply FAISS KMeans clustering to a numerical feature."""
18211827
try:
18221828
import faiss # noqa: F401
1823-
1824-
print("FAISS is available:", faiss)
18251829
except ImportError as e:
18261830
raise ImportError(
18271831
"FAISS is required for faiss_kmeans binning method. "

0 commit comments

Comments
 (0)