* update version for new development branch
* Add local testing scripts
* Refactor metrics computation to eliminate np.vectorize overhead and fix double computation bug
* Update Pylint configuration to suppress Cython import errors and ignore benchmark directories
* perf(metrics): optimize Levenshtein DP loop for 7-14% speedup
* perf(metrics): add fast path without word tracking for speedup in wer/wers/werp/werps functions
* docs(changelog): document dual-path architecture, performance improvements, and exception handling fixes
CHANGELOG.md (27 additions & 0 deletions)
@@ -4,6 +4,33 @@
This changelog file outlines a chronologically ordered list of the changes made on this project.
It is organized by version and release date followed by a list of Enhancements, New Features, Bug Fixes, and/or Breaking Changes.

## Version 3.2.0

**Released:** December 15, 2025

**Tag:** v3.2.0

### Enhancements

- Refactored metrics computation architecture to eliminate `np.vectorize()` overhead by replacing it with a C-level batch processing loop (`_metrics_batch()`). This provides cleaner code structure and establishes a foundation for future performance optimizations without introducing any performance regression.
- Fixed a double-computation bug in which `error_handler()` called `metrics()` for validation and the wrapper functions then computed the metrics a second time. The `error_handler()` function is now validation-only, and all metric calculations happen through a single unified `metrics()` entry point, improving efficiency and code maintainability.
- Standardized the internal metrics return format to a row-based `(n, 9)` array instead of a columnar format. This simplifies DataFrame construction in the `summary()` and `summaryp()` functions by eliminating complex transpose operations.
- Improved code organization with a unified `metrics()` router function that dispatches to either the single-pair `calculations()` function or the batch `_metrics_batch()` function, providing a cleaner and more maintainable architecture for metric computation.
- Updated Pylint configuration to suppress import errors for Cython modules during static analysis and exclude benchmark/development directories from linting. This resolves CI/CD build failures while maintaining code quality standards for the core package.
- Optimized Levenshtein distance algorithm in `calculations()` function with C-level performance improvements: replaced `np.zeros()` with `np.empty()` to eliminate redundant initialization, moved boundary condition initialization outside the main DP loop to remove conditional branches from the hot path, and replaced Python's `min()` function with manual C-level sequential comparisons.
- Implemented a dual-path architecture with a fast path for functions that do not require word tracking. Added three new functions (`calculations_fast()`, `_metrics_batch_fast()`, `metrics_fast()`) that skip word-list construction and return float64 arrays instead of object arrays. Updated the `wer()`, `wers()`, `werp()`, and `werps()` functions to use the fast path, which improved performance on synthetic benchmarks. Functions that require word tracking (`summary()` and `summaryp()`) continue to use the full path.
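
The Levenshtein optimizations listed above can be illustrated with a minimal pure-Python sketch. It is not the package's Cython code: the function name `word_edit_distance` is hypothetical, and the sketch only mirrors the three described ideas (an uninitialized `np.empty()` table, boundary rows filled before the main loop, and sequential comparisons in place of `min()`) while skipping word tracking, as the fast path is described as doing.

```python
import numpy as np


def word_edit_distance(reference: str, hypothesis: str) -> int:
    """Word-level Levenshtein distance (hypothetical helper, not the package API)."""
    ref = reference.split()
    hyp = hypothesis.split()
    m, n = len(ref), len(hyp)

    # np.empty skips zero-filling; every cell is written before it is read.
    dp = np.empty((m + 1, n + 1), dtype=np.int64)

    # Boundary conditions are set once, outside the main DP loop,
    # so the inner loop contains no "is this the first row/column" branch.
    dp[:, 0] = np.arange(m + 1)
    dp[0, :] = np.arange(n + 1)

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            # Sequential comparisons stand in for a call to min().
            best = dp[i - 1, j - 1] + cost  # match or substitution
            deletion = dp[i - 1, j] + 1
            if deletion < best:
                best = deletion
            insertion = dp[i, j - 1] + 1
            if insertion < best:
                best = insertion
            dp[i, j] = best

    # No word lists are built: only the distance is returned.
    return int(dp[m, n])
```

In pure Python these changes make little difference; the benefit described in the changelog comes from applying them inside a compiled Cython loop.
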

### Bug Fixes

- Expanded the `try`/`except` scope in all wrapper functions (`wer.py`, `wers.py`, `werp.py`, `werps.py`, `summary.py`, `summaryp.py`) to catch exceptions from both validation (`error_handler()`) and computation (`metrics()`/`metrics_fast()`). This fixes 6 pre-existing test failures where invalid input types (e.g., lists of integers) would crash instead of returning `None` with an error message.
- Added division-by-zero guards in `calculations_fast()` function (`wer = (<double>ld) / m if m > 0 else 0.0`) and corpus-level wrapper functions (`wer.py`, `werp.py`) to prevent crashes on empty input. Also added per-row masked division in `werps.py` to handle cases where individual samples have zero reference length.
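
The division-by-zero guards described above could look roughly like the following NumPy sketch. The function names `corpus_wer` and `per_row_wer` are hypothetical and do not come from the package; returning `0.0` for rows with zero reference length is an assumption that mirrors the guard quoted for `calculations_fast()`.

```python
import numpy as np


def corpus_wer(distances, ref_lengths) -> float:
    """Corpus-level WER with a guard for an empty reference (hypothetical helper)."""
    total = float(np.sum(ref_lengths))
    return float(np.sum(distances)) / total if total > 0 else 0.0


def per_row_wer(distances, ref_lengths) -> np.ndarray:
    """Per-sample WER using masked division (hypothetical helper)."""
    ld = np.asarray(distances, dtype=np.float64)
    m = np.asarray(ref_lengths, dtype=np.float64)
    # Rows where m == 0 are skipped by the mask and keep the 0.0 default
    # (assumed fallback value, matching the scalar guard).
    out = np.zeros_like(ld)
    np.divide(ld, m, out=out, where=m > 0)
    return out
```

Passing `out=` together with `where=` matters here: without a pre-filled output array, the masked-out entries of `np.divide` would be left uninitialized.
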