Fix edge cases, add validation, and implement CI by econbernardo · Pull Request #3 · econbernardo/BCaOLSpy

econbernardo · 2026-02-01T08:47:24Z

Bug fixes: Crash on degenerate bootstrap, progress bar count, BCa division by zero

New features: random_state, cov_type, compute_all_bca(), special character support

Quality: Input validation, workflow checks, 42 pytest tests, GitHub Actions CI (Python 3.9–3.12)

When all bootstrap estimates fall on one side of beta_hat, p_star becomes 0 or 1, causing norm.ppf() to return -inf or +inf. This propagated through the BCa/BC formulas and caused np.quantile() to crash with:

"ValueError: Quantiles must be in the range [0, 1]"

Added an explicit check for p_star == 0 or p_star == 1 in both the BCa and BC branches. It raises a clear ValueError that explains the issue and suggests increasing n_sim or checking for data issues.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
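A minimal sketch of the guard described above, not the code in this PR. It assumes p_star is the share of bootstrap estimates strictly below beta_hat, uses a hypothetical helper name (bias_correction_z0), and swaps in the standard library's NormalDist for scipy's norm.ppf to stay dependency-free.

```python
from statistics import NormalDist


def bias_correction_z0(boot_estimates, beta_hat):
    """Return z0 = Phi^{-1}(p_star), refusing degenerate bootstrap draws.

    Hypothetical helper; the definition of p_star is an assumption.
    """
    n = len(boot_estimates)
    if n == 0:
        raise ValueError("boot_estimates is empty")
    # Assumed definition: fraction of bootstrap estimates below beta_hat.
    p_star = sum(b < beta_hat for b in boot_estimates) / n
    # p_star of exactly 0 or 1 would map to -inf/+inf, so fail early.
    if p_star == 0.0 or p_star == 1.0:
        raise ValueError(
            "All bootstrap estimates fall on one side of beta_hat "
            f"(p_star={p_star}); increase n_sim or check the data."
        )
    return NormalDist().inv_cdf(p_star)
```

Failing before the inverse CDF is evaluated keeps the infinite value out of the later quantile computation, which is where the original crash surfaced.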

The tqdm progress bar was showing "999/999" for n_sim=1000 because the first iteration was done outside the loop. Added total=n_sim and initial=1 parameters to tqdm so it correctly displays "1000/1000" at completion. Applied same fix to jackknife progress bar. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Added validation in __init__:
- df: must be a non-empty pandas DataFrame
- dependent_var: must be a string present in df columns
- independent_vars: must be a non-empty list of strings, all in df columns
- alpha: must be a number strictly between 0 and 1

Added validation in perform_bootstrap:
- n_sim: must be a positive integer

All validations raise TypeError or ValueError with clear messages.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
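The scalar checks listed above could look roughly like this. The helper names (validate_alpha, validate_n_sim) are hypothetical, and rejecting bool values is an assumption of this sketch rather than something the PR states.

```python
import numbers


def validate_alpha(alpha):
    """Check that alpha is a real number strictly between 0 and 1."""
    # bool is a subclass of int, so it is rejected explicitly (assumption).
    if isinstance(alpha, bool) or not isinstance(alpha, numbers.Real):
        raise TypeError(f"alpha must be a number, got {type(alpha).__name__}")
    if not 0 < alpha < 1:
        raise ValueError(f"alpha must be strictly between 0 and 1, got {alpha}")
    return float(alpha)


def validate_n_sim(n_sim):
    """Check that n_sim is a positive integer."""
    if isinstance(n_sim, bool) or not isinstance(n_sim, numbers.Integral):
        raise TypeError(f"n_sim must be an integer, got {type(n_sim).__name__}")
    if n_sim <= 0:
        raise ValueError(f"n_sim must be positive, got {n_sim}")
    return int(n_sim)
```

Type problems raise TypeError and range problems raise ValueError, matching the split described in the commit message.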

Use Unicode block characters " ▖▘▝▗▚▞█" as the custom tqdm bar characters for a smoother progress bar appearance in both bootstrap and jackknife methods. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Added random_state parameter to __init__ that accepts:
- None (default): uses a new numpy Generator with random seed
- int: seeds a new numpy Generator for reproducibility
- numpy.random.Generator: uses the provided generator directly

Bootstrap sampling now uses the internal RNG (self._rng) instead of the global numpy random state, enabling reproducible results when the same random_state is provided.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
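One way the three accepted random_state forms can be normalized into a Generator, shown as a sketch. make_rng and bootstrap_indices are hypothetical names, and the index-based resampling is an assumption about how the bootstrap draws rows.

```python
import numpy as np


def make_rng(random_state=None):
    """Map None, an int seed, or a Generator to a numpy Generator."""
    if isinstance(random_state, np.random.Generator):
        # Use the caller's generator directly so its state is shared.
        return random_state
    if random_state is None or isinstance(random_state, (int, np.integer)):
        # None draws fresh entropy; an int gives a reproducible stream.
        return np.random.default_rng(random_state)
    raise TypeError(
        "random_state must be None, an int, or a numpy.random.Generator"
    )


def bootstrap_indices(rng, n_obs):
    """Draw row positions with replacement for one bootstrap sample."""
    return rng.integers(0, n_obs, size=n_obs)
```

Two generators built from the same integer seed produce the same index draws, which is what makes repeated runs comparable.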

Removed .tolist() conversion in perform_bootstrap() and perform_jackknife(). Distributions are now stored as dictionaries mapping variable names to numpy arrays instead of Python lists. This improves memory efficiency for large n_sim values by avoiding Python object overhead per element. The bca_estimate() method already uses np.asarray() so it works seamlessly with numpy arrays. Updated docstrings to reflect the new return types. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Added RuntimeError checks to ensure proper method call order:
- perform_bootstrap() now requires run_regression() to be called first
- perform_jackknife() now requires run_regression() to be called first

This prevents confusing errors when methods are called out of order and makes the expected workflow explicit.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
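The call-order check can be illustrated with a stripped-down stand-in class. WorkflowSketch, the _fitted flag, and _require_regression are hypothetical; only the public method names come from this PR, and the method bodies are placeholders.

```python
class WorkflowSketch:
    """Toy stand-in showing the call-order guard, not the real class."""

    def __init__(self):
        self._fitted = False  # hypothetical flag set by run_regression()

    def run_regression(self):
        # The real method would fit the OLS model here.
        self._fitted = True

    def _require_regression(self, caller):
        if not self._fitted:
            raise RuntimeError(
                f"{caller}() requires run_regression() to be called first."
            )

    def perform_bootstrap(self):
        self._require_regression("perform_bootstrap")
        return "bootstrap done"  # placeholder result

    def perform_jackknife(self):
        self._require_regression("perform_jackknife")
        return "jackknife done"  # placeholder result
```

Routing both methods through one private check keeps the error message consistent and makes it easy to add further prerequisites later.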

New method that computes BCa confidence intervals for all regression coefficients at once, returning a pandas DataFrame with columns:
- coef: original OLS coefficient
- bias_corrected: bias-corrected coefficient
- ci_low: lower bound of confidence interval
- ci_high: upper bound of confidence interval

Includes workflow validation (requires run_regression, perform_bootstrap, and perform_jackknife to be called first). Supports all CI types (BCa, BC, perc).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Added cov_type parameter to __init__ that accepts 'HC0', 'HC1', 'HC2', or 'HC3' for different heteroscedasticity-consistent covariance estimators. Default remains 'HC0' for backward compatibility. HC1-HC3 provide small-sample corrections that typically result in wider confidence intervals, which can be more appropriate for smaller datasets. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The BCa formula computes alpha values using:

alpha = norm.cdf(z0 + (z + z0) / (1 - ahat * (z + z0)))

When ahat * (z + z0) equals 1, the denominator is zero. This can happen with extremely skewed jackknife distributions (for example |ahat| ≈ 0.51 at the 95% level with z0 near 0, since 1/1.96 ≈ 0.51). Added an explicit check for zero denominators with a clear error message suggesting CI_type='BC' or 'perc' as alternatives. Also refactored to compute the denominators once and reuse them.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
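A sketch of the adjusted-percentile computation with the denominator guard. bca_alphas is a hypothetical name, the tolerance-based comparison (tol) is an assumption in place of an exact zero test, and the standard library's NormalDist stands in for scipy's norm.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def bca_alphas(z0, ahat, alpha=0.05, tol=1e-12):
    """Return the (lower, upper) adjusted percentiles for a BCa interval.

    Hypothetical helper; tol is an assumed near-zero threshold.
    """
    z_lo = _STD_NORMAL.inv_cdf(alpha / 2)
    z_hi = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    adjusted = []
    for z in (z_lo, z_hi):
        # Compute each denominator once and reuse it below.
        denom = 1 - ahat * (z + z0)
        if abs(denom) < tol:
            raise ValueError(
                "BCa denominator is zero (ahat * (z + z0) == 1); "
                "consider CI_type='BC' or 'perc' instead."
            )
        adjusted.append(_STD_NORMAL.cdf(z0 + (z + z0) / denom))
    return tuple(adjusted)
```

With z0 = 0 and ahat = 0 the adjustment vanishes and the result reduces to the plain percentile levels alpha/2 and 1 - alpha/2.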

Variable names containing special characters (spaces, operators, parentheses, etc.) could break the statsmodels formula parser or cause unexpected behavior. Added quote_if_needed() helper that wraps variable names in Q() only when they contain problematic characters. Normal variable names remain unquoted for backward compatibility. Supported special characters: space, +, -, *, /, (, ), [, ], {, }, :, ~, ^ Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
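A rough sketch of the quoting rule described above. The name quote_if_needed comes from the commit message, but its signature and body here are assumptions, and build_formula is a hypothetical helper added for illustration. Q() is the patsy quoting function that statsmodels formulas accept.

```python
# Characters listed in the commit message as needing quoting.
SPECIAL_CHARS = set(" +-*/()[]{}:~^")


def quote_if_needed(name):
    """Wrap name in Q("...") only if it contains a special character."""
    if any(ch in SPECIAL_CHARS for ch in name):
        # Assumption: names do not contain double quotes or backslashes,
        # which this sketch does not escape.
        return f'Q("{name}")'
    return name


def build_formula(dependent_var, independent_vars):
    """Assemble a formula string such as 'y ~ x1 + Q("log income")'."""
    rhs = " + ".join(quote_if_needed(v) for v in independent_vars)
    return f"{quote_if_needed(dependent_var)} ~ {rhs}"
```

Leaving ordinary names unquoted means formulas built from plain column names look the same as before the change.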

- Add comprehensive test suite with 42 tests covering:
  - Input validation (15 tests)
  - Workflow enforcement (5 tests)
  - Basic functionality (5 tests)
  - compute_all_bca method (4 tests)
  - Reproducibility with random_state (3 tests)
  - cov_type parameter (2 tests)
  - Edge cases (4 tests)
  - Special characters in variable names (4 tests)
- Add GitHub Actions workflow that:
  - Runs on push/PR to main
  - Tests Python 3.9, 3.10, 3.11, 3.12
  - Generates coverage reports
- Add pyproject.toml with pytest configuration

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

econbernardo · 2026-02-01T08:48:18Z

waiting for checks to pass

econbernardo · 2026-02-01T21:13:32Z

Go over examples before merging

econbernardo and others added 13 commits January 31, 2026 23:51

Add custom ASCII characters for tqdm progress bar

a3cc6f8


Update .gitignore

deaf349

econbernardo wants to merge 13 commits into main from fix/edge-cases-and-validation
