fix: clean denaturant blacklist and fix exp-method regex #4

Draft
tsenoner wants to merge 11 commits into develop from fix/blacklist-cleanup

@tsenoner tsenoner commented Mar 5, 2026

Summary

  • Remove 9 non-denaturant items from the strict chemical-denaturants list: reducing agents (BME, 2-ME, mercaptoethanol, DTT, dithiothreitol), an NMR reference standard (dss), and NMR buffers (acetic acid, CD3COOH, deuterated sodium acetate)
  • Fix exp-method-blacklist by removing the "state" substring from the regex list; it incorrectly matched "solution-state" and "liquid-state" entries and so excluded valid NMR data
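
The substring problem behind the second fix can be reproduced in a few lines. The sketch below assumes the blacklist entries are applied as unanchored, case-insensitive regex searches; the pattern lists and the `is_blacklisted` helper are hypothetical stand-ins, not the actual TriZOD configuration or code.

```python
import re

# Hypothetical pattern lists; the real blacklist entries may differ.
OLD_PATTERNS = ["solid", "state"]
NEW_PATTERNS = ["solid"]


def is_blacklisted(method: str, patterns: list[str]) -> bool:
    """Return True if any pattern matches anywhere in the method string."""
    return any(re.search(p, method, flags=re.IGNORECASE) for p in patterns)


methods = ["solution-state NMR", "liquid-state NMR", "solid-state NMR"]

# With "state" in the list, every entry is excluded, including solution NMR.
excluded_before = [m for m in methods if is_blacklisted(m, OLD_PATTERNS)]

# Without it, only the solid-state entry is excluded.
excluded_after = [m for m in methods if is_blacklisted(m, NEW_PATTERNS)]

print(excluded_before)
print(excluded_after)
```

Because `re.search` matches anywhere in the string, a short pattern such as "state" behaves like a substring test, which is why it caught the solution-state and liquid-state entries as well.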

Test plan

  • Verify the strict-tier chemical-denaturants list retains only actual denaturants: guanidin, GdmCl, Gdn-Hcl, urea, TFA, trifluoroethanol, Potassium Pyrophosphate
  • Verify exp-method-blacklist filters solid but not solution-state or liquid-state
  • Run ruff lint and format checks
  • Run full trizod pipeline to confirm no regressions
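
The first check in the test plan could be automated along these lines. This is only a sketch: the entry names are taken from the description above, while `check_strict_list` and the idea of passing the list in as plain strings are assumptions, since how TriZOD stores the list is not shown here.

```python
# Entries named in this PR description.
REMOVED = [
    "BME", "2-ME", "mercaptoethanol", "DTT", "dithiothreitol",
    "dss", "acetic acid", "CD3COOH", "deuterated sodium acetate",
]
EXPECTED_STRICT = [
    "guanidin", "GdmCl", "Gdn-Hcl", "urea", "TFA",
    "trifluoroethanol", "Potassium Pyrophosphate",
]


def check_strict_list(strict: list[str]) -> tuple[list[str], list[str]]:
    """Return (leftover, missing) for a candidate strict-tier list.

    leftover: removed items that are still present.
    missing: expected denaturants that are absent.
    Comparison is case-insensitive (an assumption).
    """
    lowered = {entry.lower() for entry in strict}
    leftover = sorted(item for item in REMOVED if item.lower() in lowered)
    missing = sorted(item for item in EXPECTED_STRICT if item.lower() not in lowered)
    return leftover, missing


# A list that still contains DTT is flagged; the cleaned list passes.
print(check_strict_list(EXPECTED_STRICT + ["DTT"]))
print(check_strict_list(EXPECTED_STRICT))
```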

Addresses expert suggestions #3 and #5.

🤖 Generated with Claude Code

tsenoner and others added 11 commits November 28, 2025 14:48
- Remove duplicate entries (*.txt was listed twice)
- Organize into logical sections with comments
- Add exception for trizod/potenci/data/ directory
- Use proper gitignore patterns (directories with trailing /)
- POTENCI data files are now included as required dependencies
- Add 6 CSV tables extracted from inline strings
- Add comprehensive README documenting data format
- Data sourced from Nielsen & Mulder (2018) POTENCI algorithm

Files added:
- tablecent.csv: Central residue chemical shifts
- tablenei.csv: Neighbor residue corrections
- tabletermcorrs.csv: Terminal corrections
- tabletempk.csv: Temperature coefficients
- tablecombdevs.csv: Combinatorial deviations
- tablephshifts.csv: pH-dependent shifts
- README.md: Comprehensive documentation
- Add comprehensive type hints (ShiftDict, CorrectionDict, etc.)
- Replace unsafe eval() calls with safe float conversion
- Implement CSV-based data loading with caching
- Add PhysicalConstants dataclass
- Remove all backward compatibility wrappers
- Update module docstring with academic references

Security: Eliminates eval() vulnerability
Performance: Cached data loading with lru_cache
Maintainability: Type-safe, well-documented API
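
A minimal sketch of the two ideas in this commit, replacing eval() with float() and caching parsed CSV tables with lru_cache. The table contents, the column names, and the `to_float` and `load_table` helpers are hypothetical; the actual loaders in the package may look different.

```python
import csv
import io
from functools import lru_cache

# Hypothetical in-memory table standing in for a CSV file on disk.
_TABLES = {"sample": "residue,shift\nA,8.24\nG,8.33\n"}


def to_float(text: str) -> float:
    """Convert a numeric string without executing it, unlike eval()."""
    return float(text)  # raises ValueError for anything that is not a number


@lru_cache(maxsize=None)
def load_table(name: str) -> dict:
    """Parse a two-column CSV table once; later calls return the cached dict."""
    reader = csv.DictReader(io.StringIO(_TABLES[name]))
    return {row["residue"]: to_float(row["shift"]) for row in reader}


first = load_table("sample")
second = load_table("sample")  # served from the cache, no re-parsing
print(first, first is second)
```

One consequence of this design is that every caller shares the same cached dict, so callers should treat the returned table as read-only.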
- Add comprehensive module docstring
- Fix typo: logging.waring() to logging.warning()
- Update outdated comments (python2.x to python3.10+)
- Replace ##-style comments with proper documentation
- Update to use new constants API (PHYSICAL_CONSTANTS, load_* functions)
- Improve CLI documentation in main()
- Export modern API: load_central_shifts, PHYSICAL_CONSTANTS, etc.
- Remove legacy exports: R, a, b, cutoff, e, ncycles
- Add module docstring
- Update __all__ for clean public API
- Remove setup.py (replaced by pyproject.toml)
- Add uv.lock for reproducible dependencies
- Configure hatchling to include potenci/data files
- Update build system to use modern Python packaging standards

Migration: setup.py → pyproject.toml + uv
Build backend: hatchling
Lock file: uv.lock for reproducibility
- Run ruff check --fix --unsafe-fixes on all modules
- Apply ruff format for consistent code style
- Fix import ordering, comparison operators, nested if statements
- Remove unused variables and imports
- Add explicit exception handling (no bare excepts)
- Rename functions to follow snake_case convention:
  - get_pH → get_ph
  - convChi2CDF → conv_chi2_cdf
  - get_offset_corrected_wSCS → get_offset_corrected_wscs
- Rename exceptions to follow Error suffix convention:
  - Found → FoundError
  - OffsetTooLargeException → OffsetTooLargeError
  - OffsetCausedFilterException → OffsetCausedFilterError
  - FilterException → FilterError
- Fix lambda loop variable binding issue
- Add exception chaining with "from e"
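
The last two items are general Python patterns, illustrated below. The snippet is a generic sketch, not code from this repository: `parse_ph` is a hypothetical helper, and `FilterError` is redefined locally only so the example is self-contained.

```python
# Late binding: each lambda looks up `i` when called, after the loop has finished.
late = [lambda: i for i in range(3)]

# Binding the current value as a default argument freezes it per lambda.
bound = [lambda i=i: i for i in range(3)]

print([f() for f in late])   # all three return the final value of i
print([f() for f in bound])  # each returns its own value


class FilterError(Exception):
    """Local stand-in for the renamed exception."""


def parse_ph(raw: str) -> float:
    """Hypothetical helper showing exception chaining with 'from e'."""
    try:
        return float(raw)
    except ValueError as e:
        # 'from e' keeps the original error available as __cause__.
        raise FilterError(f"invalid pH value: {raw!r}") from e
```

With chaining, the traceback shows both the FilterError and the underlying ValueError, which makes filtered entries easier to diagnose.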
…equires-python

- Replace np.float (deprecated in NumPy 1.20, removed in 1.24) with np.float64
- Replace eval() with float() in read_csv_pkaoutput (security fix)
- Fix argument count mismatch in getpredshifts_arr -> getphcorrs_arr call
- Remove dead code: unused log_fun(), no-op str(i+1) statements
- Bump requires-python from >=3.8 to >=3.9 (BooleanOptionalAction, dict |)
- Exclude test/ from ruff config (pre-existing issues, not part of package)
- Add implementation plan document for BMRB expert suggestions
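
The two features named as the reason for the requires-python bump both arrived in Python 3.9. A short sketch of each follows; the `--cache` flag and the dictionary keys are hypothetical and are not taken from the TriZOD command line.

```python
import argparse

parser = argparse.ArgumentParser()
# BooleanOptionalAction (Python 3.9+) generates both --cache and --no-cache.
parser.add_argument("--cache", action=argparse.BooleanOptionalAction, default=True)

args = parser.parse_args(["--no-cache"])
print(args.cache)

# The dict union operator (Python 3.9+, PEP 584); right-hand values win on conflicts.
defaults = {"tier": "strict", "verbose": False}
overrides = {"verbose": True}
merged = defaults | overrides
print(merged)
```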

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace blanket *.csv/*.txt/*.zip patterns with specific directory
rules for clarity and transparency.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove 9 non-denaturant items (reducing agents, NMR reference standard,
buffers) from strict chemical-denaturants list. Fix exp-method-blacklist
by removing 'state' substring which incorrectly matched 'solution-state'
and 'liquid-state' entries.

Addresses expert suggestions #3 and #5.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@tsenoner tsenoner marked this pull request as draft March 10, 2026 13:41