LOD & LOQ calculator for mass-spectrometry calibration curves.
Current version: v0.3.0. New default model: PiecewiseCF (closed-form knot search, with speedup over PiecewiseWLS).
loqculate v0.3.0 reimplements and extends the Pino 2020 LOD/LOQ scripts:
-
Closed-form knot search (
PiecewiseCF): fits$y = \max(c, ax + b)$ by exhaustively searching all interior concentration values as candidate partition boundaries. No optimizer, no convergence failures, no initial-guess sensitivity. 15.8× full-pipeline speedup overPiecewiseWLS. Default since v0.3.0. - A sliding-window LOQ rule that reduces FDR from ~100% to <5%: LOQ is declared at
$C_i$ only when$\mathrm{CV}(C_i)$ ,$\mathrm{CV}(C_{i+1})$ , and$\mathrm{CV}(C_{i+2})$ all fall belowcv_thresh. Three consecutive passing concentrations, not one. -
11.4× lower memory per peptide vs the original scripts (VmRSS, 1 000 peptide batch). Batch throughput is 3.3× faster at 10 000 peptides due to reduced GC pressure (measured in
comparison_report.ipynbagainstOriginalWLS) - EmpiricalCV as a model-free alternative to PiecewiseWLS. Fast, but FDR is higher at low replicate counts (n < 5).
-
loqculate.compat: drop-inOriginalWLSandOriginalCVwrappers that reproduce paper results without the legacy scripts. Registered inMODEL_REGISTRYand usable via--model original_wls. - Bootstrap guard preventing infinite loops on zero-signal replicates
- Support for 6 MS input formats: DIA-NN report, DIA-NN pr_matrix, Spectronaut, Skyline, EncyclopeDIA, and generic CSV
- Parallel processing via
ProcessPoolExecutorfor whole-proteome throughput
Install from source (not yet on PyPI):
git clone https://github.com/eneskemalergin/loqculate
cd loqculate
pip install -e ".[dev]"Requires Python ≥ 3.10.
# Fit with default PiecewiseCF model (default since v0.3.0)
loqculate fit data.tsv conc_map.csv
# Fit with PiecewiseWLS model
loqculate fit data.tsv conc_map.csv --model piecewise_wls
# Fit with EmpiricalCV model
loqculate fit data.tsv conc_map.csv --model cv_empirical
# Fit with original Pino 2020 WLS (for paper reproducibility)
loqculate fit data.tsv conc_map.csv --model original_wls
# Fit with original Pino 2020 CV method
loqculate fit data.tsv conc_map.csv --model original_cv
# Compare PiecewiseCF and PiecewiseWLS side-by-side
loqculate compare data.tsv conc_map.csv --models piecewise_cf,piecewise_wlsconc_map.csv maps each raw filename to its calibration concentration:
filename,concentration
sample_1ng_rep1.raw,1.0
sample_1ng_rep2.raw,1.0
sample_5ng_rep1.raw,5.0from loqculate import read_calibration_data, PiecewiseCF, PiecewiseWLS, EmpiricalCV
from loqculate.compat import OriginalWLS, OriginalCV
data = read_calibration_data("data.tsv", "conc_map.csv")
for peptide, x, y in data.iter_peptides():
result = PiecewiseCF().fit(x, y) # default since v0.3.0
print(peptide, result.loq(), result.lod())
# TRF optimizer path
for peptide, x, y in data.iter_peptides():
result = PiecewiseWLS().fit(x, y)
print(peptide, result.loq(), result.lod())
# Reproduce original Pino 2020 results exactly
for peptide, x, y in data.iter_peptides():
m = OriginalWLS().fit(x, y)
print(peptide, m.lod(), m.loq())
# Or via MODEL_REGISTRY
from loqculate import MODEL_REGISTRY
ModelClass = MODEL_REGISTRY['original_wls']| Format | Auto-detected by | Needs conc_map? |
|---|---|---|
| DIA-NN report | File.Name + Precursor.Id columns |
Yes |
| DIA-NN pr_matrix | .pr_matrix.tsv extension |
Yes |
| Spectronaut | R.FileName + PG.ProteinGroups columns |
Yes |
| Skyline | Peptide Sequence + Total Area Fragment columns |
Yes |
| EncyclopeDIA | numFragments column |
Yes |
| Generic CSV | peptide, concentration, area columns |
No (concentration column is embedded) |
conc_map.csv rules:
- Two columns:
filename(basename, no path) andconcentration(numeric, same units as your calibration range). - Filenames that appear in the data but are missing from the map are dropped with a warning. Check your logs if peptide counts appear low.
- All 6 formats except Generic CSV require a map; passing
Noneraises aValueErrorat load time.
Fits a piecewise linear model (noise floor + linear signal) using closed-form weighted least squares with a discrete knot search. Enumerates all interior unique concentration values as candidate partition boundaries and picks the one that minimises total weighted RSS under the constraint set. No optimizer, no initial-guess sensitivity, no convergence tolerance. Constraints (slope >= 0, noise intercept <= linear intercept) are enforced analytically by clamping after solving the normal equations.
Same statistical model as PiecewiseCF. Differs in the solver: uses scipy TRF (curve_fit) with explicit parameter bounds and a legacy initial-guess heuristic (slope from top-two concentration points). Retained for comparison and compatibility. For new analyses, PiecewiseCF is faster (15.8× full pipeline speedup) and finds the globally optimal partition.
Computes CV directly from replicate measurements at each concentration level. LOQ is the lowest concentration where window consecutive CVs fall below cv_thresh. No parametric assumptions. Faster than WLS, but FDR is higher at low replicate counts (n < 5).
cv_thresh (default 0.20). Requiring three consecutive passing concentrations reduces single-point false discoveries from ~100% to <5%.
Verbatim port of old/calculate-loq.py process_peptide(model='piecewise') from Pino 2020. Reproduces the exact original logic: legacy parameter initialisation (slope from top-two concentration points), scipy TRF solver with the same bounds as the original lmfit call, original LOD formula (flat noise floor + SeedSequence(i) for i in range(n_boot), identical to the original script.
Use when you need byte-for-byte reproducibility of published results. For new analyses, prefer PiecewiseCF (15.8× faster full pipeline, globally optimal partitioning).
Verbatim port of old/loq_by_cv.py calculate_LOQ_byCV(). LOQ is the lowest positive concentration where CV ≤ cv_thresh (non-strict ≤, matching the original <= 0.2 check). No LOD, no sliding window, no parametric model. Deterministic and near-instant. Matches EmpiricalCV's speed but FDR is higher due to the single-point rule (64-99% on null curves).
| Parameter | Default | Description |
|---|---|---|
cv_thresh |
0.20 |
CV threshold; LOQ requires CV < this value |
window |
3 |
Consecutive passing concentration points required |
n_boot |
100 |
Bootstrap resamples for LOD/LOQ CI estimation |
std_mult |
2 |
Noise-floor multiplier for LOD ( |
model |
piecewise_cf |
Active model; also accepts piecewise_wls, cv_empirical, original_wls, original_cv
|
Pass these as CLI flags (--cv_thresh 0.15) or keyword arguments to .fit().
Machine: AMD Ryzen 9 3950X, 32 threads. FDR measured on 300 simulated null curves across 4-14 concentration levels (window=3). Throughput measured on real DIA-NN data.
| Metric | PiecewiseCF (v0.3.0) | PiecewiseWLS | EmpiricalCV | OriginalWLS | OriginalCV |
|---|---|---|---|---|---|
| Per-fit, single-threaded | ~0.73 ms | ~1.96 ms | ~0.02 ms | ~1.8 ms | ~0.03 ms |
| Full pipeline vs WLS (n_boot=200) | 15.8× faster | baseline | N/A | N/A | N/A |
| Null FDR, window=3 (range over n_conc 4-14) | 0-2% | 0-2% | 2-18% | 1-2% | 64-99% |
| Memory per peptide (VmRSS) | 0.013 MB | 0.013 MB | ~0 | 0.151 MB | ~0 |
PiecewiseCF replaces the TRF optimizer with a discrete knot search (exhaustive enumeration of interior candidate boundaries). On 27-peptide reference data: 0 convergence failures, globally optimal partition on 13/18 cases where models disagree, and one additional finite-LOD rescue vs PiecewiseWLS. OriginalCV FDR reaches 64-99% due to the single-point rule. Raw benchmark results are in tmp/results/ (JSON); reproduction commands are in benchmarks/.
comparison_report.ipynb is a frozen 12-section validation notebook comparing the original Pino 2020 scripts (OriginalWLS, OriginalCV) to the v0.2.x loqculate models (PiecewiseWLS, EmpiricalCV). It covers FDR and TPR across concentration-level density (4-14 levels) and replicate count (2-20), LOQ accuracy and precision benchmarks, and per-fit speed and throughput scaling up to 10,000 peptides. It does not cover PiecewiseCF or any v0.3.0 changes. A systematic comparison of PiecewiseCF against PiecewiseWLS is tracked in benchmarks/bench_knot_vs_curvefit.py and benchmarks/bench_vectorized_boot.py.
pip install -e ".[dev]"
pre-commit install # install git hooks (runs ruff on every commit)
pytest # 165 passed, 42 skipped
ruff check loqculate/ # lintRun benchmarks (requires ~5 min, tested with 24 threads):
python benchmarks/bench_simulation.py --save tmp/results/bench_simulation.json
python benchmarks/bench_n_concentrations.py --n_workers 24 --save tmp/results/bench_n_concentrations.json
python benchmarks/bench_n_replicates.py --n_workers 24 --save tmp/results/bench_n_replicates.json
python benchmarks/bench_knot_vs_curvefit.py --save tmp/results/bench_knot_vs_curvefit.json
python benchmarks/bench_vectorized_boot.py --save tmp/results/bench_vectorized_boot.jsonA guiding constraint: loqculate is a practical proteomics tool, not a statistics library. Each addition must justify its complexity against the realistic needs of a DIA/PRM lab.
Same piecewise/CV framework, no new model classes. Ends with a pip release after a bug-finding period on real datasets.
- v0.2.2 ✓:
loqculate.compat(OriginalWLS,OriginalCV) wrappers; paper results reproducible without legacy scripts - v0.3.0 ✓: closed-form piecewise solver (
PiecewiseCF). Discrete knot search, no optimizer, 15.8× full-pipeline speedup vs PiecewiseWLS; formalizes$w_i$ vs$W_i$ weight convention;PiecewiseCFis new default - v0.4.0: delta-method uncertainty (
--fastflag, double-dash). Analytical LOQ CIs in$O(1)$ ; smooth models only. Falls back to bootstrap near the piecewise kink (non-differentiable$\max$ ) - v0.5.0: segmented regression (
SegmentedWLS). Continuous piecewise linear with estimated breakpoint$\psi$ , AIC/BIC vs flat-noise piecewise - pip release: after real-data bug-finding marathon. Semantic versioning locked, public API stable for Phase 2
New curve families, model selection, performance, and publication-ready output.
- v0.6.0: 4PL/5PL model (
FourPL):$y = d + (a-d)/[1+(x/c)^b]$ ; LLOQ + ULOQ; FDA bioanalytical guidance compliant; fitted viascipy.optimize.curve_fit(non-linear, unlike the current piecewise linear approach); auto-downgrades to piecewise when upper asymptote is unidentifiable - v0.7.0: model selection (
loqculate.selection): AIC/BIC + approximate LOO-CV (PSIS-LOO, fast) per peptide; auto-recommend with explicit graceful degradation when models fail to converge or AIC penalty blows up on sparse curves - v0.8.0: performance: Numba
@njiton closed-form solver + bootstrap (~3-10×); chunked parquet I/O (pyarrow); shared-memory multiprocessing for 100k+ peptide datasets; adaptive bootstrap early-stopping co-designed with Numba cost model - v1.0.0: stable public API contract. Expanded plotting API (publication-ready diagnostics, waterfall, model-agreement scatter, yield curve); exact LOO-CV and WAIC (requires Bayesian models, deferred to v1.x)
Opt-in, not bundled in core.
- Bayesian piecewise (opt-in extra): MCMC / variational inference via
numpyroorpymc; LOQ from posterior predictive. Scope-limited to targeted panels (≤100 peptides). HMC alone is insufficient for the discrete knot parameter; requires marginalization. -
loqculate-rs(separate package, if needed): Rust bindings for batch computation on very large datasets (500k+ peptides); only warranted if Numba saturation is demonstrated
If you use loqculate in your work, please cite both the original method paper and this software:
Original method (Pino 2020):
Pino, L.K. et al. (2020). Matrix-matched calibration curves for assay characterization in data-independent acquisition proteomics. Analytical Chemistry. https://doi.org/10.1021/acs.analchem.9b04826
This software:
@software{ergin_loqculate_2026,
author = {Ergin, Enes Kemal},
title = {loqculate: Limit of Detection and Quantitation calculator
for mass-spectrometry calibration curves},
year = {2026},
version = {0.3.0},
url = {https://github.com/eneskemalergin/loqculate},
license = {MIT},
}MIT. See LICENSE.
Three points hold the line,
Through the sliding window's gaze,
Noise becomes the truth.