Read CLAUDE.md fully before starting. Then read
engineering/design/SDD-008-analyse-stage.md completely — the full
implementation is specified there including all code samples.
A new pipeline stage: src/pipeline/analyse/
This stage sits after separate and extracts per-stem visualisation curves
from the separated waveforms. Each stem produces a StemCurves object with
six normalised time-series curves that the visual layer drives animations from.
src/pipeline/analyse/
__init__.py — exports StemCurves, AnalyseError, analyse, CurvesSource
curves.py — StemCurves dataclass + AnalyseError
analyse.py — analyse() function (main implementation)
source.py — CurvesSource ABC
disk.py — DiskCurvesSource
memory.py — MemoryCurvesSource
tests/pipeline/
test_analyse.py — full test suite (TC-ANA-001 through TC-ANA-012)
The full implementation is in SDD-008 §8. Follow it exactly. Key points:
No ML dependencies — this stage uses only librosa, numpy, and scipy.
All imports are safe at module level (no TYPE_CHECKING guard needed).
scipy needs adding to pyproject.toml — add "scipy>=1.10" to the
base dependencies list (not [ml]). It's Python 3.13 compatible.
_apply_envelope is the only loop — everything else is vectorised numpy.
The envelope loop is unavoidable (each frame depends on the previous frame)
but runs fast enough on 100fps × 30s = 3000 frames.
StemCurves.at_fps() — implements decimation (stride slicing), not
interpolation. Upsampling is Q-ANA-2 and out of scope.
DiskCurvesSource — files go in test_audio/<slug>/curves/ with the
naming convention <stem>_<curve>.npy and <stem>_meta.json. The save()
method creates the directory if it doesn't exist.
MemoryCurvesSource — simple dict wrapper. save() stores in memory,
load() retrieves, exists() checks the dict, available_stems() returns
the keys.
All tests use synthetic stems — np.zeros, np.ones, or sine waves.
No audio files, no Demucs, no Basic Pitch. The suite must run in both
.venv and .venv-ml with 0 failures and 0 errors.
The @requires_pretty_midi and @requires_demucs patterns are NOT needed
here — scipy and librosa are in base deps.
For TC-ANA-009 (DiskCurvesSource round-trip), use tmp_path pytest fixture.
After the analyse stage is implemented, update scripts/run-stems.sh to
run analysis after separation and write curves to disk. Add to the
on_ready callback:
from src.pipeline.analyse.analyse import analyse
from src.pipeline.analyse.disk import DiskCurvesSource
curves_dir = Path(f"test_audio/{name}/curves")
source = DiskCurvesSource(curves_dir)
# analyse stems as they arrive
stem_curves = analyse(stems)
for stem_name, curves in stem_curves.items():
source.save(stem_name, curves)
print(f" ✓ curves: {stem_name}")-
engineering/README.md— add SDD-008 row:| [SDD-008](./design/SDD-008-analyse-stage.md) | Analyse — per-stem curve extraction | ✅ Written | -
engineering/PROJECT-CONTEXT.md— add SDD-008 to document inventory -
CLAUDE.md— add analyse stage to build state table:| src/pipeline/analyse/ | ✅ Complete | 12 tests passing | -
pyproject.toml— add"scipy>=1.10"to base dependencies
./scripts/test-all.shpasses with 0 failures in both environmentssrc/pipeline/analyse/__init__.pyexports:StemCurves,AnalyseError,analyse,CurvesSource,DiskCurvesSource,MemoryCurvesSource- All six curves in every
StemCurvesare float32, shape(T,), values in[0.0, 1.0] DiskCurvesSourceround-trip test passesrun-stems.shwrites curves to disk after separation- All housekeeping completed
- No regressions in existing tests
Do not touch src/api/ or anything outside the analyse stage and scripts.