## Summary
There is a gap between what CI validates and what users actually run.
The current integration test (`pytest -m integration`) runs the full extraction pipeline end-to-end in Python, verifying transaction counts, file outputs, and processing-summary metrics against a committed snapshot. It is intentionally excluded from the default test run and must be triggered explicitly.
The gap: CI only smoke-tests the Docker image with a Python import check (`import bankstatements_free; import bankstatements_core`). It never runs the actual processing pipeline inside the container against real PDFs. As a result, a Docker-specific regression — a broken entrypoint, a missing volume mount, a wrong env-var default, or a config that behaves differently inside the container — would pass CI and only be caught by a developer running `make docker-local` by hand.
## What is currently tested
| Layer | What is tested | Where |
|---|---|---|
| Unit | Individual services and classes | `pytest` (default, 1395 tests) |
| Integration (Python) | Full pipeline against real PDFs, snapshot comparison | `pytest -m integration` (manual only) |
| Docker (CI) | Image builds, Python imports succeed | `ci.yml` `build-docker` job |
| Docker (pipeline) | Not tested | — |
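
For reference, the current smoke test amounts to running the import check inside the image. This is a sketch only — the step name and image tag are illustrative, not copied from `ci.yml`:

```yaml
- name: Smoke-test image imports
  run: |
    docker run --rm bankstatementsprocessor:pr-${{ github.event.pull_request.number }} \
      python3 -c "import bankstatements_free; import bankstatements_core"
```

A step like this proves the interpreter starts and the packages are on `PYTHONPATH`, but nothing more: it never touches the entrypoint's argument handling, the volume mount contract, or any env-var default.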
## What the gap looks like
A developer making a change to `entrypoint.sh`, `docker-compose.yml`, or an env-var default could:

- Break the volume mount for `input/` or `output/`
- Set a default that silently changes filter or sort behaviour inside the container
- Introduce a startup error that the import smoke-test does not catch

None of these would fail the current CI pipeline.
## Proposed solution
Add a `docker-integration` CI job (runs after `build-docker`) that:

- Mounts a small set of test PDFs from `packages/parser-core/tests/integration/fixtures/` (or a dedicated `tests/docker/input/` directory) into the container
- Runs the container to process them
- Asserts the output directory contains the expected files and non-zero transaction counts

This mirrors the existing Python integration test but exercises the real Docker entrypoint, volume mounts, and env-var handling.
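
The assertion step could also live in a small Python helper rather than inline shell, which makes it reusable from both CI and a local `make` target. A minimal sketch — the helper names and the JSON layout (one list of transaction records per file, plus a `*_summary.json` sidecar) are assumptions based on the snapshot format described above:

```python
import json
from pathlib import Path


def count_transactions(output_dir: str) -> int:
    """Sum transactions across all JSON outputs, skipping summary sidecars.

    Assumes each non-summary JSON file holds a list of transaction records.
    """
    total = 0
    for path in Path(output_dir).glob("*.json"):
        if path.name.endswith("_summary.json"):
            continue  # summary metrics, not transaction rows
        with path.open() as f:
            total += len(json.load(f))
    return total


def assert_pipeline_output(output_dir: str) -> None:
    """Fail loudly if the container produced no CSVs or no transactions."""
    csvs = list(Path(output_dir).glob("*.csv"))
    if not csvs:
        raise AssertionError(f"No CSV output produced in {output_dir}")
    total = count_transactions(output_dir)
    if total == 0:
        raise AssertionError("Pipeline ran but extracted zero transactions")
    print(f"OK: {len(csvs)} CSV file(s), {total} transactions")
```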
## Minimal CI step (sketch)
```yaml
- name: Run Docker integration test
  run: |
    mkdir -p /tmp/docker-test/input /tmp/docker-test/output
    cp packages/parser-core/tests/integration/fixtures/*.pdf /tmp/docker-test/input/
    docker run --rm \
      -v /tmp/docker-test/input:/app/input:ro \
      -v /tmp/docker-test/output:/app/output \
      bankstatementsprocessor:pr-${{ github.event.pull_request.number }}
    ls /tmp/docker-test/output/*.csv || (echo "No CSV output produced" && exit 1)
    python3 -c "
    import json, glob, sys
    files = glob.glob('/tmp/docker-test/output/*.json')
    total = sum(len(json.load(open(f))) for f in files if not f.endswith('_summary.json'))
    print(f'Transactions: {total}')
    sys.exit(0 if total > 0 else 1)
    "
```

## Local equivalent (for developers)
```bash
# 1. Build the image
make docker-build

# 2. Run against the test fixtures
docker run --rm \
  -v $(pwd)/packages/parser-core/tests/integration/fixtures:/app/input:ro \
  -v /tmp/docker-output:/app/output \
  bankstatementsprocessor:latest

# 3. Inspect output
ls /tmp/docker-output/
```

## Value to developers
- Catches entrypoint regressions — a broken `CMD` or missing `PYTHONPATH` shows up immediately
- Validates the volume mount contract — confirms `/app/input` and `/app/output` work as documented
- Exercises env-var defaults — `RECURSIVE_SCAN`, `COLUMN_NAMES`, `TABLE_TOP_Y`, etc. are tested in their default state
- Closes the loop between unit tests and what ships — the Docker image is what users actually run; this test validates that layer
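
The env-var point is worth spelling out: an import smoke-test never exercises the unset-variable path. Assuming `RECURSIVE_SCAN` is a boolean flag (the parsing helper below is illustrative, not the repo's actual code), the default-resolution logic looks something like:

```python
import os


def env_bool(name: str, default: bool) -> bool:
    """Read a boolean env var, falling back to a default when unset."""
    raw = os.environ.get(name)
    if raw is None:
        return default  # this branch only runs when the var is truly unset
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# Hypothetical default; a Docker integration test runs the container with no
# overrides, so it is the only layer that exercises this fallback branch.
RECURSIVE_SCAN = env_bool("RECURSIVE_SCAN", False)
```

If a change flips that default (or a typo renames the variable), every unit test that sets the env var explicitly still passes; only a container run with a clean environment catches it.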
## Acceptance criteria
- CI runs a Docker integration test on every PR that touches `Dockerfile`, `entrypoint.sh`, `docker-compose.yml`, or `packages/parser-core/`
- Test mounts at least one real PDF, runs the container, and asserts CSV output with a non-zero transaction count
- `make docker-integration` target added for local use
- Test fixtures committed (small, anonymised PDFs) or existing integration fixtures reused
- Job added to the `ci-gate` required checks
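
The path-based trigger in the first criterion can be expressed with a `paths` filter. A sketch only — how this wires into the existing workflow and the `ci-gate` required checks is repo-specific, and note that a path-filtered job that is skipped still needs to satisfy the required check (e.g. by filtering at the step level instead, or marking skips as neutral):

```yaml
on:
  pull_request:
    paths:
      - "Dockerfile"
      - "entrypoint.sh"
      - "docker-compose.yml"
      - "packages/parser-core/**"
```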