Releases: SpeyTech/certifiable-data
Deterministic data pipeline for safety-critical ML systems.
v1.0.0 — Initial Release
Highlights
This release delivers a complete deterministic data pipeline with all 8 test suites passing (142 tests in total). Every data transformation produces bit-identical results across platforms, and each epoch leaves a cryptographic audit trail that can be used as certification evidence.
Core Modules
| Module | Description | Tests |
|---|---|---|
| DVM Primitives | Q16.16 fixed-point arithmetic with fault detection | ✅ |
| PRNG | Counter-based deterministic pseudo-random generation | ✅ |
| Feistel Shuffle | Cycle-walking bijection for any dataset size | ✅ |
| Normalization | Q16.16 standardization with (x - mean) * inv_std | ✅ |
| Augmentation | Deterministic flip, crop, noise transformations | ✅ |
| Batch Construction | Static allocation with Merkle commitment | ✅ |
| Merkle Chain | SHA-256 provenance trail per epoch | ✅ |
| Bit Identity | Cross-platform reproducibility verification | ✅ |
Total: 142 tests passing
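The DVM Primitives and Normalization rows can be illustrated with a minimal C99 sketch. It assumes a signed 32-bit Q16.16 representation, saturating arithmetic, and sticky fault flags. The names `q16_t`, `fault_flags_t`, `q16_sub`, `q16_mul`, and `q16_normalize` are hypothetical and not taken from the library's headers, and the rounding mode (truncation toward zero) is likewise an assumption.

```c
#include <stdint.h>

/* Hypothetical types and names for illustration only. */
typedef int32_t q16_t;                 /* Q16.16: 16 integer bits, 16 fractional bits */
#define Q16_ONE ((q16_t)1 << 16)       /* 1.0 in Q16.16 */

typedef struct {
    int overflow;    /* set when a result exceeded INT32_MAX (assumed meaning) */
    int underflow;   /* set when a result fell below INT32_MIN (assumed meaning) */
} fault_flags_t;

/* Clamp a 64-bit intermediate into the 32-bit range and record any fault. */
static q16_t q16_saturate(int64_t v, fault_flags_t *f)
{
    if (v > INT32_MAX) { f->overflow = 1;  return INT32_MAX; }
    if (v < INT32_MIN) { f->underflow = 1; return INT32_MIN; }
    return (q16_t)v;
}

/* Subtraction in 64 bits, so the intermediate can never wrap. */
q16_t q16_sub(q16_t a, q16_t b, fault_flags_t *f)
{
    return q16_saturate((int64_t)a - (int64_t)b, f);
}

/* Multiplication: exact 64-bit product, then rescale by 2^16.
   Division is used instead of >> because C99 defines signed division
   as truncating toward zero, while right-shifting a negative value is
   implementation-defined. */
q16_t q16_mul(q16_t a, q16_t b, fault_flags_t *f)
{
    int64_t product = (int64_t)a * (int64_t)b;
    return q16_saturate(product / 65536, f);
}

/* Standardization as described in the table: (x - mean) * inv_std. */
q16_t q16_normalize(q16_t x, q16_t mean, q16_t inv_std, fault_flags_t *f)
{
    return q16_mul(q16_sub(x, mean, f), inv_std, f);
}
```

With x = 3.0, mean = 1.0 and inv_std = 0.5 (196608, 65536 and 32768 in Q16.16), `q16_normalize` returns 65536, which is 1.0, and leaves both flags clear. Because every step uses only integer operations with defined behavior, the result does not depend on the platform's floating-point unit.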
Key Properties
- Bit-perfect determinism — Same seed → same result, every platform
- Zero dynamic allocation — All buffers statically allocated
- Deterministic shuffling — Feistel permutation with test vectors
- Merkle provenance — Every epoch cryptographically committed
- Fault detection — Overflow, underflow, domain errors tracked
- Pure C99 — No platform-specific dependencies
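The deterministic shuffling property can be sketched as a balanced Feistel network with cycle-walking. This is an illustration of the general technique, not the library's implementation: the function names, the round count of 4, and the integer mixer used as the round function are all assumptions (the actual pipeline may well derive its round function from its own PRNG).

```c
#include <stdint.h>

/* Stand-in round function (an assumption): a simple 32-bit integer mixer
   keyed by the seed and the round number. Any deterministic function
   works here, because a Feistel network is invertible regardless of it. */
static uint32_t round_fn(uint32_t half, uint32_t round, uint64_t seed)
{
    uint32_t h = half ^ (uint32_t)seed ^ (uint32_t)(seed >> 32)
               ^ (round * 0x9E3779B9u);
    h ^= h >> 16; h *= 0x85EBCA6Bu;
    h ^= h >> 13; h *= 0xC2B2AE35u;
    h ^= h >> 16;
    return h;
}

/* Map index (0 <= index < n, n >= 1) to a unique position in [0, n).
   Hypothetical name and signature. */
uint32_t feistel_permute(uint32_t index, uint32_t n, uint64_t seed)
{
    /* Smallest half width k such that the 2k-bit domain covers n. */
    uint32_t half_bits = 1;
    while (half_bits < 16 && ((uint64_t)1 << (2 * half_bits)) < n) {
        half_bits++;
    }
    uint32_t mask = ((uint32_t)1 << half_bits) - 1u;

    uint32_t x = index;
    do {
        uint32_t left  = x >> half_bits;
        uint32_t right = x & mask;
        for (uint32_t r = 0; r < 4; r++) {
            /* One Feistel round: (L, R) -> (R, L xor F(R)). */
            uint32_t t = left ^ (round_fn(right, r, seed) & mask);
            left  = right;
            right = t;
        }
        x = (left << half_bits) | right;
        /* Cycle-walking: if the result landed outside [0, n),
           permute again until it falls back inside. */
    } while (x >= n);

    return x;
}
```

Calling `feistel_permute(i, n, seed)` for every `i` from 0 to `n - 1` visits each index exactly once, for any `n` that fits in 32 bits. No shuffle buffer is needed, which fits the zero-dynamic-allocation property, and the same seed always yields the same order.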
Quick Start
git clone https://github.com/williamofai/certifiable-data.git
cd certifiable-data
mkdir build && cd build
cmake ..
make
make test
Expected output:
100% tests passed, 0 tests failed out of 8
Total Test time (real) = 0.04 sec
Compliance
Designed for certification under:
- DO-178C (Aerospace)
- IEC 62304 (Medical devices)
- ISO 26262 (Automotive)
- IEC 61508 (Industrial safety)
Related Projects
| Project | Description | Demo |
|---|---|---|
| certifiable-inference | Deterministic inference engine | inference.speytech.com |
| certifiable-training | Deterministic training engine | training.speytech.com |
| certifiable-data | Deterministic data pipeline | — |
Together they provide a complete deterministic ML pipeline from data loading → training → inference.
Documentation
- CT-MATH-001.md — Mathematical foundations
- CT-STRUCT-001.md — Data structure specifications
- docs/requirements/ — SRS documents (SRS-001 through SRS-006)
Built by SpeyTech in the Scottish Highlands.
Patent: UK GB2521625.0 — Murray Deterministic Computing Platform (MDCP)
For commercial licensing: william@fstopify.com