Milestone: M3 (ADO.NET + Online DDL enablement) Status: Updated v0.2
- Deliver In-Place Online DDL (IPOD) that allows schema evolution without full table copy while keeping read workloads and the single-writer pipeline active.
- Guarantee crash safety by sequencing data journal replay before schema log replay and requiring idempotent deltas.
- Support DDL verbs: `CREATE TABLE`, `ALTER TABLE ADD/DROP/RENAME/MODIFY COLUMN`, `DROP TABLE`, `CREATE INDEX`, `DROP INDEX`.
- Apply to Phase A storage formats (DBF/DBT with NTX/MDX indexes); design extensible for Phase B (FPT/CDX).
- Append-only UTF-8 or binary log stored alongside tables: `<table>.ddl`.
- Entry shape: `{Version, Timestamp, Author, Operation, Payload, Checksum}` with monotonic `Version` (`uint64`).
- Indexed by `Version`; truncated only after checkpoint consolidates up to `Version`.
- Stored under journal directory for transactional durability: write to temp → fsync → rename.
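The entry shape and the temp → fsync → rename sequence can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the JSON-lines encoding, the CRC32 checksum, and the names `DdlEntry`, `parse_line`, and `write_log_durably` do not come from the design itself.

```python
import json
import os
import tempfile
import zlib
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DdlEntry:
    version: int      # monotonic; must fit in uint64
    timestamp: str    # assumed to be ISO-8601 text
    author: str
    operation: str    # e.g. "AddColumn"
    payload: dict

    def checksum(self) -> int:
        # Assumed algorithm: CRC32 over a canonical JSON encoding of the entry.
        body = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return zlib.crc32(body)

    def to_line(self) -> str:
        record = asdict(self)
        record["checksum"] = self.checksum()
        return json.dumps(record, sort_keys=True)


def parse_line(line: str) -> DdlEntry:
    """Decode one log line and reject it if the stored checksum does not match."""
    record = json.loads(line)
    stored = record.pop("checksum")
    entry = DdlEntry(**record)
    if entry.checksum() != stored:
        raise ValueError(f"checksum mismatch at version {entry.version}")
    return entry


def write_log_durably(path: str, entries: list) -> None:
    """Write to a temp file, fsync it, then rename it over the target."""
    versions = [e.version for e in entries]
    if versions != sorted(set(versions)):
        raise ValueError("versions must be strictly increasing")
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            for entry in entries:
                f.write(entry.to_line() + "\n")
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic when temp and target share a filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Writing the temp file in the same directory as the target keeps the final rename on one filesystem, which is what makes the replace step atomic.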
- `TableCatalog` tracks `ActiveVersion` and `PreviousVersion` descriptors.
- Cursors request a target schema version; if data row precedes the version, adapters pad defaults / mark tombstoned columns.
- Readers pinned to snapshot version to avoid shape changes mid-read; writers use latest version.
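As a rough illustration of version-aware projection, the sketch below shapes a stored row for a reader pinned to a given schema version. `ColumnDef`, `project_row`, and the `added_in` / `dropped_in` fields are hypothetical names chosen for this example, not part of the catalog design.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ColumnDef:
    name: str
    added_in: int                     # schema version that introduced the column
    default: Any = None
    dropped_in: Optional[int] = None  # schema version that tombstoned it, if any


def project_row(row: dict, row_version: int, columns: list, target_version: int) -> dict:
    """Shape a stored row as it should appear at target_version."""
    out = {}
    for col in columns:
        if col.added_in > target_version:
            continue  # column does not exist yet at the pinned version
        if col.dropped_in is not None and col.dropped_in <= target_version:
            continue  # tombstoned: hidden from the projection
        if row_version < col.added_in:
            out[col.name] = col.default  # row predates the column: pad the default
        else:
            out[col.name] = row.get(col.name, col.default)
    return out
```

A reader pinned to an older version keeps seeing the old shape even after a later `DROP COLUMN`, because the tombstone only takes effect at or after `dropped_in`.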
- Queue maintained per table for rows requiring reshape (e.g., new column default, dropped column tombstone).
- Triggered by write path (on read-modify-write) and background worker invoked by tooling.
- Bounded by throttles (max rows per second) and yields to transaction commit.
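The throttle and the yield-to-commit rule might look like the following sketch. It assumes a simple fixed-interval rate limiter and a caller-supplied `commit_pending` callback; `BackfillQueue` and its methods are illustrative names only.

```python
import time
from collections import deque
from typing import Callable


class BackfillQueue:
    def __init__(self, max_rows_per_second: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep):
        self._pending = deque()
        self._interval = 1.0 / max_rows_per_second
        self._clock = clock
        self._sleep = sleep

    def enqueue(self, row_id: int) -> None:
        self._pending.append(row_id)

    def __len__(self) -> int:
        return len(self._pending)

    def drain(self, reshape: Callable[[int], None],
              commit_pending: Callable[[], bool]) -> int:
        """Reshape rows until the queue is empty or a commit needs the pipeline.

        Returns the number of rows reshaped in this call.
        """
        done = 0
        next_slot = self._clock()
        while self._pending:
            if commit_pending():
                break  # yield to the transaction commit; remaining rows stay queued
            wait = next_slot - self._clock()
            if wait > 0:
                self._sleep(wait)  # enforce the rows-per-second ceiling
            reshape(self._pending.popleft())
            done += 1
            next_slot += self._interval
        return done
```

Injecting `clock` and `sleep` keeps the throttle testable without real delays; a background worker would simply call `drain` again after the commit completes.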
- `ddl checkpoint` command acquires short exclusive DDL lock, flushes journal, applies pending deltas to header, rewrites schema catalog, and compacts `.ddl` log up to `CheckpointVersion`.
- Optionally runs `pack` if flagged, ensuring deleted rows removed only after schema stabilization.
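The consolidation step can be pictured as folding every delta up to `CheckpointVersion` into the header and keeping only the newer entries in the log. The sketch below models the header as a plain list of column names and each log entry as a `(version, op)` pair; both representations, and the function names, are simplifications assumed for this example.

```python
def apply_delta(header: list, op: dict) -> list:
    """Fold one schema delta into a list of column names."""
    cols = list(header)
    kind = op["op"]
    if kind == "AddColumn" and op["name"] not in cols:
        cols.append(op["name"])
    elif kind == "DropColumn" and op["name"] in cols:
        cols.remove(op["name"])
    elif kind == "RenameColumn" and op["old"] in cols:
        cols[cols.index(op["old"])] = op["new"]
    return cols


def checkpoint(header: list, log: list, checkpoint_version: int):
    """Return (new_header, remaining_log) after consolidating up to checkpoint_version."""
    for version, op in log:
        if version <= checkpoint_version:
            header = apply_delta(header, op)
    remaining = [(v, op) for v, op in log if v > checkpoint_version]
    return header, remaining
```

Running `checkpoint` again on the compacted log with the same version changes nothing, since no remaining entry is at or below the checkpoint.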
- Distinct from write lock; only held during header rewrite, index swap rename, or checkpoint consolidation.
- Enforced via OS advisory lock on `<table>.ddl.lock` to avoid interfering with standard read/write operations.
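One way to picture the advisory lock is the POSIX-only sketch below, which uses `fcntl.flock` with a polling timeout. The helper name `ddl_lock`, the `DdlLockTimeout` exception, the polling interval, and the assumption that the lock file name is the table's base path plus `.ddl.lock` are all illustrative; a Windows implementation would need a different primitive.

```python
import fcntl  # POSIX only
import os
import time
from contextlib import contextmanager


class DdlLockTimeout(Exception):
    pass


@contextmanager
def ddl_lock(table_path: str, timeout_s: float = 0.25, poll_s: float = 0.01):
    """Hold an exclusive advisory lock on <table>.ddl.lock for the with-block."""
    fd = os.open(table_path + ".ddl.lock", os.O_RDWR | os.O_CREAT, 0o644)
    try:
        deadline = time.monotonic() + timeout_s
        while True:
            try:
                # Non-blocking attempt so the timeout can be enforced here.
                fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                break
            except BlockingIOError:
                if time.monotonic() >= deadline:
                    raise DdlLockTimeout(table_path)
                time.sleep(poll_s)
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

Because the lock lives on a separate `.ddl.lock` file, ordinary readers and writers that never open that file are unaffected by it.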
- Index create/modify builds new NTX/MDX under temp suffix (e.g., `.new`), validated via checksum/scan.
- Swap performed atomically via rename while holding DDL lock; rollback removes temp artifact.
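The build, validate, and swap sequence could be sketched as follows. The `swap_index` helper, the SHA-256 checksum, and the convention that the builder returns the digest it expects are assumptions for illustration; the caller is assumed to already hold the DDL lock.

```python
import hashlib
import os


def file_sha256(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def swap_index(index_path: str, build) -> bool:
    """Build into <index>.new, validate it, then rename it over the live index.

    `build(path)` is a caller-supplied function that writes the new index to
    `path` and returns the SHA-256 hex digest it expects the file to have.
    """
    tmp = index_path + ".new"
    expected = build(tmp)
    if file_sha256(tmp) != expected:
        os.unlink(tmp)  # rollback: remove the temp artifact, keep the old index
        return False
    os.replace(tmp, index_path)  # atomic rename on the same filesystem
    return True
```

Readers that opened the old index before the rename keep their handle on the old file, while new opens see the replacement.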
- Validate schema (field types, lengths, LDID) and allocate initial header.
- Emit `.ddl` entry `CreateTable` with version `v1` capturing schema + memo/index metadata.
- Apply header write under DDL lock, create empty DBF/DBT, initialize journal & `.ddl` log.
- Post-create checkpoint marks base version and clears staging artifacts.
- ADD COLUMN
  - Append `.ddl` entry describing column name, type, default, nullability, placement.
  - Update catalog version; new writes emit column value; backfill queue enqueues existing rows with default value lazily.
  - Checkpoint rewrites header to include column once the majority of the backfill is done.
- DROP COLUMN
  - Emit `.ddl` entry marking column tombstoned; update catalog to hide column from new projections.
  - Backfill queue truncates column data lazily by writing tombstone markers; checkpoint rewrites header without column and optionally triggers `pack`.
- RENAME COLUMN
  - `.ddl` entry maps `OldName → NewName`; projections provide alias view instantly.
  - Header rename deferred to checkpoint to reduce lock window; indexes referencing column updated during swap.
- MODIFY COLUMN (type/length change within compatible family)
  - Validate compatibility (e.g., `C(20) → C(40)`, `N(8,2) → N(10,2)`).
  - `.ddl` entry records transformation function; adapters coerce reads; backfill rewrites rows via lazy process.
  - Incompatible changes rejected; require copy-rebuild tool.
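The MODIFY COLUMN compatibility check for the two example families might be sketched as below. The rule used here (same type family, no shrinking of character length, integer digits, or decimal places) is an assumed reading of "compatible family", and the integer-width calculation assumes the numeric field length includes the decimal point. `parse_type` and `is_compatible` are hypothetical helpers.

```python
import re
from typing import Tuple

_TYPE = re.compile(r"^([A-Z])\((\d+)(?:,(\d+))?\)$")


def parse_type(spec: str) -> Tuple[str, int, int]:
    """Parse 'C(20)' or 'N(8,2)' into (kind, length, decimals)."""
    m = _TYPE.match(spec.replace(" ", "").upper())
    if not m:
        raise ValueError(f"unrecognized type spec: {spec!r}")
    return m.group(1), int(m.group(2)), int(m.group(3) or 0)


def _integer_width(length: int, decimals: int) -> int:
    # Assumption: the stored width includes the decimal point when decimals > 0.
    return length - decimals - (1 if decimals else 0)


def is_compatible(old: str, new: str) -> bool:
    """True when `new` can hold every value of `old` without loss."""
    o_kind, o_len, o_dec = parse_type(old)
    n_kind, n_len, n_dec = parse_type(new)
    if o_kind != n_kind:
        return False  # cross-family changes go to the copy-rebuild tool
    if o_kind == "C":
        return n_len >= o_len
    if o_kind == "N":
        return (n_dec >= o_dec
                and _integer_width(n_len, n_dec) >= _integer_width(o_len, o_dec))
    return False  # other families are not covered by this sketch
```

For instance, `is_compatible("N(8,0)", "N(10,2)")` is rejected under these assumptions because adding two decimals and a decimal point leaves only seven integer digits.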
- `.ddl` entry `DropTable` created, marking table as pending drop.
- Acquire DDL lock, ensure no active transactions, fsync journal, rename artifacts to trash directory.
- Update catalog removing table; recovery honors drop by ignoring table files beyond drop version.
- `.ddl` entry `CreateIndex` capturing tag name, expression, order, key length, filter.
- Build index side-by-side; track progress in journal for crash recovery.
- On completion, acquire DDL lock and atomically rename `.new` file; checkpoint records final metadata.
- `.ddl` entry `DropIndex` appended.
- Acquire DDL lock briefly to unlink index file; mark as inactive immediately in catalog.
- Recovery replays entry to ensure stray files removed.
- Recovery order: data journal → schema `.ddl` log → backfill queue resume.
- Backfill tasks persisted in journal to avoid replay divergence.
- Lock escalation: if backfill lags beyond threshold, throttle new writes or request operator checkpoint.
- Maximum DDL lock hold time target: < 250 ms for ALTER operations; < 2 s for checkpoint (with memo/index fsync). Configurable via `--lock-timeout`; for large tables or slower storage, consider scaling the timeout with table size and I/O performance to avoid premature lock timeouts.
- Detect conflicting DDL by comparing `CurrentVersion` before appending new entry; providers retry with updated metadata.
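The version comparison and the idempotent replay used during recovery can be sketched together. This is an in-memory model only: `SchemaLog`, `append_with_retry`, and `replay` are hypothetical names, and a real implementation would perform the comparison and the append under the DDL lock.

```python
class VersionConflict(Exception):
    pass


class SchemaLog:
    def __init__(self):
        self.entries = []  # list of (version, operation)

    @property
    def current_version(self) -> int:
        return self.entries[-1][0] if self.entries else 0

    def append(self, expected_version: int, operation: str) -> int:
        """Append only if the caller planned against the latest version."""
        if expected_version != self.current_version:
            raise VersionConflict(
                f"expected v{expected_version}, log is at v{self.current_version}")
        version = self.current_version + 1
        self.entries.append((version, operation))
        return version


def append_with_retry(log: SchemaLog, plan, max_attempts: int = 3) -> int:
    """`plan(version)` rebuilds the operation against refreshed metadata."""
    for _ in range(max_attempts):
        seen = log.current_version
        operation = plan(seen)
        try:
            return log.append(seen, operation)
        except VersionConflict:
            continue  # another DDL landed first; re-plan and try again
    raise VersionConflict("gave up after retries")


def replay(entries, applied_version: int, apply) -> int:
    """Idempotent replay: skip entries at or below the applied version."""
    for version, operation in entries:
        if version <= applied_version:
            continue
        apply(operation)
        applied_version = version
    return applied_version
```

Replaying the same log twice from the returned version applies nothing the second time, which is the property recovery relies on when a crash interrupts a previous replay.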
- Extend parser to accept `CREATE TABLE`, `ALTER TABLE`, `DROP TABLE`, `CREATE INDEX`, `DROP INDEX` statements mapped to `ISchemaMutator` API.
- `DbCommand` executes DDL under implicit transaction, emitting schema version results (e.g., `SchemaVersion` output parameter).
- `GetSchema` includes columns: `SchemaVersion`, `PendingBackfill`, `LastCheckpoint`.
- Update migrations pipeline to emit `.ddl` deltas instead of copy-rebuild for supported mutations.
- Migrations annotate operations with expected `PreviousVersion`; runtime compares before applying to detect drift.
- Provide extension method `UseXBaseOnlineMigrations()` toggling IPOD path; fallback to copy-rebuild when operation unsupported.
- Track pending backfill metrics surfaced via `IDiagnosticsLogger` for application awareness.
- `xbase ddl apply <path>`: stream `.ddl` script(s) into table logs with validation, dry-run mode, and batch checkpoint option.
- `xbase ddl checkpoint <table>`: force checkpoint, optionally `--pack` to combine with record compaction.
- `xbase ddl pack <table>`: run pack with schema awareness, ensuring drop-column tombstones cleared before header rewrite.
- Tool commands honor `--lock-timeout`, `--throttle`, and `--resume` flags for long-running operations.
- CLI emits structured progress events for integration with CI or operations dashboards.
- Unit Tests: schema log serialization/deserialization, version arithmetic, conflict detection.
- Integration Tests: run ALTER scenarios with concurrent readers and ensure consistent projections; simulate crash mid-DDL and verify recovery completes without data loss.
- Performance Tests: measure DDL lock durations, backfill throughput, index swap latency.
- Acceptance Criteria:
- All FR-OD-* requirements satisfied with automated coverage.
- EF Core migration applying ADD COLUMN with concurrent reads passes without blocking longer than target thresholds.
- Tooling commands produce idempotent results across reruns; checkpoint reduces `.ddl` log size as expected.
- Documentation updated (ARCHITECTURE.md, requirements.md, this file) and referenced in ROADMAP milestone M3.
End of DDL.md