
refactor: add runInTransaction for atomic multi-step DB operations #85

Merged
dean0x merged 9 commits into main from feat/transaction-atomicity-81
Mar 13, 2026

Conversation

@dean0x
Owner

@dean0x dean0x commented Mar 12, 2026

Summary

  • Add Database.runInTransaction() exposing better-sqlite3's synchronous transaction semantics
  • Add SyncTaskOperations / SyncScheduleOperations interfaces with sync method variants (saveSync, updateSync, findByIdSync, recordExecutionSync)
  • Refactor schedule handler: both handleSingleTaskTrigger and handlePipelineTrigger now wrap all DB mutations in a single atomic transaction — partial failures roll back cleanly instead of leaving orphaned rows
  • Remove broken TaskRepository.transaction() and TransactionTaskRepository (superseded by runInTransaction)
  • Fix orphaned handler-setup.test.ts (missing dependency registrations, now included in test:services)

Closes #81

Key behavior change

Pipeline partial save failure: previously left N saved-then-cancelled tasks in the DB. Now leaves 0 tasks (transaction rollback). No orphaned rows, no manual cleanup loop.
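
The all-or-nothing behaviour can be illustrated with a minimal sketch. It assumes a hypothetical in-memory `FakeTaskStore` whose `runInTransaction` restores a snapshot when the callback throws; the PR itself delegates to better-sqlite3 rather than copying a `Map`, and the step names and the empty-prompt failure are invented for illustration.

```typescript
// Hypothetical outcome shape for this sketch only.
type TxOutcome<T> = { ok: true; value: T } | { ok: false; error: unknown };

// Hypothetical in-memory stand-in for the tasks table; not the project's Database class.
class FakeTaskStore {
  private rows = new Map<string, string>();

  // Mimics the commit-or-rollback contract: restore the snapshot if fn throws.
  runInTransaction<T>(fn: () => T): TxOutcome<T> {
    const snapshot = new Map(this.rows);
    try {
      return { ok: true, value: fn() };
    } catch (error) {
      this.rows = snapshot; // discard every write made inside fn
      return { ok: false, error };
    }
  }

  saveSync(id: string, prompt: string): void {
    if (prompt.length === 0) {
      throw new Error(`cannot save task ${id}: empty prompt`);
    }
    this.rows.set(id, prompt);
  }

  count(): number {
    return this.rows.size;
  }
}

const store = new FakeTaskStore();
const pipeline = [
  { id: "step-1", prompt: "lint" },
  { id: "step-2", prompt: "build" },
  { id: "step-3", prompt: "" }, // the third save fails
];

// Two saves succeed before the third throws; the rollback removes both.
const pipelineResult = store.runInTransaction(() => {
  for (const step of pipeline) {
    store.saveSync(step.id, step.prompt);
  }
});
```

After the failed run the store holds no rows, which is the "0 tasks" outcome described above; no cleanup loop is involved.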

Follow-up items (tracked separately)

Test plan

  • npm run build — TypeScript compiles clean
  • npm run test:implementations — 310 tests (new: runInTransaction, sync methods)
  • npm run test:handlers — 122 tests (new: atomicity, rollback behavior)
  • npm run test:services — 152 tests (new: handler-setup with database dep)
  • npm run test:repositories — 127 tests
  • npm run test:core — 367 tests
  • npm run test:adapters — 55 tests
  • npm run test:cli — 150 tests
  • All 1,283 tests passing, zero regressions

@greptile-apps
Contributor

greptile-apps bot commented Mar 12, 2026

Confidence Score: 4/5

  • Safe to merge — no data-integrity regressions; two minor issues (discarded cron error in logs, missing null check) do not affect correctness of the core atomicity behaviour.
  • The fundamental goal — atomic multi-step DB writes via runInTransaction — is correctly implemented and thoroughly tested with rollback verification. The previously flagged issues (optional database param, misleading error prefix, double-wrapped errors, pre-commit logging) are all valid but pre-addressed in review threads. Two remaining findings: (1) computeScheduleUpdates discards the BackbeatError from getNextRunTime, so logScheduleTransition can only emit a generic warning without the root cause when a cron schedule is paused — an observability regression; (2) dependency-repository.ts line 264 has no null guard on the post-insert row lookup, inconsistent with the identical fix the PR applies in schedule-repository.ts. Neither is a runtime blocker, but both reduce diagnosability.
  • src/services/handlers/schedule-handler.ts (discarded getNextRunTime error) and src/implementations/dependency-repository.ts (missing null check on post-insert row lookup)

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/implementations/database.ts | Adds runInTransaction<T>(fn: () => T): Result<T> implementing TransactionRunner; uses better-sqlite3's synchronous db.transaction(fn)() and correctly preserves BackbeatError while wrapping all other errors as SYSTEM_ERROR. |
| src/implementations/task-repository.ts | Adds saveSync, findByIdSync, updateSync methods and a shared toDbFormat helper. Also fixes ` |
| src/implementations/schedule-repository.ts | Adds findByIdSync, updateSync, recordExecutionSync sync methods and a shared toDbFormat helper. Correctly adds null checks on post-insert row lookup in both the new recordExecutionSync and the async recordExecution. Minor inconsistency: the analogous row lookup in dependency-repository.ts was not updated. |
| src/implementations/dependency-repository.ts | Migrates addDependencies from manual db.transaction(fn)(args) + custom error handler to database.runInTransaction(); all semantic BackbeatError checks (task existence, duplicate detection, limit checks) run inside the transaction for true TOCTOU safety. Missing: null check on the post-insert getDependencyByIdStmt.get(rowid) lookup (inconsistent with same PR's fix in schedule-repository.ts). |
| src/services/handlers/schedule-handler.ts | Wraps both handleSingleTaskTrigger and handlePipelineTrigger DB mutations in a single runInTransaction call, achieving true atomicity. Also fixes old getNextRunTime(expression, triggeredAt, timezone) wrong-argument-order call. The computeScheduleUpdates refactor into a pure function is well-done, but discards the nextResult.error from a failed cron calculation so logScheduleTransition cannot include the root cause in its warning log. |
| src/services/handler-setup.ts | Adds database: TransactionRunner as a required dependency in HandlerDependencies and its extraction from the DI container; passes it through to ScheduleHandler.create(). Increments dependency count comment from 11 to 12. |
| tests/unit/implementations/database.test.ts | Adds four runInTransaction tests: success path, generic error wrapping, BackbeatError preservation, and full rollback verification. Good coverage of all three codepaths in the new method. |
| tests/unit/services/handlers/schedule-handler.test.ts | Updates pipeline rollback test from "2 tasks cancelled" to "0 tasks exist" (reflecting transaction semantics), adds tests for recordExecutionSync failure rollback and full pipeline atomicity commit. Spy pattern updated from save to saveSync to match the new sync API. |
| tests/unit/services/handler-setup.test.ts | Adds missing database, scheduleRepository, and checkpointRepository registrations that caused the test to be orphaned; adds test for missing-database error. Now included in test:services run. |
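
The error handling described for database.ts (preserve BackbeatError, wrap everything else as SYSTEM_ERROR) could look roughly like the sketch below. The `BackbeatError` class, the `Result` shape, the `TASK_NOT_FOUND` code, and the helper names `toBackbeatError` and `runAndClassify` are simplified assumptions, not the project's definitions; the sketch also omits the actual better-sqlite3 call, which is what would provide the rollback.

```typescript
// Hypothetical stand-ins for the project's BackbeatError and Result types.
class BackbeatError extends Error {
  readonly code: string;
  constructor(code: string, message: string) {
    super(message);
    this.code = code;
    this.name = "BackbeatError";
    // Keeps instanceof reliable even when compiling to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

type Result<T> = { ok: true; value: T } | { ok: false; error: BackbeatError };

// Domain errors pass through untouched; anything else becomes SYSTEM_ERROR.
function toBackbeatError(thrown: unknown): BackbeatError {
  if (thrown instanceof BackbeatError) {
    return thrown;
  }
  const detail = thrown instanceof Error ? thrown.message : String(thrown);
  return new BackbeatError("SYSTEM_ERROR", `Transaction failed: ${detail}`);
}

// A real implementation would run fn through the database's transaction
// wrapper here, so that a throw also rolls back the writes.
function runAndClassify<T>(fn: () => T): Result<T> {
  try {
    return { ok: true, value: fn() };
  } catch (thrown) {
    return { ok: false, error: toBackbeatError(thrown) };
  }
}

const domainError = new BackbeatError("TASK_NOT_FOUND", "task t-1 does not exist");
const preserved = toBackbeatError(domainError);
const wrapped = toBackbeatError(new TypeError("row is undefined"));
const success = runAndClassify(() => 42);
```

Keeping the classification in one helper means callers see the original domain error code where one exists, and a uniform "Transaction failed: ..." message otherwise.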

Sequence Diagram

sequenceDiagram
    participant EB as EventBus
    participant SH as ScheduleHandler
    participant DB as Database.runInTransaction
    participant TR as TaskRepository (sync)
    participant SR as ScheduleRepository (sync)

    EB->>SH: ScheduleTriggered
    SH->>SH: resolveAfterScheduleTaskId() [async, outside tx]
    SH->>SH: createTask() [pure, outside tx]
    SH->>SH: computeScheduleUpdates() [pure, outside tx]

    SH->>DB: runInTransaction(() => { ... })
    activate DB
    DB->>TR: saveSync(task)
    DB->>SR: recordExecutionSync(execution)
    DB->>SR: updateSync(scheduleId, updates, existing)
    Note over DB: COMMIT (all 3 writes atomic)
    DB-->>SH: Result<void>
    deactivate DB

    alt txResult.ok
        SH->>SH: logScheduleTransition()
        SH->>EB: emit TaskDelegated
        SH->>EB: emit ScheduleExecuted
    else txResult.err (ROLLBACK)
        SH->>SR: recordFailedExecution() [best-effort, outside tx]
    end
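
The ordering in the diagram (prepare outside the transaction, write inside it, emit events only after commit) can be sketched as follows. Everything here is an illustrative stand-in: the row strings, the `write` callback, and the `failOnExecutionRecord` switch are assumptions made for the example, and the simplified `runInTransaction` buffers writes instead of talking to a database.

```typescript
// Illustrative stand-ins for the participants in the diagram.
type TxResult = { ok: true } | { ok: false; error: unknown };

const committed: string[] = [];        // rows visible after a commit
const emitted: string[] = [];          // events published on the bus
const failedExecutions: string[] = []; // best-effort failure records

// Buffers writes and publishes them only if fn completes without throwing.
function runInTransaction(fn: (write: (row: string) => void) => void): TxResult {
  const pending: string[] = [];
  try {
    fn((row) => {
      pending.push(row);
    });
    committed.push(...pending); // COMMIT
    return { ok: true };
  } catch (error) {
    return { ok: false, error }; // ROLLBACK: pending writes are dropped
  }
}

function handleTrigger(scheduleId: string, failOnExecutionRecord: boolean): TxResult {
  // Pure preparation happens before the transaction opens.
  const taskRow = `task:for-${scheduleId}`;
  const scheduleRow = `schedule:${scheduleId}:runCount+1`;

  const txResult = runInTransaction((write) => {
    write(taskRow); // saveSync(task)
    if (failOnExecutionRecord) {
      throw new Error("execution insert failed");
    }
    write(`execution:${scheduleId}`); // recordExecutionSync(execution)
    write(scheduleRow); // updateSync(scheduleId, updates, existing)
  });

  if (txResult.ok) {
    emitted.push("TaskDelegated", "ScheduleExecuted"); // post-commit only
  } else {
    failedExecutions.push(scheduleId); // recorded outside the transaction
  }
  return txResult;
}

const okRun = handleTrigger("s-1", false);
const failedRun = handleTrigger("s-2", true);
```

In the failing run the task row written before the error never becomes visible and no events are emitted, so subscribers cannot observe a task that was rolled back.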

Comments Outside Diff (2)

  1. src/services/handlers/schedule-handler.ts, line 556-558 (link)

    getNextRunTime error swallowed — root cause lost on PAUSED transition

    When getNextRunTime fails (invalid cron expression, timezone issue, overflow, etc.), computeScheduleUpdates sets newStatus = ScheduleStatus.PAUSED but discards nextResult.error entirely — the pure function has no way to surface it. logScheduleTransition then fires a generic warn with only the cron expression, giving operators no indication of why the calculation failed.

    The old updateScheduleAfterTrigger passed nextResult.error directly to the logger:

    this.logger.error('Failed to calculate next run, pausing schedule', nextResult.error, { scheduleId, cronExpression: ... });
    

    The BackbeatError returned by getNextRunTime carries the raw cron-parser message in its context.error field — exactly the text needed to diagnose invalid expressions or timezone edge cases. That detail is now permanently gone from the audit trail.

    One option is to have computeScheduleUpdates return an extended object that includes the optional error alongside the Partial<Schedule> updates, then let logScheduleTransition log it at error severity when present:

    // Return type becomes { updates: Partial<Schedule>; cronError?: BackbeatError }
    // logScheduleTransition:
    if (updates.status === ScheduleStatus.PAUSED) {
      this.logger.error('Failed to calculate next run, pausing schedule', cronError, {
        scheduleId: schedule.id,
        cronExpression: schedule.cronExpression,
      });
    }
  2. src/implementations/dependency-repository.ts, line 264-265 (link)

    Missing null check — inconsistent with same PR's fix in schedule-repository.ts

    This PR explicitly added null guards to the analogous post-insert lookup patterns in schedule-repository.ts (both recordExecutionSync and the async recordExecution), but this loop was not updated to match. If getDependencyByIdStmt.get(result.lastInsertRowid) ever returns undefined (e.g., lastInsertRowid is 0), the unsafe cast silently produces undefined as DependencyRow, and the subsequent DependencyRowSchema.parse(undefined) call inside rowToDependency throws a ZodError. Inside runInTransaction, that ZodError is caught and wrapped as the opaque "Transaction failed: ..." message rather than a clear diagnostic error.

    Applying the same pattern used in schedule-repository.ts:

    for (const depId of dependsOn) {
      const result = this.addDependencyStmt.run(taskId, depId, createdAt);
      const row = this.getDependencyByIdStmt.get(result.lastInsertRowid) as DependencyRow | undefined;
      if (!row) {
        throw new BackbeatError(
          ErrorCode.SYSTEM_ERROR,
          `Failed to retrieve dependency record after insert (rowid: ${result.lastInsertRowid})`,
        );
      }
      createdDependencies.push(this.rowToDependency(row));
    }
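
For the first comment above, one way to keep computeScheduleUpdates pure while still surfacing the cron failure is to return the error next to the updates. The sketch below is a simplified assumption: the field set is reduced, the next-run calculation is passed in as `number | Error` rather than the project's Result type, and `describeTransition` is a hypothetical stand-in for the post-commit logging step.

```typescript
// Hypothetical, reduced shapes; the real Schedule has more fields.
type ScheduleStatus = "active" | "paused";

interface ScheduleFields {
  runCount: number;
  lastRunAt: number;
  nextRunAt?: number;
  status: ScheduleStatus;
}

interface ComputedUpdates {
  updates: Partial<ScheduleFields>;
  cronError?: Error; // set only when the next run could not be computed
}

// Pure: no logging and no I/O. `next` is either the next run time or the
// error produced by the failed cron calculation.
function computeScheduleUpdates(
  runCount: number,
  triggeredAt: number,
  next: number | Error,
): ComputedUpdates {
  const base = { runCount: runCount + 1, lastRunAt: triggeredAt };
  if (next instanceof Error) {
    return { updates: { ...base, status: "paused" }, cronError: next };
  }
  return { updates: { ...base, nextRunAt: next, status: "active" } };
}

// Builds the message a post-commit logger could emit, if any.
function describeTransition(computed: ComputedUpdates): string | undefined {
  if (computed.updates.status !== "paused") {
    return undefined;
  }
  const cause = computed.cronError ? computed.cronError.message : "unknown cause";
  return `Failed to calculate next run, pausing schedule: ${cause}`;
}

const pausedCase = computeScheduleUpdates(4, 1000, new Error("invalid cron field"));
const activeCase = computeScheduleUpdates(4, 1000, 61000);
const pausedMessage = describeTransition(pausedCase);
const activeMessage = describeTransition(activeCase);
```

The function stays free of side effects, and the root cause travels with the updates until the caller decides how to log it after the transaction commits.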

Last reviewed commit: bfb5e93

TaskRepository,
} from '../../core/interfaces.js';
import { err, ok, Result } from '../../core/result.js';
import { Database } from '../../implementations/database.js';
Owner Author

🔴 BLOCKING: Concrete Database type in service layer (DIP violation)

The ScheduleHandler depends directly on the concrete Database class instead of an interface. This violates Dependency Inversion Principle and couples the service layer to SQLite/better-sqlite3.

Fix: Extract a TransactionRunner interface in src/core/interfaces.ts:

export interface TransactionRunner {
  runInTransaction<T>(fn: () => T): Result<T>;
}

Then change ScheduleHandler and HandlerDependencies to depend on TransactionRunner instead of Database.

See architecture review for details.
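
One practical effect of depending on the interface is that a handler can be exercised without SQLite at all. The sketch below assumes a simplified `Result` shape plus two hypothetical classes, `PassthroughRunner` and `TriggerService`; neither exists in the PR, and the passthrough runner provides no rollback, so it is only suitable as a test double.

```typescript
// Simplified Result shape assumed for this sketch.
type Result<T> = { ok: true; value: T } | { ok: false; error: Error };

interface TransactionRunner {
  runInTransaction<T>(fn: () => T): Result<T>;
}

// Hypothetical test double: runs fn directly and counts invocations.
class PassthroughRunner implements TransactionRunner {
  calls = 0;

  runInTransaction<T>(fn: () => T): Result<T> {
    this.calls += 1;
    try {
      return { ok: true, value: fn() };
    } catch (thrown) {
      const error = thrown instanceof Error ? thrown : new Error(String(thrown));
      return { ok: false, error };
    }
  }
}

// Hypothetical consumer: it knows the interface, never the SQLite-backed class.
class TriggerService {
  private readonly runner: TransactionRunner;

  constructor(runner: TransactionRunner) {
    this.runner = runner;
  }

  trigger(work: () => number): boolean {
    return this.runner.runInTransaction(work).ok;
  }
}

const runner = new PassthroughRunner();
const service = new TriggerService(runner);
const succeeded = service.trigger(() => 3);
const failed = service.trigger(() => {
  throw new Error("boom");
});
```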

Owner Author

Fixed in 1056034 — extracted TransactionRunner interface in src/core/interfaces.ts. Database implements TransactionRunner. ScheduleHandler and handler-setup now depend on the abstraction.

status: task.status,
priority: task.priority,
workingDirectory: task.workingDirectory || null,
timeout: task.timeout || null,
Owner Author

⚠️ SHOULD-FIX: Use nullish coalescing (??) instead of OR (||) for numeric fields

The toDbFormat method uses task.timeout || null which treats 0 as falsy. A timeout of 0 should be valid (immediately time out), but will be stored as null.

Fix:

timeout: task.timeout ?? null,
maxOutputBuffer: task.maxOutputBuffer ?? null,

This is pre-existing logic that was refactored into toDbFormat, but the refactoring is a good opportunity to fix it.
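
The difference between the two operators is easy to check in isolation. The helper names below are hypothetical; only the `||` and `??` expressions come from the review.

```typescript
// Hypothetical helpers that isolate the two expressions under discussion.
function timeoutWithOr(timeout: number | undefined): number | null {
  return timeout || null; // 0 is falsy, so it collapses to null
}

function timeoutWithNullish(timeout: number | undefined): number | null {
  return timeout ?? null; // only null and undefined fall through
}

const orZero = timeoutWithOr(0);
const nullishZero = timeoutWithNullish(0);
const nullishMissing = timeoutWithNullish(undefined);
const nullishPositive = timeoutWithNullish(5000);
```

With `||` an explicit 0 is stored as null, while `??` keeps it and still maps a missing value to null.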

Owner Author

Fixed in 501f952 — all numeric fields (timeout, maxOutputBuffer, retryCount) now use nullish coalescing (??) instead of OR (||).

*/
private async updateScheduleAfterTrigger(schedule: Schedule, triggeredAt: number): Promise<Result<void>> {
const scheduleId = schedule.id;
private computeScheduleUpdates(schedule: Schedule, triggeredAt: number): Partial<Schedule> {
Owner Author

⚠️ SHOULD-FIX: Misleading JSDoc - computeScheduleUpdates has side effects

The JSDoc claims "Pure computation — no side effects" but the method calls this.logger.error() and this.logger.info() in three places (lines 562, 577, 588).

Fix: Update the comment to reflect reality:

/**
 * Compute schedule update fields after a trigger (runCount, lastRunAt, nextRunAt, status).
 * Logs status changes but has no database side effects. Shared by async and sync trigger paths.
 */

Owner Author

Fixed in 1056034 — extracted all logging from computeScheduleUpdates into logScheduleTransition() which is called post-commit. JSDoc now accurately reads "Pure computation — no side effects."

* @param options - Optional configuration
* @returns Result containing initialized handler or error
*/
static async create(
Owner Author

🔴 BLOCKING: Factory method parameter ordering breaks convention

The ScheduleHandler.create() method places database after logger, breaking the established pattern used in other handlers. The convention is (repos..., eventBus, logger, options?).

Fix: Reorder parameters to match convention:

static create(
  scheduleRepo: ScheduleRepository & SyncScheduleOperations,
  taskRepo: TaskRepository & SyncTaskOperations,
  eventBus: TaskEventEmitter,
  database: Database,
  logger: TaskLogger,
  options?: ScheduleHandlerOptions,
): Result<ScheduleHandler>

The constructor already has this order; just match it in create().

Owner Author

Fixed in 1056034 — factory create() params reordered to match constructor: (repos, eventBus, database, logger, options?). Callers in handler-setup.ts and tests updated.

return this.rowToSchedule(row);
}

updateSync(id: ScheduleId, update: Partial<Schedule>): void {
Owner Author

⚠️ SHOULD-FIX: Redundant read-before-write inside transaction

The updateSync method calls findByIdSync to read the full schedule row, then updates it. But inside handleSingleTaskTrigger and handlePipelineTrigger, the schedule was already fetched at the top of the handler.

Fix: Consider adding an updateFieldsSync(id, mergedSchedule) method that skips the read when the caller already has the full object. This is an optimization opportunity, not blocking.

Owner Author

Fixed in 1056034: updateSync now accepts an optional existing?: Schedule parameter. Both trigger paths pass the already-fetched schedule, skipping the redundant findByIdSync call inside the transaction.
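
A rough sketch of the optional-parameter approach, using a hypothetical `InMemoryScheduleRepo` with a reduced `Schedule` shape. The `reads` counter exists only to make the skipped lookup observable and is not part of the PR.

```typescript
// Reduced Schedule shape assumed for this sketch.
interface Schedule {
  id: string;
  runCount: number;
  status: string;
}

// Hypothetical in-memory repository; `reads` counts lookups.
class InMemoryScheduleRepo {
  reads = 0;
  private rows = new Map<string, Schedule>();

  insert(schedule: Schedule): void {
    this.rows.set(schedule.id, schedule);
  }

  findByIdSync(id: string): Schedule | undefined {
    this.reads += 1;
    return this.rows.get(id);
  }

  // When the caller already holds the row, `existing` skips the lookup.
  updateSync(id: string, update: Partial<Schedule>, existing?: Schedule): void {
    const current = existing ?? this.findByIdSync(id);
    if (!current) {
      throw new Error(`Schedule ${id} not found`);
    }
    this.rows.set(id, { ...current, ...update, id });
  }
}

const repo = new InMemoryScheduleRepo();
repo.insert({ id: "s-1", runCount: 0, status: "active" });

const fetched = repo.findByIdSync("s-1"); // the handler's initial fetch
repo.updateSync("s-1", { runCount: 1 }, fetched); // reuses the fetched row
const readsWithExisting = repo.reads;

repo.updateSync("s-1", { runCount: 2 }); // falls back to a lookup
const readsWithoutExisting = repo.reads;
```

The trade-off is that the caller vouches for the freshness of `existing`; passing a stale object would overwrite newer field values with old ones.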

@dean0x
Owner Author

dean0x commented Mar 12, 2026

Summary of Code Review Findings

Comprehensive review across 8 reporters (security, architecture, performance, complexity, consistency, regression, tests, typescript) identified 7 blocking and 6 should-fix issues. Inline comments created for specific file:line findings.

BLOCKING Issues Requiring Action

  1. 🔴 DIP Violation (Architecture) — src/services/handlers/schedule-handler.ts:38

    • ScheduleHandler depends on concrete Database class, not interface
    • Fix: Extract TransactionRunner interface and depend on abstraction
  2. 🔴 Parameter Ordering Convention (Consistency) — src/services/handlers/schedule-handler.ts:72

    • Factory method breaks established parameter order convention
    • Fix: Reorder to: repos, eventBus, database, logger, options
  3. 🔴 Dead Code (Consistency/TypeScript/Regression) — src/services/handlers/schedule-handler.ts:526-545, 602-614

    • recordTriggeredExecution() (lines 526-545) is never called
    • updateScheduleAfterTrigger() (lines 602-614) is never called
    • Fix: Remove both methods entirely

SHOULD-FIX Issues

  1. ⚠️ Numeric Field Coalescing (Security) — src/implementations/task-repository.ts:180-181

    • Use ?? instead of || for timeout/maxOutputBuffer to preserve 0 values
  2. ⚠️ Misleading Documentation (TypeScript) — src/services/handlers/schedule-handler.ts:551

    • JSDoc claims "pure computation" but method has logging side effects
    • Fix: Update comment to reflect reality
  3. ⚠️ Redundant Database Read (Performance) — src/implementations/schedule-repository.ts:324

    • updateSync reads full schedule before updating, though caller already has it
    • Consider adding updateFieldsSync variant for callers with full object

Review Scores

  • Architecture: 8/10 — DIP violation is the main concern
  • Consistency: 7/10 — Dead code and parameter ordering issues
  • Security: 9/10 — Overall sound, one medium issue worth fixing
  • Performance: 8/10 — Net improvement despite some micro-optimization opportunities

Recommendation

CHANGES_REQUESTED — Address blocking issues (DIP, parameter ordering, dead code) before merge. Should-fix items are recommended but lower priority.


Generated by Claude Code review-pr operation | 8 review reports analyzed

@dean0x dean0x mentioned this pull request Mar 12, 2026
10 tasks
Dean Sharon and others added 7 commits March 13, 2026 16:10
Replace manual cleanup loops with database-level transaction guarantees.
Schedule handler trigger methods now wrap task saves, execution records,
and schedule updates in a single atomic transaction — partial failures
roll back cleanly instead of leaving orphaned rows.

- Add Database.runInTransaction() exposing better-sqlite3 transactions
- Add SyncTaskOperations / SyncScheduleOperations interfaces
- Add sync method variants (saveSync, updateSync, findByIdSync, recordExecutionSync)
- Extract toDbFormat() helpers in both repositories (DRY async + sync)
- Refactor handleSingleTaskTrigger: 3-step sequence → single transaction
- Refactor handlePipelineTrigger: N saves + cleanup loop → atomic transaction
- Remove broken TaskRepository.transaction() + TransactionTaskRepository
- Fix handler-setup.test.ts: add missing dependency registrations, include in test:services
- 14 files changed, 680 insertions, 317 deletions
- Make `database` parameter required in ScheduleHandler.create() (was
  optional with runtime guard — now caught at compile time)
- Change error prefix from "Failed to create task:" to "Schedule trigger
  failed:" since the transaction can fail at any step
Replace || with ?? for timeout and maxOutputBuffer in both toDbFormat()
and rowToTask() so that a value of 0 is preserved instead of being
treated as falsy. Matches the existing ?? pattern used for exitCode.

Co-Authored-By: Claude <noreply@anthropic.com>
Remove two private methods with zero callers that were superseded by
the sync transaction path: recordTriggeredExecution() and
updateScheduleAfterTrigger(). Fix computeScheduleUpdates() JSDoc that
incorrectly claimed "no side effects" when the method performs logging.

Co-Authored-By: Claude <noreply@anthropic.com>
Unify transaction patterns — DependencyRepository.addDependencies() now
uses the shared runInTransaction() helper instead of raw db.transaction()
with a manual tryCatch wrapper, completing the #81 migration.
retryCount is numeric — || incorrectly treats 0 as falsy.
@dean0x dean0x force-pushed the feat/transaction-atomicity-81 branch from 4e2d9d5 to 3b53a89 Compare March 13, 2026 14:10
Dean Sharon added 2 commits March 14, 2026 01:07
…ull guards

- Extract TransactionRunner interface so handlers depend on abstraction, not
  concrete Database (fixes DIP violation)
- Align factory create() param order with constructor (database before logger)
- Add optional `existing` param to updateSync to skip redundant DB read
- Add null guards on post-insert row lookup in recordExecutionSync (sync+async)
- Move logging out of computeScheduleUpdates into post-commit logScheduleTransition
- Add inner try-catch in single-task transaction to prevent double-wrapped errors
- Fix freshDb test to use repos bound to the same database instance
@dean0x dean0x merged commit fc53226 into main Mar 13, 2026
2 checks passed
@dean0x dean0x deleted the feat/transaction-atomicity-81 branch March 13, 2026 23:34