
feat: support Snakemake SQLite DB persistence backend#51

Merged
nh13 merged 1 commit into main from nh/snakemake-db-persistence on Mar 28, 2026

Conversation


@nh13 nh13 commented Mar 28, 2026

Summary

Adds support for reading workflow metadata from Snakemake's new SQLite persistence backend (.snakemake/metadata.db) in addition to the existing filesystem-based backend.

  • New snakesee/persistence/ module with PersistenceBackend protocol
  • FsPersistence wraps existing filesystem metadata/incomplete/lock reading
  • DbPersistence reads from SQLite with read-only access and namespace filtering
  • detect_backend() auto-selects based on whether metadata.db exists (prefers DB)
  • Wired into HistoricalDataLoader, parse_workflow_state, is_workflow_running, and TUI
  • 42 new tests across 4 test files; all 1020 tests passing

Closes #40

Test plan

  • Unit tests pass: uv run pytest tests/ -v
  • All checks pass: uv run poe check-all
  • Manual test with a Snakemake 9+ workflow that uses DB persistence
  • Manual test with a legacy filesystem-based workflow (no regression)

Summary by CodeRabbit

  • New Features

    • Added SQLite-backed metadata support and automatic backend detection; estimator now reads historical timing from the detected backend.
    • Workflow paths now recognize a metadata DB alongside filesystem metadata.
  • Bug Fixes / Improvements

    • More robust workflow-running and incomplete-job detection using the persistence layer.
    • Progress/loading adapts correctly between DB and filesystem metadata sources.
  • Tests

    • Comprehensive tests added for both persistence backends, detection logic, iteration, and state parsing.


coderabbitai bot commented Mar 28, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b687c160-8122-49e1-9896-dc409b0b43df

📥 Commits

Reviewing files that changed from the base of the PR and between 7794905 and 3cce606.

📒 Files selected for processing (18)
  • snakesee/estimation/data_loader.py
  • snakesee/estimation/estimator.py
  • snakesee/parser/core.py
  • snakesee/parser/metadata.py
  • snakesee/persistence/__init__.py
  • snakesee/persistence/backend.py
  • snakesee/persistence/db.py
  • snakesee/persistence/fs.py
  • snakesee/state/paths.py
  • snakesee/tui/monitor.py
  • tests/conftest.py
  • tests/db_helpers.py
  • tests/test_estimation_components.py
  • tests/test_parser.py
  • tests/test_persistence_db.py
  • tests/test_persistence_detection.py
  • tests/test_persistence_fs.py
  • tests/test_state_paths.py
✅ Files skipped from review due to trivial changes (7)
  • snakesee/persistence/__init__.py
  • tests/db_helpers.py
  • tests/test_estimation_components.py
  • tests/test_persistence_detection.py
  • tests/test_persistence_fs.py
  • snakesee/persistence/backend.py
  • tests/test_persistence_db.py
🚧 Files skipped from review as they are similar to previous changes (6)
  • tests/test_state_paths.py
  • snakesee/estimation/estimator.py
  • snakesee/estimation/data_loader.py
  • snakesee/persistence/fs.py
  • snakesee/tui/monitor.py
  • snakesee/persistence/db.py

📝 Walkthrough

Adds a pluggable persistence layer to read Snakemake metadata from either .snakemake/metadata.db (SQLite) or .snakemake/metadata/ (JSON). Introduces a PersistenceBackend protocol, DbPersistence and FsPersistence implementations, backend detection, and integrates backends into metadata loading, parser running-state checks, and the estimator.

Changes

Cohort / File(s) Summary
Persistence Abstraction & Exports
snakesee/persistence/backend.py, snakesee/persistence/__init__.py
Add PersistenceBackend Protocol, IncompleteJob dataclass, and detect_backend() factory; re-export DbPersistence, FsPersistence, IncompleteJob, PersistenceBackend, detect_backend.
SQLite Backend
snakesee/persistence/db.py
New DbPersistence read-only SQLite backend: connect to metadata.db, iterate completed metadata (with progress), iterate incomplete jobs (with optional start-time filter), detect locks, parse rows to MetadataRecord, handle DB errors.
Filesystem Backend
snakesee/persistence/fs.py
New FsPersistence filesystem backend: iterate JSON metadata via parser, detect locks/incomplete by scanning .snakemake/ dirs, decode base64 marker filenames for incomplete jobs.
Workflow Paths
snakesee/state/paths.py
Add METADATA_DB_NAME, WorkflowPaths.metadata_db property, and WorkflowPaths.has_metadata_db cached existence check.
Estimation Integration
snakesee/estimation/data_loader.py, snakesee/estimation/estimator.py
Add HistoricalDataLoader._consume_metadata_records() helper and load_from_backend(...); add TimeEstimator.load_from_backend(...) to delegate loading to HistoricalDataLoader and update code_hash_to_rules.
Parser / Running-State
snakesee/parser/core.py
Add optional backend parameter to is_workflow_running() to use backend has_locks()/has_incomplete_jobs(); parse_workflow_state() auto-detects backend and uses iterate_incomplete_jobs() when appropriate.
Metadata Input Size Utility
snakesee/parser/metadata.py
Rename _calculate_input_size() to calculate_metadata_input_size() and use it when computing input_size during metadata parsing.
TUI Integration
snakesee/tui/monitor.py
Detect backend and switch estimator metadata loading to Estimator.load_from_backend(); separate FS vs DB presence for progress reporting (DB-only uses indeterminate total).
Tests & Fixtures
tests/... (conftest.py, test_persistence_db.py, test_persistence_fs.py, test_persistence_detection.py, test_estimation_components.py, test_parser.py, test_state_paths.py, db_helpers.py)
Add SQLite-backed fixture, extensive tests for Db/Fs persistence behavior, detection logic, parser integration, estimator backend loading, and WorkflowPaths DB properties.

Sequence Diagram(s)

sequenceDiagram
    participant TUI as TUI Monitor
    participant Detector as detect_backend()
    participant Backend as PersistenceBackend
    participant Loader as HistoricalDataLoader
    participant Estimator as TimeEstimator
    participant Registry as RuleRegistry

    TUI->>Detector: detect_backend(workflow_dir)
    alt metadata.db exists
        Detector->>Backend: DbPersistence(paths)
    else
        Detector->>Backend: FsPersistence(paths)
    end
    Detector-->>TUI: backend

    TUI->>Estimator: load_from_backend(backend)
    Estimator->>Loader: loader.load_from_backend(backend, progress_cb)
    Loader->>Backend: iterate_metadata(progress_cb)
    Backend-->>Loader: MetadataRecord stream
    Loader->>Registry: record completions & build code_hash_to_rules
    Loader-->>Estimator: code_hash_to_rules assigned
    Estimator-->>TUI: metadata loaded
sequenceDiagram
    participant Parser as parse_workflow_state()
    participant Detector as detect_backend()
    participant Backend as PersistenceBackend
    participant IsRunning as is_workflow_running()

    Parser->>Detector: detect_backend(workflow_dir)
    Detector-->>Parser: backend

    Parser->>IsRunning: is_workflow_running(snakemake_dir, backend)
    IsRunning->>Backend: has_locks()
    Backend-->>IsRunning: bool
    IsRunning->>Backend: has_incomplete_jobs()
    Backend-->>IsRunning: bool
    IsRunning-->>Parser: running status

    alt is_latest_log and backend exists
        Parser->>Backend: iterate_incomplete_jobs(min_start_time)
        Backend-->>Parser: IncompleteJob stream
        Parser->>Parser: map to JobInfo objects
    else
        Parser->>Parser: fallback filesystem checks
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

🐰 I hopped through metadata, both DB and file,
I decoded markers and hashed code with a smile,
Rules tallied, durations counted true,
From logs and rows I gathered the view,
A tiny rabbit, learning every file.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Title check (✅ Passed): The title 'feat: support Snakemake SQLite DB persistence backend' clearly and specifically describes the main feature addition, matching the primary changes to add SQLite database support.
  • Description check (✅ Passed): The PR description includes a clear summary of changes, mentions issue closure (#40), provides test instructions, and follows good practice; however, the template sections (Type of Change, Checklist) are not explicitly filled out.
  • Linked Issues check (✅ Passed): The PR fully implements the objective from issue #40: adds support for reading Snakemake's SQLite persistence backend (.snakemake/metadata.db), implements the PersistenceBackend protocol with DbPersistence and FsPersistence, includes the detect_backend() factory, and integrates throughout the codebase.
  • Out of Scope Changes check (✅ Passed): All changes are directly scoped to implementing SQLite persistence backend support: persistence module abstraction, two backend implementations, integration points (loader, parser, TUI), and comprehensive tests. No extraneous changes detected.
  • Docstring Coverage (✅ Passed): Docstring coverage is 87.37%, which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



codecov bot commented Mar 28, 2026

Codecov Report

❌ Patch coverage is 88.25911% with 29 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.59%. Comparing base (b0e3a49) to head (3cce606).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
snakesee/persistence/fs.py 82.14% 10 Missing ⚠️
snakesee/parser/core.py 76.00% 6 Missing ⚠️
snakesee/persistence/db.py 94.39% 6 Missing ⚠️
snakesee/persistence/backend.py 84.37% 5 Missing ⚠️
snakesee/estimation/data_loader.py 87.50% 1 Missing ⚠️
snakesee/estimation/estimator.py 80.00% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #51      +/-   ##
==========================================
+ Coverage   88.47%   88.59%   +0.11%     
==========================================
  Files          48       52       +4     
  Lines        4684     4926     +242     
==========================================
+ Hits         4144     4364     +220     
- Misses        540      562      +22     
Files with missing lines Coverage Δ
snakesee/parser/metadata.py 96.20% <100.00%> (+8.86%) ⬆️
snakesee/persistence/__init__.py 100.00% <100.00%> (ø)
snakesee/state/paths.py 97.74% <100.00%> (+0.09%) ⬆️
snakesee/estimation/data_loader.py 86.56% <87.50%> (-0.10%) ⬇️
snakesee/estimation/estimator.py 88.61% <80.00%> (-0.16%) ⬇️
snakesee/persistence/backend.py 84.37% <84.37%> (ø)
snakesee/parser/core.py 82.44% <76.00%> (-0.40%) ⬇️
snakesee/persistence/db.py 94.39% <94.39%> (ø)
snakesee/persistence/fs.py 82.14% <82.14%> (ø)


coderabbitai bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (3)
tests/test_state_paths.py (1)

453-456: Consider adding clear_exists_cache() for test isolation.

This test doesn't clear the existence cache before checking has_metadata_db. If a previous test (in the same pytest session) cached a True result for a path that happens to match this tmp_path, the test could produce incorrect results. While tmp_path provides unique directories, adding the cache clear would be consistent with test_has_metadata_db_true_when_exists and ensure isolation.

♻️ Suggested fix
     def test_has_metadata_db_false_when_missing(self, tmp_path: Path) -> None:
         """Test has_metadata_db is False when file doesn't exist."""
+        from snakesee.state.paths import clear_exists_cache
+
+        clear_exists_cache()
         paths = WorkflowPaths(tmp_path)
         assert paths.has_metadata_db is False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_state_paths.py` around lines 453 - 456, The test
test_has_metadata_db_false_when_missing should clear the existence cache to
ensure isolation: call clear_exists_cache() before constructing
WorkflowPaths(tmp_path) so the cached result from other tests won't affect
paths.has_metadata_db; locate the test function and add a clear_exists_cache()
invocation prior to creating the WorkflowPaths instance (reference:
clear_exists_cache, WorkflowPaths, has_metadata_db,
test_has_metadata_db_false_when_missing).
tests/test_parser.py (1)

2241-2251: Consider extracting shared schema creation into a helper.

The CREATE TABLE snakemake_metadata and CREATE TABLE snakemake_locks DDL is duplicated in both tests here and in tests/conftest.py::metadata_db and tests/test_persistence_db.py. Consider extracting this to a shared helper or constant to reduce duplication and ensure schema consistency.

Also applies to: 2282-2292
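The extraction this nitpick asks for might look like the following sketch. The column set is hypothetical, chosen only to show the shape of a shared fixture helper, not the project's actual schema:

```python
import sqlite3

# Hypothetical shared DDL for test fixtures; column names are illustrative.
METADATA_SCHEMA_SQL = """
CREATE TABLE snakemake_metadata (rule TEXT, starttime REAL, endtime REAL);
CREATE TABLE snakemake_locks (id INTEGER PRIMARY KEY);
"""


def create_metadata_schema(conn: sqlite3.Connection) -> None:
    """Create both tables once, so every fixture shares one schema definition."""
    conn.executescript(METADATA_SCHEMA_SQL)
```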

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_parser.py` around lines 2241 - 2251, Duplicate DDL for tables
snakemake_metadata and snakemake_locks should be extracted into a single test
helper or constant (e.g., METADATA_SCHEMA_SQL or a function
create_metadata_schema(conn)) and referenced from all locations
(tests/test_parser.py, tests/conftest.py::metadata_db,
tests/test_persistence_db.py) so schema is defined once; implement the helper
that returns or executes the two CREATE TABLE statements and replace the inline
SQL in functions/fixtures that currently call conn.execute(...) with a call to
that helper.
snakesee/estimation/data_loader.py (1)

80-115: Extract the shared metadata-record loop.

load_from_backend() now mirrors load_from_metadata() almost line-for-line. Any future change to duration gating, wildcard handling, or code-hash grouping will need to be kept in sync manually. A small private helper that consumes an iterator of MetadataRecords would keep both code paths aligned.
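A sketch of the helper this comment proposes. The `MetadataRecord` fields here are assumptions for illustration; the real record carries more fields (wildcards, input size, timestamps):

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class MetadataRecord:
    # Field names are assumed for illustration only.
    rule: str
    code_hash: str
    duration: Optional[float]


def consume_metadata_records(
    records: Iterable[MetadataRecord],
    progress_callback: Optional[Callable[[int], None]] = None,
) -> dict[str, set[str]]:
    """Shared loop: report progress, gate on duration, group rules by code hash."""
    code_hash_to_rules: dict[str, set[str]] = {}
    for i, record in enumerate(records, start=1):
        if progress_callback is not None:
            progress_callback(i)
        if record.duration is None:  # skip incomplete entries
            continue
        code_hash_to_rules.setdefault(record.code_hash, set()).add(record.rule)
    return code_hash_to_rules
```

Both `load_from_metadata()` and `load_from_backend()` could then delegate to one such helper, so duration gating stays in sync.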

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@snakesee/estimation/data_loader.py` around lines 80 - 115, The
load_from_backend implementation duplicates the metadata-record loop logic
present in load_from_metadata; extract that shared logic into a private helper
(e.g., _consume_metadata_records or _process_metadata_iterator) that accepts an
iterator of MetadataRecord and an optional progress_callback, moves the
duration/end_time gating and wildcard handling, calls
self._registry.record_completion(rule, duration, timestamp, wildcards,
input_size) and accumulates code_hash -> set(rule) mapping, and then have both
load_from_backend and load_from_metadata call this helper and assign its
returned code_hash_to_rules to self.code_hash_to_rules.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@snakesee/persistence/backend.py`:
- Around line 92-97: detect_backend currently relies on
WorkflowPaths.has_metadata_db which uses a global exists cache and can return
stale results; replace that cached check with an immediate filesystem probe so
the function prefers the DB if it appears during a live session. Concretely, in
detect_backend (where you create paths = WorkflowPaths(workflow_dir)) stop using
paths.has_metadata_db and instead perform a direct existence check (e.g.
os.path.exists or pathlib.Path(...) .exists()) against the actual metadata.db
file path exposed by WorkflowPaths (use the attribute holding the metadata DB
path on paths), and return DbPersistence(paths) when that immediate check is
true; otherwise fall back to FsPersistence as before. Ensure you import os or
pathlib as needed.

In `@snakesee/persistence/db.py`:
- Around line 229-238: Replace the bare "except Exception" around the
orjson.loads/calculate_input_size block with targeted exception handlers: catch
orjson.JSONDecodeError for malformed JSON and TypeError (or ValueError if
applicable) for bad input to calculate_input_size, and log the errors using the
project's standard logging pattern (see snakesee/events.py and
snakesee/utils.py) before continuing; update the code around input_json,
input_files, calculate_input_size and the orjson import to reflect these
specific catches and preserve existing control flow.
- Around line 50-52: The code builds the SQLite URI by interpolating db_path
directly (uri = f"file:{db_path}?mode=ro"), which fails for paths containing
characters like ? or #; change it to percent-encode the filesystem path using
pathlib.Path.as_uri(), e.g. construct uri = Path(db_path).resolve().as_uri() +
"?mode=ro" and then call sqlite3.connect(uri, uri=True, timeout=10.0); update
the import to include pathlib.Path if needed and keep the existing variable
names (db_path, uri) and sqlite3.connect call.

In `@snakesee/persistence/fs.py`:
- Around line 55-63: The has_incomplete_jobs method currently treats any entry
in self._paths.incomplete_dir as an incomplete job; update it to mirror
iterate_incomplete_jobs() filtering: on OSError return False as before, but
otherwise iterate inc_dir.iterdir(), skip entries that are directories and skip
the special marker filename (e.g. "migration_underway" or the same constant used
by iterate_incomplete_jobs()), and return True only if any remaining entry
qualifies as an incomplete job; keep the same function name has_incomplete_jobs
and use self._paths.incomplete_dir and the same marker identifier as
iterate_incomplete_jobs().

In `@snakesee/tui/monitor.py`:
- Around line 396-407: The progress bar total is currently set from the
filesystem even when the DB backend (has_metadata_db) is used, which can make
the bar finish too early or overshoot; update the logic around progress.add_task
so the total is only provided when the filesystem backend is being used (i.e.,
has_metadata_fs is true AND has_metadata_db is false), otherwise pass None to
keep the progress bar indeterminate; adjust the total calculation that currently
uses metadata_dir.rglob()/file_count and the call to progress.add_task("Loading
metadata...", total=...) accordingly (refer to has_metadata, has_metadata_fs,
has_metadata_db, metadata_dir and the progress.add_task invocation).

---

Nitpick comments:
In `@snakesee/estimation/data_loader.py`:
- Around line 80-115: The load_from_backend implementation duplicates the
metadata-record loop logic present in load_from_metadata; extract that shared
logic into a private helper (e.g., _consume_metadata_records or
_process_metadata_iterator) that accepts an iterator of MetadataRecord and an
optional progress_callback, moves the duration/end_time gating and wildcard
handling, calls self._registry.record_completion(rule, duration, timestamp,
wildcards, input_size) and accumulates code_hash -> set(rule) mapping, and then
have both load_from_backend and load_from_metadata call this helper and assign
its returned code_hash_to_rules to self.code_hash_to_rules.

In `@tests/test_parser.py`:
- Around line 2241-2251: Duplicate DDL for tables snakemake_metadata and
snakemake_locks should be extracted into a single test helper or constant (e.g.,
METADATA_SCHEMA_SQL or a function create_metadata_schema(conn)) and referenced
from all locations (tests/test_parser.py, tests/conftest.py::metadata_db,
tests/test_persistence_db.py) so schema is defined once; implement the helper
that returns or executes the two CREATE TABLE statements and replace the inline
SQL in functions/fixtures that currently call conn.execute(...) with a call to
that helper.

In `@tests/test_state_paths.py`:
- Around line 453-456: The test test_has_metadata_db_false_when_missing should
clear the existence cache to ensure isolation: call clear_exists_cache() before
constructing WorkflowPaths(tmp_path) so the cached result from other tests won't
affect paths.has_metadata_db; locate the test function and add a
clear_exists_cache() invocation prior to creating the WorkflowPaths instance
(reference: clear_exists_cache, WorkflowPaths, has_metadata_db,
test_has_metadata_db_false_when_missing).
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 712a47a8-7270-4fd4-9d65-f74897672464

📥 Commits

Reviewing files that changed from the base of the PR and between 684359e and 1c08545.

📒 Files selected for processing (17)
  • snakesee/estimation/data_loader.py
  • snakesee/estimation/estimator.py
  • snakesee/parser/core.py
  • snakesee/parser/metadata.py
  • snakesee/persistence/__init__.py
  • snakesee/persistence/backend.py
  • snakesee/persistence/db.py
  • snakesee/persistence/fs.py
  • snakesee/state/paths.py
  • snakesee/tui/monitor.py
  • tests/conftest.py
  • tests/test_estimation_components.py
  • tests/test_parser.py
  • tests/test_persistence_db.py
  • tests/test_persistence_detection.py
  • tests/test_persistence_fs.py
  • tests/test_state_paths.py

nh13 force-pushed the nh/snakemake-db-persistence branch from 1c08545 to 7794905 on March 28, 2026 at 15:51

coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
snakesee/parser/metadata.py (1)

67-85: Keep this helper name metadata-specific.

This rename creates two public calculate_input_size() helpers with different contracts: snakesee.parser.utils.calculate_input_size(list[Path]) is already re-exported from snakesee.parser, while this one accepts raw metadata strings. That makes the wrong import easy to pick and harder to type-check. Keeping this private or renaming it to something like calculate_metadata_input_size() would avoid the collision.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@snakesee/parser/metadata.py` around lines 67 - 85, The helper
calculate_input_size in snakesee.parser.metadata conflicts with the public
utils.calculate_input_size (different contract), so rename this function to
calculate_metadata_input_size (or prefix with an underscore) and update all
internal callers in this module to use the new name; ensure it is not exported
from snakesee.parser (remove from any __all__ or re-exports) so imports won’t
accidentally pick the wrong helper and type-checking remains correct.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@snakesee/persistence/db.py`:
- Around line 179-186: The query in the db code that runs when min_start_time is
not None is treating rows with starttime IS NULL as always recent, which lets
stale incomplete stubs mark the workflow INCOMPLETE; update the SQL in the
conn.execute call that queries snakemake_metadata (the branch guarded by
min_start_time) to exclude NULL starttimes (e.g., replace the `(starttime IS
NULL OR starttime >= ?)` predicate with `starttime >= ?`) so only entries with a
real starttime >= min_start_time are considered; ensure this change is applied
where the code uses self._namespace and the same query that feeds
_determine_final_workflow_status().
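The predicate change described above, sketched against a hypothetical minimal schema (table and column names are assumptions, matching the review comment's wording):

```python
import sqlite3


def count_recent_incomplete(conn: sqlite3.Connection, min_start_time: float) -> int:
    """Count incomplete rows, excluding NULL starttimes (the fix, sketched).

    Before: "(starttime IS NULL OR starttime >= ?)" treated stale stubs with no
    starttime as always recent. After: only rows with a real starttime >=
    min_start_time qualify.
    """
    row = conn.execute(
        "SELECT COUNT(*) FROM snakemake_metadata "
        "WHERE endtime IS NULL AND starttime >= ?",
        (min_start_time,),
    ).fetchone()
    return row[0]
```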

---

Nitpick comments:
In `@snakesee/parser/metadata.py`:
- Around line 67-85: The helper calculate_input_size in snakesee.parser.metadata
conflicts with the public utils.calculate_input_size (different contract), so
rename this function to calculate_metadata_input_size (or prefix with an
underscore) and update all internal callers in this module to use the new name;
ensure it is not exported from snakesee.parser (remove from any __all__ or
re-exports) so imports won’t accidentally pick the wrong helper and
type-checking remains correct.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 386a2c03-ea9e-4c80-ad0d-2fac07df01c9

📥 Commits

Reviewing files that changed from the base of the PR and between 1c08545 and 7794905.

📒 Files selected for processing (15)
  • snakesee/estimation/data_loader.py
  • snakesee/estimation/estimator.py
  • snakesee/parser/core.py
  • snakesee/parser/metadata.py
  • snakesee/persistence/__init__.py
  • snakesee/persistence/backend.py
  • snakesee/persistence/db.py
  • snakesee/persistence/fs.py
  • snakesee/tui/monitor.py
  • tests/conftest.py
  • tests/test_estimation_components.py
  • tests/test_parser.py
  • tests/test_persistence_db.py
  • tests/test_persistence_detection.py
  • tests/test_persistence_fs.py
✅ Files skipped from review due to trivial changes (3)
  • tests/test_parser.py
  • snakesee/persistence/backend.py
  • tests/test_estimation_components.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • snakesee/persistence/__init__.py
  • snakesee/estimation/estimator.py
  • tests/test_persistence_detection.py
  • tests/test_persistence_fs.py

nh13 force-pushed the nh/snakemake-db-persistence branch from 7794905 to be89b00 on March 28, 2026 at 21:31
Adds support for reading workflow metadata from Snakemake's new SQLite
persistence backend (.snakemake/metadata.db) in addition to the existing
filesystem-based backend.

- New snakesee/persistence/ module with PersistenceBackend protocol
- FsPersistence wraps existing filesystem metadata/incomplete/lock reading
- DbPersistence reads from SQLite with read-only access and namespace filtering
- detect_backend() auto-selects based on whether metadata.db exists (prefers DB)
- Wired into HistoricalDataLoader, parse_workflow_state, is_workflow_running,
  and TUI metadata loading
- 42 new tests across 4 test files

Closes #40
nh13 force-pushed the nh/snakemake-db-persistence branch from be89b00 to 3cce606 on March 28, 2026 at 21:32
nh13 merged commit 4053236 into main on Mar 28, 2026
8 checks passed
nh13 deleted the nh/snakemake-db-persistence branch on March 28, 2026 at 22:30


Development

Successfully merging this pull request may close these issues.

Support for new Snakemake DB persistence backend
