feat: add rfc for checkpoints by vitalii-dynamiq · Pull Request #513 · dynamiq-ai/dynamiq

vitalii-dynamiq · 2026-01-07T12:28:46Z

Note

Low Risk
Documentation-only change with no runtime or library behavior modifications; risk is limited to potential design misinterpretation until implemented.

Overview
Adds a new documentation-only RFC (RFC-001) proposing opt-in checkpoint/resume for Dynamiq workflows, including HITL-aware PENDING_INPUT pausing and resume semantics.

The RFC is split into multiple detailed docs covering industry research, runtime/HITL integration and API/schema implications, node-by-node state requirements, proposed Pydantic data models/protocols, and pluggable persistence backends (file/SQLite/Redis/Postgres) with retention/cleanup guidance.

Written by Cursor Bugbot for commit 8f08787. This will update automatically on new commits.

cursor · 2026-01-07T12:34:24Z

docs/rfc-001-checkpoint-resume/06-STORAGE-BACKENDS.md

+                    cutoff = datetime.utcnow()
+                    cutoff = cutoff.replace(
+                        day=cutoff.day - older_than_days
+                    )


Incorrect date arithmetic causes ValueError in cleanup

Medium Severity

The SQLite backend's cleanup method uses cutoff.replace(day=cutoff.day - older_than_days) to calculate a date cutoff, but datetime.replace() expects a valid day value (1-31). If the current day minus older_than_days results in zero or negative (e.g., January 5th with older_than_days=10 yields -5), Python raises ValueError: day is out of range for month. The correct approach is to use datetime.utcnow() - timedelta(days=older_than_days) for date arithmetic.
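The suggested fix can be sketched as follows. This is a minimal illustration, not the RFC's code: the helper name `cleanup_cutoff` and its `now` parameter are hypothetical, and the default uses `datetime.now(timezone.utc)` in place of `datetime.utcnow()` because the latter is deprecated as of Python 3.12. If the backend stores naive timestamps, the comparison would need a naive cutoff instead.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def cleanup_cutoff(older_than_days: int, now: Optional[datetime] = None) -> datetime:
    """Return the timestamp before which checkpoints count as stale.

    Hypothetical helper: subtracting a timedelta handles month and year
    boundaries, unlike replace(day=...), which only accepts a valid day.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - timedelta(days=older_than_days)


# The case from the review comment: January 5th with older_than_days=10.
cutoff = cleanup_cutoff(10, now=datetime(2026, 1, 5, tzinfo=timezone.utc))
print(cutoff.date())  # 2025-12-26
```

The cutoff crosses into the previous year without any special handling, which is the case where the `replace(day=...)` version raises.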

cursor · 2026-01-07T12:34:24Z

docs/rfc-001-checkpoint-resume/04-NODE-ANALYSIS.md

+                            "storage_id": file_id,
+                        }
+                else:
+                    result["files"][fname] = value


Wrong variable assigned in file serialization loop

Medium Severity

In _serialize_tool_output, when iterating over value.items() to get fname, fdata pairs, the else branch incorrectly assigns value (the entire files dictionary) instead of fdata (the individual file's data). This would cause each file entry to contain the entire files dictionary rather than its own content, corrupting the serialized output.
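A corrected loop might look like the sketch below. Only the `else` branch and the `"storage_id"` key come from the quoted diff; the function name `serialize_files`, the `store_blob` callback, the `bytes` check, and the `"type"` key are assumptions made so the example is self-contained.

```python
from typing import Any, Callable, Dict


def serialize_files(
    files: Dict[str, Any], store_blob: Callable[[bytes], str]
) -> Dict[str, Any]:
    """Serialize a mapping of file name to file data (illustrative only)."""
    result: Dict[str, Any] = {"files": {}}
    for fname, fdata in files.items():
        if isinstance(fdata, bytes):
            # Assumed behavior: binary payloads are offloaded and only a
            # reference is kept in the checkpoint.
            file_id = store_blob(fdata)
            result["files"][fname] = {"type": "file_ref", "storage_id": file_id}
        else:
            # Store this file's own data (fdata), not the enclosing
            # dictionary (value in the original snippet).
            result["files"][fname] = fdata
    return result
```

With this change each entry holds only its own content, so the serialized output no longer nests the whole files dictionary under every file name.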

cursor · 2026-01-14T09:47:06Z

docs/rfc-001-checkpoint-resume/04-NODE-ANALYSIS.md

+
+        # === CRITICAL: Set resume loop ===
+        # Agent._run_agent will start from this loop instead of 1
+        self._resume_from_loop = state.get("current_loop", 0)


Agent resume starts from completed loop instead of next

Medium Severity

The RFC example code has an off-by-one error in agent loop resume logic. _resume_from_loop is set to state.get("current_loop", 0), but checkpoints are saved AFTER each loop iteration completes. When resuming, range(start_loop, max_loops + 1) re-executes the already-completed loop. The conversation history already contains that loop's messages (restored from checkpoint), so this would cause duplicate LLM calls and potential message duplication. The fix is state.get("current_loop", 0) + 1 to skip the completed loop.

Additional Locations (1)

docs/rfc-001-checkpoint-resume/04-NODE-ANALYSIS.md#L362-L365
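The off-by-one can be illustrated in isolation. The helper `loops_to_run` is hypothetical; it assumes, as the comment states, that loops are numbered from 1 and that `current_loop` records the last loop that finished before the checkpoint was written.

```python
from typing import Any, Dict


def loops_to_run(state: Dict[str, Any], max_loops: int) -> range:
    """Return the loop numbers still to execute after restoring a checkpoint."""
    # Checkpoints are assumed to be saved after a loop completes, so resume
    # from the loop after the recorded one. With no checkpoint the default
    # of 0 yields a start of 1, matching a fresh run.
    start_loop = state.get("current_loop", 0) + 1
    return range(start_loop, max_loops + 1)


print(list(loops_to_run({"current_loop": 3}, max_loops=5)))  # [4, 5]
print(list(loops_to_run({}, max_loops=3)))  # [1, 2, 3]
```

If the checkpoint were instead written before a loop runs, the `+ 1` would skip work, so the fix depends on when the save actually happens.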

cursor · 2026-01-14T09:47:06Z

docs/rfc-001-checkpoint-resume/04-NODE-ANALYSIS.md

+            except (TypeError, ValueError):
+                # Large or non-serializable results: store truncated
+                if isinstance(value, str) and len(value) > 10000:
+                    serialized[cache_key] = value[:10000] + "...[truncated]"


Non-serializable tool cache entries silently dropped during checkpoint

Low Severity

In _serialize_tool_cache, when a tool result value is not JSON-serializable and is not a large string, the entry is silently dropped with no fallback or warning. On resume, these missing cache entries would cause the corresponding tools to be re-executed unnecessarily, contradicting the RFC's goal of using the tool cache to "skip re-executing identical tool calls on resume." Non-serializable results (e.g., custom objects) are simply not added to serialized.
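One possible way to avoid the silent drop is to log a warning and store an explicit placeholder, as in the sketch below. The function name, the `__non_serializable__` marker, and the `MAX_CACHED_CHARS` constant are assumptions for illustration; the RFC may prefer a different fallback, such as failing the checkpoint outright.

```python
import json
import logging
from typing import Any, Dict

logger = logging.getLogger(__name__)
MAX_CACHED_CHARS = 10_000  # mirrors the 10000-character limit in the quoted diff


def serialize_tool_cache(cache: Dict[str, Any]) -> Dict[str, Any]:
    """Serialize tool results, keeping a marker for values JSON cannot encode."""
    serialized: Dict[str, Any] = {}
    for cache_key, value in cache.items():
        try:
            json.dumps(value)  # probe only; the value itself is stored
            serialized[cache_key] = value
        except (TypeError, ValueError):
            text = repr(value)
            if len(text) > MAX_CACHED_CHARS:
                text = text[:MAX_CACHED_CHARS] + "...[truncated]"
            logger.warning(
                "Tool cache entry %r is not JSON-serializable; storing a placeholder",
                cache_key,
            )
            # The marker lets resume logic decide explicitly whether to
            # re-execute the tool instead of discovering a missing key.
            serialized[cache_key] = {"__non_serializable__": True, "repr": text}
    return serialized
```

A placeholder does not restore the original object, so the tool may still need to run again on resume, but the decision becomes visible in logs and in the checkpoint itself.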


github-actions · 2026-01-14T09:50:40Z

Coverage Report

File	Stmts	Miss	Cover	Missing
TOTAL	23616	7384	68%

report-only-changed-files is enabled. No files were changed during this commit :)

Tests	Skipped	Failures	Errors	Time
1288	38 💤	0 ❌	0 🔥	9m 22s ⏱️

acoola · 2026-03-05T18:07:51Z

This will be closed after #566 is merged.

vitalii-dynamiq requested a review from a team as a code owner January 7, 2026 12:28

cursor bot reviewed Jan 7, 2026


cursor bot reviewed Jan 14, 2026


acoola force-pushed the add-checkpoints branch from ec64e12 to d912a41 on January 20, 2026 18:50

feat: add rfc for checkpoints

8f08787

acoola force-pushed the add-checkpoints branch from d912a41 to 8f08787 on February 17, 2026 10:15

acoola added the "on hold" label (work currently on hold) Mar 5, 2026

vitalii-dynamiq closed this Mar 13, 2026

vitalii-dynamiq wants to merge 1 commit into main from add-checkpoints
