Conversation
| cutoff = datetime.utcnow() | ||
| cutoff = cutoff.replace( | ||
| day=cutoff.day - older_than_days | ||
| ) |
There was a problem hiding this comment.
Incorrect date arithmetic causes ValueError in cleanup
Medium Severity
The SQLite backend's cleanup method uses cutoff.replace(day=cutoff.day - older_than_days) to calculate a date cutoff, but datetime.replace() expects a valid day value (1-31). If the current day minus older_than_days results in zero or negative (e.g., January 5th with older_than_days=10 yields -5), Python raises ValueError: day is out of range for month. The correct approach is to use datetime.utcnow() - timedelta(days=older_than_days) for date arithmetic.
| "storage_id": file_id, | ||
| } | ||
| else: | ||
| result["files"][fname] = value |
There was a problem hiding this comment.
Wrong variable assigned in file serialization loop
Medium Severity
In _serialize_tool_output, when iterating over value.items() to get fname, fdata pairs, the else branch incorrectly assigns value (the entire files dictionary) instead of fdata (the individual file's data). This would cause each file entry to contain the entire files dictionary rather than its own content, corrupting the serialized output.
|
|
||
| # === CRITICAL: Set resume loop === | ||
| # Agent._run_agent will start from this loop instead of 1 | ||
| self._resume_from_loop = state.get("current_loop", 0) |
There was a problem hiding this comment.
Agent resume starts from completed loop instead of next
Medium Severity
The RFC example code has an off-by-one error in agent loop resume logic. _resume_from_loop is set to state.get("current_loop", 0), but checkpoints are saved AFTER each loop iteration completes. When resuming, range(start_loop, max_loops + 1) re-executes the already-completed loop. The conversation history already contains that loop's messages (restored from checkpoint), so this would cause duplicate LLM calls and potential message duplication. The fix is state.get("current_loop", 0) + 1 to skip the completed loop.
Additional Locations (1)
| except (TypeError, ValueError): | ||
| # Large or non-serializable results: store truncated | ||
| if isinstance(value, str) and len(value) > 10000: | ||
| serialized[cache_key] = value[:10000] + "...[truncated]" |
There was a problem hiding this comment.
Non-serializable tool cache entries silently dropped during checkpoint
Low Severity
In _serialize_tool_cache, when a tool result value is not JSON-serializable and is not a large string, the entry is silently dropped with no fallback or warning. On resume, these missing cache entries would cause the corresponding tools to be re-executed unnecessarily, contradicting the RFC's goal of using the tool cache to "skip re-executing identical tool calls on resume." Non-serializable results (e.g., custom objects) are simply not added to serialized.
ec64e12 to
d912a41
Compare
d912a41 to
8f08787
Compare
|
would be closed after #566 merge |
Note
Low Risk
Documentation-only change with no runtime or library behavior modifications; risk is limited to potential design misinterpretation until implemented.
Overview
Adds a new documentation-only RFC (
RFC-001) proposing opt-in checkpoint/resume for Dynamiq workflows, including HITL-awarePENDING_INPUTpausing and resume semantics.The RFC is split into multiple detailed docs covering industry research, runtime/HITL integration and API/schema implications, node-by-node state requirements, proposed Pydantic data models/protocols, and pluggable persistence backends (file/SQLite/Redis/Postgres) with retention/cleanup guidance.
Written by Cursor Bugbot for commit 8f08787. This will update automatically on new commits. Configure here.