Add graph-level webhooks for execution failure and completion #607

Brijesh-Thakkar · 2025-12-16T07:53:32Z

This PR adds support for graph-level webhooks to notify external systems
about graph execution lifecycle events.

Key changes:

- Introduces a WebhookConfig model for graph templates
- Dispatches a STATE_EXECUTED webhook on successful execution
- Dispatches a GRAPH_FAILED webhook when a graph fails and no retries remain
- Executes webhooks via BackgroundTasks to avoid blocking the request flow

This improves observability and enables external orchestration.

Fixes #445

coderabbitai · 2025-12-16T07:53:42Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Summary by CodeRabbit

New Features
- Graphs can include optional webhook configurations to receive GRAPH_FAILED notifications with error context.
- Failed-state webhooks are scheduled as background tasks so notifications don't block processing.
- Execution now better handles states that produce multiple outputs, creating subsequent states as needed.
Chores
- Added a resilient, best-effort webhook dispatcher for sending JSON notifications.

Walkthrough

Adds webhook support: new WebhookConfig model, optional webhook field on GraphTemplate, a best-effort async webhook dispatcher, and conditional scheduling of a GRAPH_FAILED webhook from the errored_state controller via BackgroundTasks when appropriate.

Changes

Cohort / File(s)	Change Summary
Webhook Configuration Models `state-manager/app/models/webhook_config_model.py`, `state-manager/app/models/db/graph_template_model.py`	New `WebhookConfig` Pydantic model (`url`, `events`, `headers`). `GraphTemplate` gains optional `webhook: WebhookConfig
Error State Webhook Dispatch `state-manager/app/controller/errored_state.py`	`errored_state` signature adds optional `background_tasks: BackgroundTasks
Webhook Dispatch Utility `state-manager/app/tasks/webhook.py`	New async `dispatch_webhook(*, url, payload, headers)` that POSTs JSON with a 5s timeout using `httpx`, logs warnings on exceptions, and swallows errors (best-effort).
Output State Creation Refactoring `state-manager/app/controller/executed_state.py`	Reworked multi-output handling: explicit `next_state_ids: list[PydanticObjectId]`, separate branches for zero vs non-zero outputs, build `new_states` list and batch `insert_many`, then delegate next-state processing as a background task; logging/returns made multiline and explicit.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Controller as errored_state\n(Controller)
    participant DB as Database\n(GraphTemplate)
    participant BG as Background\nTasks
    participant Webhook as Webhook\nEndpoint

    Client->>Controller: POST error notification
    Controller->>DB: Load GraphTemplate (check `webhook`)
    DB-->>Controller: GraphTemplate with/without webhook

    alt webhook configured AND "GRAPH_FAILED" in events AND no retry created
        Controller->>BG: Schedule dispatch_webhook(payload, headers)
        Controller-->>Client: Return ErroredResponseModel
        BG->>Webhook: POST JSON payload (5s timeout)
        alt 2xx
            Webhook-->>BG: Success
            Note over BG: Delivery logged
        else error/timeout
            Webhook-->>BG: Error
            Note over BG: Warning logged, error swallowed
        end
    else otherwise
        Controller-->>Client: Return ErroredResponseModel (no webhook)
    end
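
The branch condition in the diagram can be sketched as a small pure function. This is an illustration only: the `WebhookConfig` dataclass below is a stand-in for the PR's Pydantic model, and `should_dispatch_graph_failed` is a hypothetical helper name, not something taken from the PR.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WebhookConfig:
    """Stand-in for the PR's Pydantic model (url, events, headers)."""
    url: str
    events: list[str]
    headers: dict[str, str] = field(default_factory=dict)


def should_dispatch_graph_failed(
    webhook: Optional[WebhookConfig], retry_created: bool
) -> bool:
    """Return True only when all three conditions from the diagram hold."""
    if retry_created:
        # A retry state was created, so the graph has not finally failed.
        return False
    if webhook is None:
        # The graph template has no webhook configured.
        return False
    # The webhook must be subscribed to this event.
    return "GRAPH_FAILED" in webhook.events
```

Keeping the check separate from the controller makes each branch easy to test without a database or HTTP client.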

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Pay attention to: conditional logic in errored_state.py (retry check, event filtering, payload contents, timestamp correctness).
Verify dispatch_webhook timeout, exception handling, and header propagation.
Review executed_state.py batch insert correctness and next_state_ids aggregation.
Check WebhookConfig field validation and that GraphTemplate usage sites handle the new optional field.

Possibly related PRs

Fixes #231 — overlapping changes in executed_state.py multi-output handling and state initialization.
Adding background task to create next state #175 — related background task usage for next-state processing in executed_state.py.
Fixing Parent ID #235 — similar adjustments to how child/next states are constructed in executed_state.py.

Suggested reviewers

nk-ag

"I hopped through logs at dawn,
A webhook bell I lightly drawn,
When graphs fall from their tree,
I whisper what I see —
A carrot-coded signal carries on!" 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: adding graph-level webhooks for execution failure and completion events.
Description check	✅ Passed	The description is directly related to the changeset, explaining webhook support for graph execution lifecycle events with specific implementation details.
Linked Issues check	✅ Passed	The PR fully addresses issue #445 by implementing WebhookConfig model, dispatching webhooks on execution events, and supporting configurable webhooks at graph level.
Out of Scope Changes check	✅ Passed	All changes are in scope: webhook model and infrastructure in executed/errored state handlers align with the objective to enable configurable graph-level webhook notifications.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a82709e and 125ef16.

📒 Files selected for processing (1)

state-manager/app/controller/executed_state.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

state-manager/app/controller/executed_state.py (4)

state-manager/app/models/executed_models.py (2)

ExecutedRequestModel (5-6)

ExecutedResponseModel (9-10)

state-manager/app/models/db/state.py (2)

State (13-106)

insert_many (57-63)

state-manager/app/models/state_status_enum.py (1)

StateStatusEnum (4-20)

state-manager/app/tasks/create_next_states.py (1)

create_next_states (89-250)

🔇 Additional comments (7)

state-manager/app/controller/executed_state.py (7)

1-10: Imports are clean and appropriate.

All imports are used, and the previously flagged unused datetime import has been removed.

13-19: Function signature is well-structured.

The explicit typing and required BackgroundTasks parameter align with FastAPI best practices. The return type annotation provides clear contract.

26-37: State validation is correct.

The check for state existence and status validation use appropriate HTTP status codes. The defensive not state.id check provides extra safety.

39-47: Zero outputs handling is correct.

When no outputs are provided, the state is properly marked as EXECUTED with empty outputs and persisted before being added to the next state IDs list.

81-89: Background task scheduling is correctly implemented.

The create_next_states invocation matches the expected signature and properly delegates state processing without blocking the HTTP response.

91-93: Return statement is correct.

The response properly uses ExecutedResponseModel with the appropriate status enum.

95-101: Exception handling provides good observability.

Logging errors with the request ID before re-raising ensures traceability while preserving the original exception behavior.

gemini-code-assist · 2025-12-16T07:53:54Z

Summary of Changes

Hello @Brijesh-Thakkar, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the observability and integration capabilities of the system by introducing graph-level webhooks. Graph templates can now be configured to send notifications to external systems upon successful execution or failure, providing real-time updates on their lifecycle. The implementation ensures that these notifications are dispatched asynchronously, maintaining system responsiveness, and includes a more robust approach to cleaning up old trigger data.

Highlights

Graph-Level Webhooks: Introduced support for webhooks at the graph template level to notify external systems about graph execution lifecycle events.
Execution Failure Notification: A GRAPH_FAILED webhook is now dispatched when a graph execution fails and no retries are remaining.
Webhook Configuration Model: A new WebhookConfig model has been added to define webhook URL, subscribed events, and optional headers.
Asynchronous Dispatch: Webhooks are dispatched asynchronously using BackgroundTasks to prevent blocking the main request flow.
Trigger Cleanup Enhancement: Improved the cleanup mechanism for old CRON triggers by marking them as CANCELLED with an expires_at timestamp instead of immediate deletion, leveraging MongoDB's TTL feature.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces a valuable feature for observability by adding support for graph-level webhooks, specifically for graph failure events. The implementation is mostly solid, with a robust background task for dispatching webhooks and a significant improvement in how old CRON triggers are handled (moving from deletion to a TTL-based cancellation). However, there are a couple of critical issues. The new webhook configuration isn't being saved to the database when a graph template is created or updated, which prevents the feature from working as intended. Also, the handling of BackgroundTasks in errored_state could lead to blocking calls. I've added specific comments with suggestions to fix these. Additionally, the PR description mentions a webhook for successful graph execution (STATE_EXECUTED), but this does not appear to be implemented yet.

state-manager/app/controller/upsert_graph_template.py

state-manager/app/controller/errored_state.py

coderabbitai

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

state-manager/app/tasks/trigger_cron.py (1)

86-95: Clarify the comment about enum serialization and address code inconsistency.

Line 93 uses TriggerStatusEnum.PENDING directly (without .value) with a comment claiming "insert() converts." However, other functions in this file consistently use .value when updating trigger_status (lines 24, 46, 62, 112).

The comment is technically correct but misleading. When an enum class inherits from str, Pydantic serializes its members as plain strings. Beanie Documents are built on Pydantic's BaseModel, so the conversion is performed by Pydantic's serialization rather than by insert() itself.

For consistency, either use .value explicitly throughout the file (as in other methods) or replace the comment with a clearer explanation: "OK because TriggerStatusEnum inherits from str and Pydantic serializes it automatically."
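
A minimal, dependency-free sketch of why a str-based enum behaves like its value. The member names follow the review, but their string values are assumed here, and the example uses only the standard library rather than Pydantic or Beanie.

```python
import json
from enum import Enum


class TriggerStatusEnum(str, Enum):
    # Member names follow the review; the string values are assumptions.
    PENDING = "PENDING"
    CANCELLED = "CANCELLED"


# A str-mixin enum member is itself a str and compares equal to its value.
is_plain_string = isinstance(TriggerStatusEnum.PENDING, str)
equals_value = TriggerStatusEnum.PENDING == "PENDING"

# Serializers that accept str therefore emit the plain value.
encoded = json.dumps({"trigger_status": TriggerStatusEnum.PENDING})
```

Using `.value` everywhere remains the more explicit option, since it does not rely on the reader knowing about the str mixin.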

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9aa2e1a and 6f9c901.

📒 Files selected for processing (10)

state-manager/app/config/settings.py (1 hunks)
state-manager/app/controller/errored_state.py (2 hunks)
state-manager/app/controller/executed_state.py (1 hunks)
state-manager/app/controller/upsert_graph_template.py (3 hunks)
state-manager/app/models/db/graph_template_model.py (3 hunks)
state-manager/app/models/db/trigger.py (1 hunks)
state-manager/app/models/webhook_config_model.py (1 hunks)
state-manager/app/tasks/init_tasks.py (1 hunks)
state-manager/app/tasks/trigger_cron.py (5 hunks)
state-manager/app/tasks/webhook.py (1 hunks)

🧰 Additional context used

🧠 Learnings (4)

📚 Learning: 2025-10-22T14:23:18.774Z

Learnt from: NiveditJain
Repo: exospherehost/exospherehost PR: 464
File: state-manager/app/tasks/init_tasks.py:16-20
Timestamp: 2025-10-22T14:23:18.774Z
Learning: In state-manager/app/tasks/init_tasks.py, the init_tasks() function intentionally uses asyncio.gather with a list pattern (even with a single task) to provide an extensible structure for easily adding future startup tasks without refactoring.

Applied to files:

state-manager/app/tasks/init_tasks.py

📚 Learning: 2025-09-28T13:35:42.862Z

Learnt from: NiveditJain
Repo: exospherehost/exospherehost PR: 428
File: state-manager/app/tasks/verify_graph.py:4-5
Timestamp: 2025-09-28T13:35:42.862Z
Learning: In the cron trigger scheduling logic for state-manager/app/tasks/verify_graph.py, the system intentionally schedules at least one trigger beyond the trigger_ahead_time window to ensure continuity of scheduled executions. The current logic of appending an event then breaking is by design to guarantee "at least one next" trigger.

Applied to files:

state-manager/app/tasks/trigger_cron.py

📚 Learning: 2025-08-03T16:46:04.030Z

Learnt from: NiveditJain
Repo: exospherehost/exospherehost PR: 132
File: state-manager/app/controller/upsert_graph_template.py:11-27
Timestamp: 2025-08-03T16:46:04.030Z
Learning: In the exospherehost codebase, for upsert operations on graph templates, the team prioritizes API idempotency over avoiding race conditions in the database layer implementation. The approach of separate find and insert/update operations is acceptable when the API behavior remains consistent.

Applied to files:

state-manager/app/controller/upsert_graph_template.py

📚 Learning: 2025-08-06T14:04:44.515Z

Learnt from: NiveditJain
Repo: exospherehost/exospherehost PR: 158
File: state-manager/app/controller/upsert_graph_template.py:49-49
Timestamp: 2025-08-06T14:04:44.515Z
Learning: In the exospherehost codebase, the team prefers to run graph template verification on every upsert operation rather than optimizing by skipping verification when templates haven't changed, prioritizing comprehensive validation over performance optimization.

Applied to files:

state-manager/app/controller/upsert_graph_template.py

🧬 Code graph analysis (6)

state-manager/app/models/db/graph_template_model.py (1)

state-manager/app/models/webhook_config_model.py (1)

WebhookConfig (5-11)

state-manager/app/tasks/init_tasks.py (4)

state-manager/app/config/settings.py (1)

get_settings (34-38)

state-manager/app/models/db/trigger.py (1)

DatabaseTriggers (9-53)

state-manager/app/models/trigger_models.py (1)

TriggerStatusEnum (9-14)

state-manager/app/singletons/logs_manager.py (2)

LogsManager (9-66)

get_logger (65-66)

state-manager/app/tasks/trigger_cron.py (2)

state-manager/app/models/db/trigger.py (1)

DatabaseTriggers (9-53)

state-manager/app/models/trigger_models.py (1)

TriggerStatusEnum (9-14)

state-manager/app/controller/errored_state.py (3)

state-manager/app/tasks/webhook.py (1)

dispatch_webhook (10-32)

state-manager/app/singletons/logs_manager.py (2)

LogsManager (9-66)

get_logger (65-66)

state-manager/app/models/errored_models.py (2)

ErroredRequestModel (5-6)

ErroredResponseModel (9-11)

state-manager/app/models/db/trigger.py (1)

state-manager/app/models/trigger_models.py (1)

TriggerStatusEnum (9-14)

state-manager/app/controller/upsert_graph_template.py (5)

state-manager/app/models/db/graph_template_model.py (3)

GraphTemplate (19-349)

set_secrets (271-273)

get_secrets (275-278)

state-manager/app/tasks/verify_graph.py (1)

verify_graph (134-166)

state-manager/app/models/db/trigger.py (1)

DatabaseTriggers (9-53)

state-manager/app/models/trigger_models.py (2)

TriggerStatusEnum (9-14)

TriggerTypeEnum (6-7)

state-manager/app/config/settings.py (1)

get_settings (34-38)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Agent

🔇 Additional comments (11)

state-manager/app/models/db/trigger.py (1)

43-52: LGTM! TTL index properly extended for CANCELLED triggers.

The addition of CANCELLED to the TTL cleanup policy aligns with the new soft-cancellation workflow introduced in trigger_cron.py. This enables automatic cleanup of cancelled triggers after the retention period expires.

state-manager/app/models/db/graph_template_model.py (1)

17-17: LGTM! Webhook field properly added to GraphTemplate.

The addition of the optional webhook field with the WebhookConfig type enables graph-level webhook configuration as intended by the PR. The field is appropriately optional and well-documented.

Also applies to: 29-29

state-manager/app/tasks/trigger_cron.py (1)

52-75: LGTM! Cancellation helpers properly implement soft-delete with TTL.

The mark_as_cancelled and cancel_trigger functions correctly implement the soft-cancellation workflow:

Sets trigger_status to CANCELLED using .value for MongoDB compatibility

Sets expires_at for TTL-based automatic cleanup

Aligns with the extended TTL index in trigger.py

The wrapper pattern (cancel_trigger → mark_as_cancelled) provides a clean public API for other modules to use.
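
The soft-cancel update described above could look roughly like the following. This is a sketch under assumptions: `build_cancel_update` is a hypothetical helper, the field names `trigger_status` and `expires_at` come from the review text, and the actual `mark_as_cancelled` in the PR may be structured differently.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def build_cancel_update(
    retention_hours: int, now: Optional[datetime] = None
) -> dict:
    """Build a MongoDB-style $set document that soft-cancels a trigger.

    The trigger is not deleted. It is marked CANCELLED and given an
    expires_at timestamp so that a TTL index can remove it later.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "$set": {
            "trigger_status": "CANCELLED",
            "expires_at": now + timedelta(hours=retention_hours),
        }
    }
```

Passing `now` explicitly keeps the helper deterministic in tests; in production code the default of the current UTC time would normally be used.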

state-manager/app/tasks/init_tasks.py (2)

1-11: LGTM! Imports and logger setup are appropriate.

The imports cover all necessary dependencies for datetime operations, settings access, database models, and logging. The module-level logger initialization follows the codebase pattern.

51-54: LGTM!

The asyncio.gather pattern with a single task is intentionally maintained for extensibility, consistent with the codebase convention. Based on learnings, this structure allows easily adding future startup tasks without refactoring.

state-manager/app/controller/upsert_graph_template.py (6)

1-14: LGTM!

Imports are well-organized and include all necessary dependencies for the new functionality (datetime utilities for TTL calculation, get_settings for runtime configuration).

17-32: LGTM!

Loading settings at request time ensures runtime configuration changes (like trigger_retention_hours) are picked up without requiring a restart. The function signature correctly includes the request ID for traceability.

34-72: LGTM!

The upsert logic correctly handles both update and insert paths. Capturing old_triggers before the update ensures proper cleanup of stale CRON triggers. Based on learnings, the separate find and insert/update approach is acceptable for maintaining API idempotency.

73-82: LGTM!

The ValueError handling correctly logs the error with request context and returns an appropriate 400 status code for validation failures.

84-115: LGTM! Correct use of soft cancellation for PENDING triggers.

The bulk update correctly targets only PENDING triggers (those scheduled but not yet fired), which is appropriate when replacing triggers during upsert. This differs intentionally from init_tasks.py which targets TRIGGERED/FAILED legacy documents for cleanup.

The namespace filter ("namespace": namespace_name) ensures isolation between namespaces—good defensive practice.

117-141: LGTM!

The response correctly masks secret values (returning True for presence rather than actual values), which is a good security practice. The outer exception handler appropriately logs with context and re-raises to preserve the original traceback.

state-manager/app/controller/errored_state.py

state-manager/app/controller/executed_state.py

state-manager/app/models/db/graph_template_model.py

state-manager/app/models/webhook_config_model.py

state-manager/app/tasks/init_tasks.py

state-manager/app/tasks/webhook.py

Copilot

Pull request overview

This PR introduces graph-level webhook support for notifying external systems about graph execution lifecycle events, specifically for failure scenarios. However, the implementation is incomplete as described below.

Key Changes

Added WebhookConfig model and webhook field to GraphTemplate for storing webhook configurations
Implemented GRAPH_FAILED webhook dispatch in errored_state.py when a graph fails with no retries remaining
Refactored trigger management to use CANCELLED status instead of deleting triggers, enabling TTL-based cleanup

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 11 comments.

Show a summary per file

File	Description
state-manager/app/tasks/webhook.py	New webhook dispatch function with best-effort delivery semantics
state-manager/app/models/webhook_config_model.py	New WebhookConfig model defining URL, events, and optional headers
state-manager/app/models/db/graph_template_model.py	Added optional webhook field to GraphTemplate; minor spacing issue
state-manager/app/controller/upsert_graph_template.py	Refactored to cancel old triggers instead of deleting; webhook not saved to database
state-manager/app/controller/errored_state.py	Added GRAPH_FAILED webhook dispatch; uses deprecated datetime method
state-manager/app/controller/executed_state.py	Code formatting improvements; missing STATE_EXECUTED webhook
state-manager/app/tasks/trigger_cron.py	Fixed enum value usage; added cancel helper functions
state-manager/app/tasks/init_tasks.py	Changed to mark old triggers as CANCELLED instead of deleting
state-manager/app/models/db/trigger.py	Updated TTL index to include CANCELLED status
state-manager/app/config/settings.py	Minor whitespace formatting

Comments suppressed due to low confidence (1)

state-manager/app/tasks/webhook.py:2

Import of 'datetime' is not used.

from datetime import datetime

state-manager/app/tasks/webhook.py

state-manager/app/models/db/graph_template_model.py

state-manager/app/controller/upsert_graph_template.py

state-manager/app/controller/errored_state.py

state-manager/app/tasks/webhook.py

state-manager/app/tasks/trigger_cron.py

state-manager/app/controller/errored_state.py

state-manager/app/tasks/webhook.py

state-manager/app/controller/executed_state.py

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (8)

state-manager/app/models/webhook_config_model.py (1)

5-11: Add validation for webhook configuration fields.

The model lacks essential validation that could prevent runtime errors:

URL validation: The url field accepts any string. Use Pydantic's HttpUrl type or add a validator to ensure it's a valid URL.

Event validation: The events list accepts arbitrary strings. Consider using an enum or validator to restrict to known event types (e.g., GRAPH_FAILED, STATE_EXECUTED).

Headers security: No documentation warns that headers may contain sensitive authentication tokens that could be logged or exposed.

Based on past review comments, these validation gaps remain unresolved.
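
The checks suggested above can be sketched without any dependencies. In the real model they would more likely be expressed with Pydantic's HttpUrl type and an enum-typed field; the class names `WebhookEvent` and `ValidatedWebhookConfig` below are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from urllib.parse import urlparse


class WebhookEvent(str, Enum):
    # The two event names mentioned in the review.
    GRAPH_FAILED = "GRAPH_FAILED"
    STATE_EXECUTED = "STATE_EXECUTED"


@dataclass
class ValidatedWebhookConfig:
    url: str
    events: list
    # Headers may carry auth tokens, so they should never be logged.
    headers: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        parsed = urlparse(self.url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"invalid webhook URL: {self.url!r}")
        # WebhookEvent(...) raises ValueError for unknown event names.
        self.events = [WebhookEvent(event) for event in self.events]
```

Rejecting bad values when the template is saved means a misconfigured webhook fails at configuration time instead of silently at dispatch time.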

state-manager/app/models/db/graph_template_model.py (1)

323-323: Missing space after comma.

The find_one call lacks a space after the comma between arguments, which is inconsistent with PEP 8 style guidelines.

state-manager/app/tasks/webhook.py (3)

2-2: Remove unused import.

The datetime import is not used anywhere in this file.

7-7: Consider using LogsManager for consistency.

This module uses logging.getLogger(__name__) while other modules in the codebase (e.g., errored_state.py, trigger_cron.py) use LogsManager().get_logger() for structured logging. Using LogsManager would ensure consistent log formatting and configuration across the codebase.

Based on past review comments highlighting consistency concerns.

10-26: Consider making webhook timeout configurable and logging response status.

Two recommendations:

Hardcoded timeout: The 5-second timeout may not suit all webhook endpoints. Consider accepting a configurable timeout parameter (with 5 seconds as default) or reading from application settings.

Missing response logging: The function doesn't log or check the HTTP response status code. Consider logging successful dispatches (at debug or info level) and non-2xx responses (at warning level) to improve observability and debugging.

Based on past review comments regarding timeout configurability and response status logging.
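
One way to address both recommendations is sketched below. It is not the PR's implementation: the `timeout` and `sender` parameters and the `httpx_sender` helper are assumptions added for illustration, with `sender` injectable so the logic can be exercised without a network call.

```python
import logging
from typing import Awaitable, Callable, Optional

logger = logging.getLogger(__name__)

# (url, payload, headers, timeout) -> HTTP status code
Sender = Callable[[str, dict, dict, float], Awaitable[int]]


async def httpx_sender(url: str, payload: dict, headers: dict, timeout: float) -> int:
    """Default sender: POST the payload as JSON using httpx."""
    import httpx  # imported lazily so the sketch loads without httpx installed

    async with httpx.AsyncClient(timeout=timeout) as client:
        response = await client.post(url, json=payload, headers=headers)
        return response.status_code


async def dispatch_webhook(
    *,
    url: str,
    payload: dict,
    headers: Optional[dict] = None,
    timeout: float = 5.0,
    sender: Sender = httpx_sender,
) -> Optional[int]:
    """Best-effort dispatch: log the outcome and never raise."""
    try:
        status = await sender(url, payload, headers or {}, timeout)
    except Exception as exc:
        # Swallow the error; headers are deliberately not logged.
        logger.warning("Webhook dispatch to %s failed: %s", url, exc)
        return None
    if 200 <= status < 300:
        logger.info("Webhook delivered to %s (status %d)", url, status)
    else:
        logger.warning("Webhook to %s returned status %d", url, status)
    return status
```

Returning the status code (or None on failure) is a design choice of this sketch; a fire-and-forget background task can simply ignore it.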

state-manager/app/controller/executed_state.py (1)

15-95: Missing STATE_EXECUTED webhook dispatch mentioned in PR objectives.

The PR summary states: "Dispatches STATE_EXECUTED webhook on successful execution," but this file contains no webhook dispatch logic. The errored_state.py controller implements GRAPH_FAILED webhook dispatch (checking webhook config, using BackgroundTasks, calling dispatch_webhook), but executed_state lacks equivalent functionality.

To implement STATE_EXECUTED webhooks consistently with the existing GRAPH_FAILED implementation, you would need to:

Fetch the graph template and check for webhook configuration

Conditionally dispatch the webhook when "STATE_EXECUTED" is in the configured events

Include relevant payload data (namespace, graph_name, run_id, state_id(s), outputs, timestamp)

Should this functionality be added, or was it intentionally deferred?

Based on past review comments and PR objectives.
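
A payload for the missing STATE_EXECUTED event might be assembled as follows. The key names mirror the fields listed above but are assumptions, and `build_state_executed_payload` is a hypothetical helper rather than code from the PR.

```python
from datetime import datetime, timezone


def build_state_executed_payload(
    *, namespace: str, graph_name: str, run_id: str, state_ids: list, outputs: list
) -> dict:
    """Assemble a JSON-serializable STATE_EXECUTED payload."""
    return {
        "event": "STATE_EXECUTED",
        "namespace": namespace,
        "graph_name": graph_name,
        "run_id": run_id,
        # ObjectId-like values are converted to strings so they serialize.
        "state_ids": [str(state_id) for state_id in state_ids],
        "outputs": outputs,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```

The controller would then schedule the dispatch through the FastAPI-provided BackgroundTasks only when "STATE_EXECUTED" is in the configured events, mirroring the GRAPH_FAILED path.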

state-manager/app/controller/errored_state.py (2)

17-19: BackgroundTasks instantiation will not execute tasks.

Creating a new BackgroundTasks() instance when None is passed will not work as expected. BackgroundTasks must be managed by FastAPI's dependency injection system to ensure tasks are actually executed after the response is sent. If background_tasks is None, the webhook dispatch on lines 81-95 will be scheduled but never run.

Consider one of these solutions:

Option 1: Make background_tasks required
-async def errored_state(namespace_name: str, state_id: PydanticObjectId, body: ErroredRequestModel, x_exosphere_request_id: str, background_tasks: BackgroundTasks | None = None,) -> ErroredResponseModel:
-    if background_tasks is None:
-        background_tasks = BackgroundTasks()
+async def errored_state(
+    namespace_name: str,
+    state_id: PydanticObjectId,
+    body: ErroredRequestModel,
+    x_exosphere_request_id: str,
+    background_tasks: BackgroundTasks,
+) -> ErroredResponseModel:
Option 2: Skip webhook dispatch if background_tasks is None
     if (
         not retry_created
         and graph_template.webhook
         and "GRAPH_FAILED" in graph_template.webhook.events
+        and background_tasks is not None
     ):
Based on past review comments highlighting that FastAPI-managed BackgroundTasks is required for task execution.

92-92: Replace deprecated datetime.utcnow() with timezone-aware alternative.

Line 92 uses datetime.utcnow(), which is deprecated as of Python 3.12 in favor of datetime.now(timezone.utc). The timezone-aware form is more explicit and is already used elsewhere in the codebase (e.g., trigger_cron.py, init_tasks.py, upsert_graph_template.py).

Apply this diff:
+from datetime import datetime, timezone
-                    "timestamp": datetime.utcnow().isoformat(),
+                    "timestamp": datetime.now(timezone.utc).isoformat(),
Based on past review comments regarding deprecated datetime usage.
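
The practical difference shows up in the serialized timestamp. The snippet below is a small standard-library illustration; it derives the naive value from the aware one instead of calling the deprecated function.

```python
from datetime import datetime, timezone

# Timezone-aware: the ISO string carries an explicit UTC offset.
aware = datetime.now(timezone.utc)
aware_stamp = aware.isoformat()

# Naive equivalent (what utcnow() would produce): no offset in the string,
# so a webhook consumer cannot tell which timezone was meant.
naive_stamp = aware.replace(tzinfo=None).isoformat()

# The aware string round-trips with its timezone intact.
parsed = datetime.fromisoformat(aware_stamp)
```

For a webhook payload consumed by external systems, the explicit offset removes any ambiguity about the timestamp's timezone.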

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6f9c901 and a82709e.

📒 Files selected for processing (5)

state-manager/app/controller/errored_state.py (2 hunks)
state-manager/app/controller/executed_state.py (1 hunks)
state-manager/app/models/db/graph_template_model.py (3 hunks)
state-manager/app/models/webhook_config_model.py (1 hunks)
state-manager/app/tasks/webhook.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (3)

state-manager/app/models/db/graph_template_model.py (1)

state-manager/app/models/webhook_config_model.py (1)

WebhookConfig (5-11)

state-manager/app/controller/executed_state.py (5)

state-manager/app/models/executed_models.py (2)

ExecutedRequestModel (5-6)

ExecutedResponseModel (9-10)

state-manager/app/models/db/state.py (2)

State (13-106)

insert_many (57-63)

state-manager/app/models/state_status_enum.py (1)

StateStatusEnum (4-20)

state-manager/app/singletons/logs_manager.py (2)

LogsManager (9-66)

get_logger (65-66)

state-manager/app/tasks/create_next_states.py (1)

create_next_states (89-250)

state-manager/app/controller/errored_state.py (2)

state-manager/app/tasks/webhook.py (1)

dispatch_webhook (10-32)

state-manager/app/models/errored_models.py (2)

ErroredRequestModel (5-6)

ErroredResponseModel (9-11)

state-manager/app/controller/executed_state.py

Copilot AI review requested due to automatic review settings · December 16, 2025 07:53

Copilot started reviewing on behalf of Brijesh-Thakkar · December 16, 2025 07:53

coderabbitai bot added the enhancement label ("New feature or request") · Dec 16, 2025

gemini-code-assist bot reviewed · Dec 16, 2025

state-manager/app/controller/upsert_graph_template.py (outdated, resolved)

state-manager/app/controller/errored_state.py (resolved)

coderabbitai bot reviewed · Dec 16, 2025

Copilot AI reviewed · Dec 16, 2025

Commit a82709e: Add graph execution webhooks for failure and execution events

Brijesh-Thakkar force-pushed the fix/graph-failed-webhook branch from 6f9c901 to a82709e · December 16, 2025 16:00

coderabbitai bot reviewed · Dec 16, 2025

state-manager/app/controller/executed_state.py (outdated, resolved)

Commit 125ef16: Removed unused imports
