Skip to content

feat(sentry apps): Add circuit breaker into webhook code#111723

Merged
Christinarlong merged 36 commits intomasterfrom
crl/add-breaker-to-webhook-sending
Mar 31, 2026
Merged

feat(sentry apps): Add circuit breaker into webhook code#111723
Christinarlong merged 36 commits intomasterfrom
crl/add-breaker-to-webhook-sending

Conversation

@Christinarlong
Copy link
Copy Markdown
Contributor

@Christinarlong Christinarlong commented Mar 27, 2026

NOTES

  • This PR is in a stacked chain of
    crl/add-breaker-to-webhook-sending -> crl/add-ff-to-breaker -> crl/impl-circuit-breaker -> crl/webhook-alarm -> master
  • Turn on hide whitespace 🙏
  • New PR since the old one got closed for staleness (link to old PR)

Summary

Adds a per-app-slug circuit breaker to send_and_save_webhook_request to track
temporarily block outbound webhook delivery to consistently failing Sentry Apps.

  • Config: Trips when ≥50% of requests in a 10-minute window are errors and error floor is ≥500. Once tripped, all requests to that app are blocked for 5 minutes.
  • Feature flag (organizations:sentry-app-webhook-circuit-breaker): gates the breaker
    per-org.
  • Dry-run mode (in option form): when enabled, the breaker tracks state and emits sentry_app.webhook.circuit_breaker.would_block metrics/logs but does not block requests. Part of rollout to test accuracy.
  • circuit_breaker_tracking context manager: Context manager to record a fail upon exception (this includes ALL exceptions not just the webhook alarm timeout).

With Exception, ignore_unpublished_app_errors swallows timeouts for
unpublished (dev/test) apps, which is the correct behavior — the
lifecycle halt metric still fires before the re-raise. Tests updated
to use published=True so the exception propagates for assertion.
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Mar 27, 2026
Comment thread src/sentry/utils/sentry_apps/webhooks.py
Comment thread src/sentry/utils/sentry_apps/webhooks.py Outdated
Comment thread src/sentry/utils/sentry_apps/circuit_breaker.py Outdated
@github-actions

This comment was marked as outdated.

Copy link
Copy Markdown
Member

@shashjar shashjar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left a couple minor comments

Comment thread src/sentry/utils/sentry_apps/webhooks.py
Comment thread tests/sentry/utils/sentry_apps/test_webhooks.py
Comment thread src/sentry/utils/sentry_apps/request_buffer.py Outdated
Comment thread src/sentry/utils/sentry_apps/circuit_breaker.py
Comment thread tests/sentry/utils/sentry_apps/test_webhooks.py Outdated
Comment thread src/sentry/utils/sentry_apps/circuit_breaker.py
@github-actions

This comment was marked as outdated.

return Response()

with circuit_breaker_tracking(circuit_breaker):
response = _send_webhook_request(url, app_platform_event, organization_context)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Circuit breaker errors can crash webhook delivery for published apps

Medium Severity

_create_circuit_breaker and _circuit_breaker_allows_request (which calls should_allow_request()) are invoked inside a try block that only catches specific exception types (WebhookTimeoutError, Timeout, ConnectionError, etc.). An unexpected exception from these — such as a Redis failure inside should_allow_request()_should_trip()check_within_quotas(), or a KeyError from a malformed options config — would propagate uncaught, causing the webhook to fail for published apps. This is inconsistent with circuit_breaker_tracking, which carefully wraps record_error() and record_success() in try/except Exception to prevent circuit breaker infrastructure issues from breaking webhook delivery.

Additional Locations (1)
Fix in Cursor Fix in Web

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, we have retries for uncaught errors at this step so idt it's that big of a deal

Comment thread src/sentry/options/defaults.py
Comment thread src/sentry/utils/sentry_apps/circuit_breaker.py
Comment thread src/sentry/utils/sentry_apps/circuit_breaker.py Outdated
Comment thread src/sentry/utils/sentry_apps/webhooks.py
Copy link
Copy Markdown
Contributor

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Comment thread src/sentry/utils/sentry_apps/circuit_breaker.py
@Christinarlong Christinarlong merged commit bb1eb15 into master Mar 31, 2026
129 of 133 checks passed
@Christinarlong Christinarlong deleted the crl/add-breaker-to-webhook-sending branch March 31, 2026 18:25
@github-actions github-actions bot locked and limited conversation to collaborators Apr 16, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants