feat: Batching + rate limiting for webhook notifications #181

@nexus-marbell

Description

Summary

Add batching and rate limiting to the webhook notification pipeline to handle high-volume repos gracefully. When many events arrive in quick succession (e.g., a CI run creating multiple check events, or a batch of issues being filed), collapse them into summary messages instead of flooding agents.

RFC: #177 (GitHub Webhook to Swarm Notification Bridge)

Design

Cooldown / Batching

  • Track events per repo within a rolling time window
  • If 5+ events from the same repo arrive within 60 seconds, batch them into a single summary message
  • Summary format: [GitHub] 7 events on finml-sage/repo in the last 60s: 3 issue comments, 2 PR reviews, 2 issues opened
  • After the batch is sent, reset the counter for that repo
  • Use an in-memory buffer (dict of repo -> event list with timestamps)
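The counting and summary steps above could be sketched roughly as follows. `RepoBatcher`, its `add` method, and the pre-pluralized event labels are hypothetical names chosen for illustration, not part of the RFC. The sketch only covers the buffer and the summary string; what happens to events that arrive before the threshold is reached is left to the caller.

```python
from collections import Counter, defaultdict, deque


class RepoBatcher:
    """Hypothetical in-memory buffer: repo -> deque of (timestamp, label)."""

    def __init__(self, threshold=5, window=60.0):
        self.threshold = threshold
        self.window = window
        self._events = defaultdict(deque)

    def add(self, repo, label, now):
        """Record one event. Return a summary string once the threshold
        is reached inside the rolling window, otherwise None."""
        events = self._events[repo]
        events.append((now, label))
        # Drop events that have fallen out of the rolling window.
        while events and now - events[0][0] > self.window:
            events.popleft()
        if len(events) < self.threshold:
            return None
        counts = Counter(lbl for _, lbl in events)
        total = len(events)
        # Reset the counter for this repo after the batch is produced.
        del self._events[repo]
        # Labels are assumed to be pre-pluralized (e.g. "issue comments").
        parts = ", ".join(f"{n} {lbl}" for lbl, n in counts.most_common())
        return (
            f"[GitHub] {total} events on {repo} "
            f"in the last {int(self.window)}s: {parts}"
        )
```

The clock is passed in as `now` rather than read inside the class, which keeps the batching trigger easy to unit test without sleeping.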

Rate Limiting

  • Global rate limit: maximum N webhook-triggered swarm messages per minute (configurable, default 30)
  • Per-repo rate limit: maximum M messages per minute per repo (configurable, default 10)
  • When a limit is hit, queue the excess events and send them as a single summary once the rate-limit window resets
  • This guards against webhook abuse, whether the flood of events is accidental or intentional
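One way to enforce both limits is a sliding-window counter, sketched below. This is an assumption about the approach, not something the RFC prescribes; `SlidingWindowLimiter` and `allow` are hypothetical names, and queueing of rejected events is left out to keep the example short.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical limiter tracking send timestamps globally and per repo."""

    def __init__(self, global_limit=30, per_repo_limit=10, window=60.0):
        self.global_limit = global_limit
        self.per_repo_limit = per_repo_limit
        self.window = window
        self._global = deque()
        self._per_repo = defaultdict(deque)

    @staticmethod
    def _prune(stamps, now, window):
        # Forget sends that are at least one full window old.
        while stamps and now - stamps[0] >= window:
            stamps.popleft()

    def allow(self, repo, now):
        """Return True and record the send if both limits permit it."""
        repo_stamps = self._per_repo[repo]
        self._prune(self._global, now, self.window)
        self._prune(repo_stamps, now, self.window)
        if len(self._global) >= self.global_limit:
            return False
        if len(repo_stamps) >= self.per_repo_limit:
            return False
        self._global.append(now)
        repo_stamps.append(now)
        return True
```

A caller would check `allow(repo, now)` before sending and queue the event for a later summary when it returns `False`.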

Configuration

  • Sensible defaults that work without configuration
  • Optional env var overrides for tuning:
    • GITHUB_WEBHOOK_BATCH_THRESHOLD (default: 5 events)
    • GITHUB_WEBHOOK_BATCH_WINDOW (default: 60 seconds)
    • GITHUB_WEBHOOK_RATE_LIMIT (default: 30 messages/minute global)
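Reading these overrides might look like the sketch below. The environment variable names and defaults come from the list above; the `WebhookBatchConfig` dataclass and its `from_env` helper are hypothetical. No variable is listed for the per-repo limit, so the sketch does not read one.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class WebhookBatchConfig:
    """Hypothetical settings holder with the defaults from the issue."""

    batch_threshold: int = 5      # events
    batch_window: float = 60.0    # seconds
    rate_limit: int = 30          # messages per minute, global

    @classmethod
    def from_env(cls, env=None):
        # Accept a mapping so tests can pass a plain dict instead of os.environ.
        env = os.environ if env is None else env
        return cls(
            batch_threshold=int(env.get("GITHUB_WEBHOOK_BATCH_THRESHOLD", 5)),
            batch_window=float(env.get("GITHUB_WEBHOOK_BATCH_WINDOW", 60)),
            rate_limit=int(env.get("GITHUB_WEBHOOK_RATE_LIMIT", 30)),
        )
```

With no variables set, `WebhookBatchConfig.from_env({})` yields the defaults, which matches the goal of working without configuration.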

Implementation Notes

  • Use asyncio primitives for the batching timer (not threads)
  • The batch buffer should be cleaned up periodically to prevent memory leaks from repos that go quiet
  • Batch summary should still include the [GitHub] prefix for consistency
  • Individual high-priority events (e.g., new external contributor's first issue) could bypass batching in a future enhancement

Acceptance Criteria

  • Events from the same repo within the batch window are collapsed into a summary
  • Batch threshold is configurable (default 5 events)
  • Batch window is configurable (default 60 seconds)
  • Global rate limit prevents more than N messages per minute
  • Per-repo rate limit prevents flood from a single repo
  • Batch summaries include event type counts (e.g., "3 issue comments, 2 PRs")
  • Batch summaries include the [GitHub] prefix
  • In-memory buffer is cleaned up for inactive repos
  • Uses asyncio primitives (no threads)
  • Unit tests cover: batching trigger, batch summary format, rate limit enforcement, buffer cleanup

Dependencies

Depends on #180 (event formatting): batching wraps the already formatted messages.
