graph TB
subgraph Entrypoints
IDX[index.ts<br/>Express app]
WH[api/webhooks.ts<br/>GitHub webhook]
end
subgraph API Layer
MW[api/middleware.ts<br/>HMAC auth]
RQ[api/requeue.ts<br/>Requeue + force-release]
ST[api/status.ts<br/>Status + dead-letter]
end
subgraph Validation
INT[validation/intake.ts<br/>Normalize + filter]
PIPE[validation/pipeline.ts<br/>Orchestrator]
MED[validation/media.ts<br/>URL + HEAD checks]
VER[validation/verdict.ts<br/>Publish verdict]
POL[validation/poller.ts<br/>Missed webhooks]
end
subgraph Detection
SP[detection/spam.ts<br/>Template + burst + parity]
DUP[detection/duplicate.ts<br/>Jaccard + cosine hybrid]
EH[detection/edit-history.ts<br/>Fraud detection]
EMB[detection/embeddings.ts<br/>Qwen3 via OpenRouter]
LLM[detection/llm-scorer.ts<br/>Gemini 3.1 Pro]
end
subgraph Queue
PROC[queue/processor.ts<br/>Retry loop]
DL[queue/dead-letter.ts<br/>Dead-letter mgmt]
RR[queue/requeue-recovery.ts<br/>Resolve requeues]
end
subgraph Infrastructure
CFG[config.ts]
DB[db/schema.ts + index.ts<br/>SQLite]
RD[redis.ts<br/>Locking + idempotency]
GHC[github/client.ts<br/>Fetch-based]
GHM[github/mutations.ts<br/>Labels + comments]
HMAC[hmac.ts<br/>HMAC-SHA256]
WC[webhook-client.ts<br/>Atlas callbacks]
LOG[logger.ts]
HS[health-server.ts]
end
IDX --> MW --> RQ & ST
IDX --> WH
IDX --> HS
WH --> INT
POL --> INT
RQ --> PROC
INT --> DB & RD
PROC --> PIPE
PIPE --> MED & SP & DUP & EH & LLM
DUP --> EMB
SP --> GHC
EH --> GHC
PIPE --> GHC
PIPE --> VER
VER --> GHM & DB & WC
PROC --> DL
DL --> DB & WC
RR --> DB & WC
WC --> HMAC
sequenceDiagram
participant GH as GitHub
participant WH as Webhook Router
participant INT as Intake
participant RD as Redis
participant DB as SQLite
participant Q as Queue Processor
participant PIPE as Pipeline
participant MED as Media Check
participant SP as Spam Detector
participant DUP as Duplicate Detector
participant EH as Edit History
participant LLM as LLM Scorer
participant VER as Verdict Engine
participant ATL as Atlas
GH->>WH: POST /api/v1/webhooks/github (issues.opened)
WH->>WH: Verify X-Hub-Signature-256
WH->>INT: normalizeIssue + shouldProcess
INT->>INT: Filter (issue ≥ #41000, no terminal labels)
INT->>RD: acquireLock(bounty:{n})
INT->>RD: checkIdempotencyKey
INT->>DB: upsertBounty(pending)
INT->>RD: setIdempotencyKey
INT->>RD: releaseLock
WH-->>GH: 202 Accepted
Note over Q: Queue processor loop (5 s interval)
Q->>RD: acquireLock(queue:{n})
Q->>DB: lockBounty → in_progress
Q->>PIPE: runValidationPipeline(n)
PIPE->>GH: getIssue(n)
PIPE->>MED: validateMedia(body)
MED->>MED: extractURLs + HEAD requests
MED-->>PIPE: hasMedia, accessible
PIPE->>SP: analyzeSpam(issue)
SP->>GH: listRecentIssues (2 h window)
SP->>SP: template + burst + parity scores
SP-->>PIPE: overallScore
alt Borderline spam (0.3–0.7)
PIPE->>LLM: scoreSpamLikelihood
LLM-->>PIPE: blended score
end
PIPE->>DUP: findDuplicates(issue)
DUP->>DB: getAllEmbeddings
DUP->>DUP: Jaccard similarity
opt Embeddings available
DUP->>DUP: cosine similarity → hybrid 0.4J + 0.6C
end
DUP->>DB: upsertEmbedding
DUP-->>PIPE: isDuplicate, similarity
PIPE->>EH: analyzeEditHistory(n)
EH->>GH: getIssueEvents(n)
EH->>EH: rapid edits, title renames, body edits
EH-->>PIPE: fraudScore
PIPE->>LLM: scoreIssueValidity (final gate)
LLM->>LLM: Gemini deliver_verdict function call
LLM-->>PIPE: score, reasoning
PIPE-->>Q: VerdictResult
Q->>VER: publishVerdict(n, result)
VER->>GH: applyVerdict (labels + comment + close/reopen)
VER->>DB: insertValidationResult
VER->>DB: updateBountyStatus(completed)
VER->>ATL: webhook callback (validation.completed)
VER->>DB: insertAuditLog
Q->>DB: unlockBounty
Q->>RD: releaseLock
Bounty-bot uses SQLite (via better-sqlite3) with 8 tables:
Primary tracking table — one row per GitHub issue under validation.
| Column | Type | Notes |
|---|---|---|
issue_number |
INTEGER PK | GitHub issue number |
workspace_id |
TEXT UNIQUE | External workspace reference |
repo |
TEXT | owner/repo |
title |
TEXT | Issue title |
body |
TEXT | Issue body |
author |
TEXT | GitHub login |
created_at |
TEXT | ISO timestamp |
status |
TEXT | pending → in_progress → completed / dead_lettered |
verdict |
TEXT | valid / invalid / duplicate |
labels |
TEXT | Comma-separated label names |
locked_by |
TEXT | Lock owner identifier |
locked_at |
TEXT | Lock acquisition time |
lock_expires_at |
TEXT | Lock expiry time |
retry_count |
INTEGER | Current retry attempt |
max_retries |
INTEGER | Configured max (default 3) |
updated_at |
TEXT | Last modification |
Verdicts produced by the validation pipeline.
| Column | Type | Notes |
|---|---|---|
id |
INTEGER PK | Auto-increment |
issue_number |
INTEGER FK | → bounties |
workspace_id |
TEXT | |
verdict |
TEXT | valid / invalid / duplicate |
rationale |
TEXT | Human-readable explanation |
evidence |
TEXT | JSON blob |
spam_score |
REAL | 0–1 |
duplicate_of |
INTEGER | Original issue number |
media_check |
TEXT | JSON blob |
created_at |
TEXT |
Per-issue spam scoring breakdown.
| Column | Type | Notes |
|---|---|---|
id |
INTEGER PK | |
issue_number |
INTEGER | |
template_score |
REAL | Jaccard similarity to author's recent issues |
burst_score |
REAL | Submission frequency score |
parity_score |
REAL | Content quality score |
overall_score |
REAL | Weighted combination |
details |
TEXT | Human-readable breakdown |
created_at |
TEXT |
Vector fingerprints for duplicate detection.
| Column | Type | Notes |
|---|---|---|
id |
INTEGER PK | |
issue_number |
INTEGER UNIQUE | |
title_fingerprint |
TEXT | SHA-256 of title n-grams |
body_fingerprint |
TEXT | Combined title+body text |
embedding_vector |
BLOB | JSON-encoded float array (Qwen3) |
created_at |
TEXT |
Manual or automated re-validation requests.
| Column | Type | Notes |
|---|---|---|
id |
INTEGER PK | |
issue_number |
INTEGER FK | → bounties |
requester_id |
TEXT | Who requested the requeue |
requester_context |
TEXT | JSON context |
status |
TEXT | pending → completed |
requeued_at |
TEXT | |
completed_at |
TEXT | |
callback_sent |
INTEGER | 0/1 |
Bounties that exhausted all retry attempts.
| Column | Type | Notes |
|---|---|---|
id |
INTEGER PK | |
bounty_id |
INTEGER | Issue number |
failure_cause |
TEXT | Error message |
last_attempt |
TEXT | |
metadata |
TEXT | JSON blob |
retry_count |
INTEGER | |
created_at |
TEXT |
Digest delivery de-duplication (workspace + window).
Immutable action trail — every verdict, requeue, dead-letter, and force-release is logged.
| Column | Type | Notes |
|---|---|---|
id |
INTEGER PK | |
workspace_id |
TEXT | |
action |
TEXT | e.g. verdict.valid, bounty.requeued |
actor |
TEXT | bounty-bot, requester ID, or system |
details |
TEXT | JSON blob |
github_ref |
TEXT | e.g. #41234 |
discord_ref |
TEXT | |
created_at |
TEXT |
Bounty-bot uses an in-memory queue backed by Redis distributed locking and SQLite persistence:
- Intake normalises and filters the issue, acquires a Redis lock, checks idempotency, and upserts into
bountieswith statuspending. - Queue Processor runs on a 5-second interval. It shifts an entry from the in-memory queue, acquires a Redis lock (
lock:queue:{n}), locks the bounty row in SQLite, and runs the full validation pipeline. - Retries — on pipeline failure, the entry is re-enqueued with an incremented
retryCount. AfterMAX_RETRIES(default 3), the issue moves to the dead-letter queue. - Dead Letter — inserts into the
dead_lettertable, updates bounty status todead_lettered, sends avalidation.failedwebhook to Atlas, and logs to the audit trail. Dead-lettered items can be manually recovered via the REST API. - Requeue Recovery — a 30-second interval scheduler checks pending requeue records. When the underlying bounty reaches a terminal status (
completedordead_lettered), it marks the requeue record as completed and sends avalidation.completedcallback to Atlas. - Poller — runs on a configurable interval (default 60 s) to catch issues that were opened while the webhook endpoint was unreachable. Fetches recent issues from the GitHub API and passes them through the intake pipeline.