|
1 | | -# Bounty Bot |
| 1 | +# Bounty-bot |
2 | 2 |
|
3 | | -GitHub bounty validation bot - controlled by Atlas via API. |
| 3 | +Automated GitHub bounty issue validation service. Bounty-bot receives GitHub webhook events (or polls for missed issues), runs every submission through a multi-stage detection pipeline — media checks, spam analysis, duplicate detection, edit-history fraud, and LLM-assisted scoring — then publishes a verdict back to GitHub with labels, comments, and issue state changes. |
| 4 | + |
| 5 | +Controlled by [**Atlas**](https://github.com/PlatformNetwork/atlas) via a REST API with HMAC authentication. Callbacks are sent to Atlas on completion or failure.
4 | 6 |
|
5 | 7 | ## Architecture |
6 | 8 |
|
7 | | -- Controlled by [PlatformNetwork/atlas](https://github.com/PlatformNetwork/atlas) |
8 | | -- REST API on port 3235 |
9 | | -- SQLite persistence |
10 | | -- GitHub integration for issue validation |
| 9 | +```mermaid |
| 10 | +flowchart TD |
| 11 | + GH[GitHub Webhook] -->|issues.opened| WH[Webhook Receiver] |
| 12 | + POLL[Poller] -->|missed events| INT[Intake] |
| 13 | + ATLAS[Atlas API call] -->|trigger| INT |
| 14 | + WH --> INT |
| 15 | + INT -->|filter ≥ #41000| Q[Queue] |
| 16 | + Q --> PIPE[Validation Pipeline] |
| 17 | +
|
| 18 | + subgraph Pipeline |
| 19 | + PIPE --> MEDIA[Media Check] |
| 20 | + MEDIA --> SPAM[Spam Detection] |
| 21 | + SPAM --> DUP[Duplicate Detection] |
| 22 | + DUP --> EDIT[Edit History] |
| 23 | + EDIT --> LLM[LLM Validity Gate] |
| 24 | + end |
| 25 | +
|
| 26 | + LLM --> VERD[Verdict Engine] |
| 27 | + VERD -->|labels + comment| GH_MUT[GitHub Mutations] |
| 28 | + VERD -->|webhook callback| ATLAS_CB[Atlas Callback] |
| 29 | + VERD -->|persist| DB[(SQLite)] |
| 30 | +
|
| 31 | + Q -->|max retries| DL[Dead Letter] |
| 32 | +``` |
| 33 | + |
| 34 | +## Quick Start |
| 35 | + |
| 36 | +### Prerequisites |
| 37 | + |
| 38 | +| Dependency | Version | |
| 39 | +|---|---| |
| 40 | +| Node.js | ≥ 20 | |
| 41 | +| Redis | any (default port 3231) | |
| 42 | +| Docker + Compose | optional, for containerised deployment | |
| 43 | + |
| 44 | +### Environment Variables |
| 45 | + |
| 46 | +Create a `.env` file (see [Configuration](#configuration) for the full list): |
| 47 | + |
| 48 | +```env |
| 49 | +GITHUB_TOKEN=ghp_... |
| 50 | +GITHUB_WEBHOOK_SECRET=your-webhook-secret |
| 51 | +INTER_SERVICE_HMAC_SECRET=shared-secret-with-atlas |
| 52 | +REDIS_URL=redis://localhost:3231 |
| 53 | +OPENROUTER_API_KEY=sk-or-... |
| 54 | +ATLAS_WEBHOOK_URL=http://localhost:3230/webhooks |
| 55 | +TARGET_REPO=PlatformNetwork/bounty-challenge |
| 56 | +``` |
| 57 | + |
| 58 | +### Docker Compose |
| 59 | + |
| 60 | +```bash |
| 61 | +docker compose up -d |
| 62 | +``` |
| 63 | + |
| 64 | +The service starts on **port 3235** with SQLite data persisted in a Docker volume. |
| 65 | + |
| 66 | +### Development Mode |
| 67 | + |
| 68 | +```bash |
| 69 | +npm install |
| 70 | +npm run dev # tsx hot-reload on src/index.ts |
| 71 | +npm run build # compile TypeScript |
| 72 | +npm start # run compiled output |
| 73 | +``` |
| 74 | + |
| 75 | +## API Reference |
| 76 | + |
| 77 | +All `/api/v1/*` endpoints (except webhooks) require HMAC authentication via `X-Signature` and `X-Timestamp` headers. |
| 78 | + |
| 79 | +| Method | Endpoint | Auth | Description | |
| 80 | +|---|---|---|---| |
| 81 | +| `POST` | `/api/v1/validation/trigger` | HMAC | Trigger validation for an issue | |
| 82 | +| `GET` | `/api/v1/validation/:issue_number/status` | HMAC | Get processing status and verdict | |
| 83 | +| `POST` | `/api/v1/validation/:issue_number/requeue` | HMAC | Requeue for re-validation (24 h max, once per issue) | |
| 84 | +| `POST` | `/api/v1/validation/:issue_number/force-release` | HMAC | Clear a stale processing lock | |
| 85 | +| `GET` | `/api/v1/dead-letter` | HMAC | List dead-lettered items | |
| 86 | +| `POST` | `/api/v1/dead-letter/:id/recover` | HMAC | Re-enqueue a dead-letter item | |
| 87 | +| `POST` | `/api/v1/webhooks/github` | GitHub signature | Receive GitHub webhook events | |
| 88 | +| `GET` | `/health` | none | Liveness probe | |
| 89 | +| `GET` | `/ready` | none | Readiness probe | |
| 90 | + |
| 91 | +See [docs/API.md](docs/API.md) for full request/response schemas. |
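The HMAC headers above can be produced with a few lines of Node. This is a sketch, assuming the signature is HMAC-SHA256 over `"<timestamp>.<body>"` with the shared `INTER_SERVICE_HMAC_SECRET`; the real canonical string is defined in [docs/API.md](docs/API.md), and the issue number is only an example:

```typescript
import { createHmac } from "node:crypto";

// Hypothetical canonicalisation: HMAC-SHA256 over "<timestamp>.<body>".
// Check docs/API.md for the actual scheme before relying on this.
function signRequest(
  secret: string,
  body: string,
  timestamp: string = Date.now().toString(),
): Record<string, string> {
  const signature = createHmac("sha256", secret)
    .update(`${timestamp}.${body}`)
    .digest("hex");
  return { "X-Signature": signature, "X-Timestamp": timestamp };
}

// Example: trigger validation for a (hypothetical) issue number.
const body = JSON.stringify({ issue_number: 41005 });
const headers = {
  "Content-Type": "application/json",
  ...signRequest(process.env.INTER_SERVICE_HMAC_SECRET ?? "", body),
};
// await fetch("http://localhost:3235/api/v1/validation/trigger",
//   { method: "POST", headers, body });
```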
| 92 | + |
| 93 | +## Configuration |
| 94 | + |
| 95 | +| Variable | Default | Description | |
| 96 | +|---|---|---| |
| 97 | +| `PORT` | `3235` | API listen port | |
| 98 | +| `DATA_DIR` | `./data` | SQLite data directory | |
| 99 | +| `SQLITE_PATH` | `<DATA_DIR>/bounty-bot.db` | Database file path | |
| 100 | +| `REDIS_URL` | `redis://localhost:3231` | Redis connection URL | |
| 101 | +| `GITHUB_TOKEN` | — | GitHub PAT for API requests | |
| 102 | +| `GITHUB_WEBHOOK_SECRET` | — | Secret for verifying GitHub webhook signatures | |
| 103 | +| `INTER_SERVICE_HMAC_SECRET` | — | Shared HMAC secret for Atlas ↔ bounty-bot auth | |
| 104 | +| `ATLAS_WEBHOOK_URL` | `http://localhost:3230/webhooks` | Atlas callback endpoint | |
| 105 | +| `TARGET_REPO` | `PlatformNetwork/bounty-challenge` | GitHub repo to validate (owner/repo) | |
| 106 | +| `OPENROUTER_API_KEY` | — | OpenRouter key for LLM scoring and embeddings | |
| 107 | +| `OPENROUTER_BASE_URL` | `https://openrouter.ai/api/v1` | OpenRouter API base URL | |
| 108 | +| `EMBEDDING_MODEL` | `qwen/qwen3-embedding-8b` | Model for semantic embeddings | |
| 109 | +| `LLM_SCORING_MODEL` | `google/gemini-3.1-pro-preview-customtools` | Model for issue evaluation | |
| 110 | +| `POLLER_INTERVAL` | `60000` | Missed-webhook poller interval (ms) | |
| 111 | +| `MAX_RETRIES` | `3` | Queue retry limit before dead-lettering | |
| 112 | +| `ISSUE_FLOOR` | `41000` | Minimum issue number to process | |
| 113 | +| `SPAM_THRESHOLD` | `0.7` | Spam score threshold (0–1) | |
| 114 | +| `DUPLICATE_THRESHOLD` | `0.75` | Duplicate similarity threshold (0–1) | |
| 115 | +| `REQUEUE_MAX_AGE_MS` | `86400000` | Max issue age for requeue eligibility (24 h) | |
| 116 | +| `WEBHOOK_MAX_RETRIES` | `3` | Atlas callback retry attempts | |
| 117 | +| `WEBHOOK_RETRY_DELAY_MS` | `1000` | Base delay between callback retries (ms) | |
| 118 | + |
| 119 | +## LLM Integration |
| 120 | + |
| 121 | +Bounty-bot uses two AI models via [OpenRouter](https://openrouter.ai) (OpenAI-compatible API): |
| 122 | + |
| 123 | +### Gemini 3.1 Pro Custom Tools — Issue Evaluation |
| 124 | + |
| 125 | +Model: `google/gemini-3.1-pro-preview-customtools` |
| 126 | + |
| 127 | +Used for full issue evaluation and borderline spam scoring. The model receives a system prompt describing the evaluation criteria and is forced to call a `deliver_verdict` function via OpenAI-style tool/function calling. Returns a structured verdict (`valid` / `invalid` / `duplicate`), confidence score, reasoning, and a public-facing recap. |
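A forced tool call over the OpenAI-compatible API might look like the sketch below. The schema mirrors the verdict fields described above, but the exact property names and prompt text are assumptions; the production versions live in the source:

```typescript
// Illustrative tool schema; field names are assumptions, not the
// production schema.
const deliverVerdictTool = {
  type: "function" as const,
  function: {
    name: "deliver_verdict",
    description: "Return a structured validation verdict for a bounty issue.",
    parameters: {
      type: "object",
      properties: {
        verdict: { type: "string", enum: ["valid", "invalid", "duplicate"] },
        confidence: { type: "number", minimum: 0, maximum: 1 },
        reasoning: { type: "string" },
        recap: { type: "string", description: "Public-facing summary" },
      },
      required: ["verdict", "confidence", "reasoning", "recap"],
    },
  },
};

// tool_choice forces the model to answer via the function, so the reply
// is always machine-parseable rather than free text.
const requestBody = {
  model: "google/gemini-3.1-pro-preview-customtools",
  messages: [
    { role: "system", content: "You evaluate bounty issues against the criteria." },
    { role: "user", content: "Issue title and body go here." },
  ],
  tools: [deliverVerdictTool],
  tool_choice: { type: "function", function: { name: "deliver_verdict" } },
};
```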
| 128 | + |
| 129 | +### Qwen3 Embedding 8B — Semantic Duplicate Detection |
| 130 | + |
| 131 | +Model: `qwen/qwen3-embedding-8b` |
| 132 | + |
| 133 | +Generates high-dimensional embedding vectors for issue text. These vectors are stored in SQLite and compared via cosine similarity to detect semantic duplicates that lexical fingerprinting might miss. The final duplicate score is a hybrid: `0.4 × Jaccard + 0.6 × cosine`. |
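The hybrid score can be sketched as follows. The whitespace tokenizer used for Jaccard here is an illustrative stand-in for the real lexical fingerprinting; only the `0.4 / 0.6` weighting comes from the text above:

```typescript
// Jaccard similarity over token sets (illustrative whitespace tokenizer).
function jaccard(a: string, b: string): number {
  const A = new Set(a.toLowerCase().split(/\s+/).filter(Boolean));
  const B = new Set(b.toLowerCase().split(/\s+/).filter(Boolean));
  if (A.size === 0 && B.size === 0) return 0;
  let inter = 0;
  for (const t of A) if (B.has(t)) inter++;
  return inter / (A.size + B.size - inter);
}

// Cosine similarity between two embedding vectors.
function cosine(u: number[], v: number[]): number {
  let dot = 0, nu = 0, nv = 0;
  for (let i = 0; i < u.length; i++) {
    dot += u[i] * v[i];
    nu += u[i] * u[i];
    nv += v[i] * v[i];
  }
  return nu && nv ? dot / Math.sqrt(nu * nv) : 0;
}

// Hybrid duplicate score: 0.4 * Jaccard (lexical) + 0.6 * cosine (semantic).
// Compared against DUPLICATE_THRESHOLD (default 0.75).
function duplicateScore(textA: string, textB: string, embA: number[], embB: number[]): number {
  return 0.4 * jaccard(textA, textB) + 0.6 * cosine(embA, embB);
}
```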
| 134 | + |
| 135 | +Both integrations degrade gracefully: if `OPENROUTER_API_KEY` is unset, duplicate detection falls back to lexical-only fingerprinting and LLM scoring is skipped.
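The fallback gate might be as simple as the sketch below; the exact behaviour when the key is unset is an assumption based on the description above:

```typescript
// Assumed graceful-degradation gate: without an OpenRouter key (or an
// embedding), only the lexical score is used; otherwise the hybrid mix.
function effectiveDuplicateScore(
  lexical: number,
  semantic: number | null,
  apiKey: string | undefined = process.env.OPENROUTER_API_KEY,
): number {
  if (!apiKey || semantic === null) return lexical; // lexical-only fallback
  return 0.4 * lexical + 0.6 * semantic;
}
```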
| 136 | + |
| 137 | +## Testing |
| 138 | + |
| 139 | +```bash |
| 140 | +npm test # run all 149 tests (vitest) |
| 141 | +npm run typecheck # TypeScript type checking |
| 142 | +npm run lint # ESLint |
| 143 | +``` |
11 | 144 |
|
12 | | -## Endpoints |
| 145 | +## Further Documentation |
13 | 146 |
|
14 | | -- `GET /health` - Health check |
15 | | -- `POST /api/validate` - Trigger validation (Atlas → Bounty-bot) |
16 | | -- `GET /api/status/:issue` - Check validation status |
17 | | -- `POST /api/requeue` - Requeue issue for validation |
| 147 | +- [Architecture](docs/ARCHITECTURE.md) — system design, module graph, sequence diagrams, database schema |
| 148 | +- [API Reference](docs/API.md) — full REST API with request/response schemas |
| 149 | +- [Detection Engine](docs/DETECTION.md) — spam, duplicate, edit-history, and LLM scoring details |
| 150 | +- [Deployment](docs/DEPLOYMENT.md) — Docker, Redis, health checks, Atlas integration |