Commit 380a609

feat: full bounty validation system with CI, Docker, Watchtower
Complete implementation of GitHub issue validation service:
- SQLite persistence (8 tables), Redis distributed locking
- Validation pipeline: intake, media, spam, duplicate, edit-history
- LLM scoring (Gemini 3.1 Pro Custom Tools) + embeddings (Qwen3)
- GitHub mutations (labels, comments, close/reopen)
- Queue processor with dead-letter and requeue recovery
- HMAC auth for Atlas inter-service communication
- CI workflow (lint, test, Docker build+push to GHCR)
- Watchtower auto-update every 60s
- Comprehensive docs (API, Architecture, Detection, Deployment)
- 149 tests passing

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
1 parent 5ac3e9e commit 380a609


55 files changed: +13411 -25 lines changed

.github/workflows/ci.yml

Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
+name: CI
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+
+permissions:
+  contents: read
+  packages: write
+
+jobs:
+  lint-typecheck:
+    name: Lint & Typecheck
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-node@v4
+        with:
+          node-version: 22
+          cache: npm
+      - run: npm ci
+      - name: Typecheck
+        run: npx tsc --noEmit
+      - name: Lint
+        run: npm run lint
+
+  test:
+    name: Tests
+    runs-on: ubuntu-latest
+    services:
+      redis:
+        image: redis:7-alpine
+        ports:
+          - 6379:6379
+        options: >-
+          --health-cmd "redis-cli ping"
+          --health-interval 10s
+          --health-timeout 5s
+          --health-retries 5
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-node@v4
+        with:
+          node-version: 22
+          cache: npm
+      - run: npm ci
+      - name: Run tests
+        run: npx vitest run
+        env:
+          REDIS_URL: redis://localhost:6379
+
+  docker:
+    name: Build & Push Docker Image
+    needs: [lint-typecheck, test]
+    runs-on: ubuntu-latest
+    if: github.ref == 'refs/heads/main' && github.event_name == 'push'
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+
+      - name: Login to GitHub Container Registry
+        uses: docker/login-action@v3
+        with:
+          registry: ghcr.io
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Docker metadata
+        id: meta
+        uses: docker/metadata-action@v5
+        with:
+          images: ghcr.io/${{ github.repository }}
+          tags: |
+            type=sha
+            type=raw,value=latest,enable={{is_default_branch}}
+
+      - name: Build and push
+        uses: docker/build-push-action@v6
+        with:
+          context: .
+          push: true
+          tags: ${{ steps.meta.outputs.tags }}
+          labels: ${{ steps.meta.outputs.labels }}
+          cache-from: type=gha
+          cache-to: type=gha,mode=max

Dockerfile

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
+# Bounty-bot Service - Dockerfile
+# GitHub bounty validation service, controlled by Atlas via REST API
+
+FROM node:22-slim
+
+# Install system dependencies for better-sqlite3 native module and health checks
+RUN apt-get update && apt-get install -y \
+    python3 \
+    make \
+    g++ \
+    sqlite3 \
+    curl \
+    && rm -rf /var/lib/apt/lists/*
+
+# Create app directory
+WORKDIR /app
+
+# Copy package files first for better Docker layer caching
+COPY package*.json ./
+
+# Install dependencies with native build flags
+RUN npm ci --unsafe-perm || npm install --unsafe-perm
+
+# Copy TypeScript config and source
+COPY tsconfig.json ./
+COPY src/ ./src/
+
+# Build TypeScript
+RUN npm run build
+
+# Create required directories
+RUN mkdir -p /app/data
+
+# Environment defaults
+ENV NODE_ENV=production
+ENV PORT=3235
+ENV DATA_DIR=/app/data
+
+# Health check
+HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
+    CMD curl -f http://localhost:3235/health || exit 1
+
+# Run the service
+CMD ["node", "dist/index.js"]

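The `HEALTHCHECK` above runs `curl -f` against `/health`, and `curl -f` exits non-zero on any HTTP status of 400 or above, so the service only needs to answer that path with a 2xx status to be reported healthy. The following is a minimal sketch of such a handler using Node's built-in `http` module. The names `healthResponse` and `createHealthServer` and the JSON body shape are hypothetical and are not taken from the commit's `src/` directory.

```typescript
import { createServer, type Server } from "node:http";

// Hypothetical helper: decides the status and body for a request.
// Kept pure so it can be checked without opening a socket.
export function healthResponse(
  method: string | undefined,
  url: string | undefined,
): { status: number; body: string } {
  if (method === "GET" && url === "/health") {
    // Assumed body shape; the real service may report more detail.
    return { status: 200, body: JSON.stringify({ status: "ok" }) };
  }
  return { status: 404, body: JSON.stringify({ error: "not found" }) };
}

// Hypothetical wrapper: an HTTP server that answers with healthResponse().
// It is not started here; the caller decides the port.
export function createHealthServer(): Server {
  return createServer((req, res) => {
    const { status, body } = healthResponse(req.method, req.url);
    res.writeHead(status, { "Content-Type": "application/json" });
    res.end(body);
  });
}
```

To try it, something like `createHealthServer().listen(Number(process.env.PORT ?? 3235))` would bind the port the Dockerfile exposes through `ENV PORT=3235`.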
README.md

Lines changed: 144 additions & 11 deletions
@@ -1,17 +1,150 @@
-# Bounty Bot
+# Bounty-bot
 
-GitHub bounty validation bot - controlled by Atlas via API.
+Automated GitHub bounty issue validation service. Bounty-bot receives GitHub webhook events (or polls for missed issues), runs every submission through a multi-stage detection pipeline — media checks, spam analysis, duplicate detection, edit-history fraud, and LLM-assisted scoring — then publishes a verdict back to GitHub with labels, comments, and issue state changes.
+
+Controlled by **Atlas** via a REST API with HMAC authentication. Callbacks are sent to Atlas on completion or failure.
 
 ## Architecture
 
-- Controlled by [PlatformNetwork/atlas](https://github.com/PlatformNetwork/atlas)
-- REST API on port 3235
-- SQLite persistence
-- GitHub integration for issue validation
+```mermaid
+flowchart TD
+    GH[GitHub Webhook] -->|issues.opened| WH[Webhook Receiver]
+    POLL[Poller] -->|missed events| INT[Intake]
+    ATLAS[Atlas API call] -->|trigger| INT
+    WH --> INT
+    INT -->|filter ≥ #41000| Q[Queue]
+    Q --> PIPE[Validation Pipeline]
+
+    subgraph Pipeline
+        PIPE --> MEDIA[Media Check]
+        MEDIA --> SPAM[Spam Detection]
+        SPAM --> DUP[Duplicate Detection]
+        DUP --> EDIT[Edit History]
+        EDIT --> LLM[LLM Validity Gate]
+    end
+
+    LLM --> VERD[Verdict Engine]
+    VERD -->|labels + comment| GH_MUT[GitHub Mutations]
+    VERD -->|webhook callback| ATLAS_CB[Atlas Callback]
+    VERD -->|persist| DB[(SQLite)]
+
+    Q -->|max retries| DL[Dead Letter]
+```
+
+## Quick Start
+
+### Prerequisites
+
+| Dependency | Version |
+|---|---|
+| Node.js | ≥ 20 |
+| Redis | any (default port 3231) |
+| Docker + Compose | optional, for containerised deployment |
+
+### Environment Variables
+
+Create a `.env` file (see [Configuration](#configuration) for the full list):
+
+```env
+GITHUB_TOKEN=ghp_...
+GITHUB_WEBHOOK_SECRET=your-webhook-secret
+INTER_SERVICE_HMAC_SECRET=shared-secret-with-atlas
+REDIS_URL=redis://localhost:3231
+OPENROUTER_API_KEY=sk-or-...
+ATLAS_WEBHOOK_URL=http://localhost:3230/webhooks
+TARGET_REPO=PlatformNetwork/bounty-challenge
+```
+
+### Docker Compose
+
+```bash
+docker compose up -d
+```
+
+The service starts on **port 3235** with SQLite data persisted in a Docker volume.
+
+### Development Mode
+
+```bash
+npm install
+npm run dev    # tsx hot-reload on src/index.ts
+npm run build  # compile TypeScript
+npm start      # run compiled output
+```
+
+## API Reference
+
+All `/api/v1/*` endpoints (except webhooks) require HMAC authentication via `X-Signature` and `X-Timestamp` headers.
+
+| Method | Endpoint | Auth | Description |
+|---|---|---|---|
+| `POST` | `/api/v1/validation/trigger` | HMAC | Trigger validation for an issue |
+| `GET` | `/api/v1/validation/:issue_number/status` | HMAC | Get processing status and verdict |
+| `POST` | `/api/v1/validation/:issue_number/requeue` | HMAC | Requeue for re-validation (24 h max, once per issue) |
+| `POST` | `/api/v1/validation/:issue_number/force-release` | HMAC | Clear a stale processing lock |
+| `GET` | `/api/v1/dead-letter` | HMAC | List dead-lettered items |
+| `POST` | `/api/v1/dead-letter/:id/recover` | HMAC | Re-enqueue a dead-letter item |
+| `POST` | `/api/v1/webhooks/github` | GitHub signature | Receive GitHub webhook events |
+| `GET` | `/health` | none | Liveness probe |
+| `GET` | `/ready` | none | Readiness probe |
+
+See [docs/API.md](docs/API.md) for full request/response schemas.
+
+## Configuration
+
+| Variable | Default | Description |
+|---|---|---|
+| `PORT` | `3235` | API listen port |
+| `DATA_DIR` | `./data` | SQLite data directory |
+| `SQLITE_PATH` | `<DATA_DIR>/bounty-bot.db` | Database file path |
+| `REDIS_URL` | `redis://localhost:3231` | Redis connection URL |
+| `GITHUB_TOKEN` || GitHub PAT for API requests |
+| `GITHUB_WEBHOOK_SECRET` || Secret for verifying GitHub webhook signatures |
+| `INTER_SERVICE_HMAC_SECRET` || Shared HMAC secret for Atlas ↔ bounty-bot auth |
+| `ATLAS_WEBHOOK_URL` | `http://localhost:3230/webhooks` | Atlas callback endpoint |
+| `TARGET_REPO` | `PlatformNetwork/bounty-challenge` | GitHub repo to validate (owner/repo) |
+| `OPENROUTER_API_KEY` || OpenRouter key for LLM scoring and embeddings |
+| `OPENROUTER_BASE_URL` | `https://openrouter.ai/api/v1` | OpenRouter API base URL |
+| `EMBEDDING_MODEL` | `qwen/qwen3-embedding-8b` | Model for semantic embeddings |
+| `LLM_SCORING_MODEL` | `google/gemini-3.1-pro-preview-customtools` | Model for issue evaluation |
+| `POLLER_INTERVAL` | `60000` | Missed-webhook poller interval (ms) |
+| `MAX_RETRIES` | `3` | Queue retry limit before dead-lettering |
+| `ISSUE_FLOOR` | `41000` | Minimum issue number to process |
+| `SPAM_THRESHOLD` | `0.7` | Spam score threshold (0–1) |
+| `DUPLICATE_THRESHOLD` | `0.75` | Duplicate similarity threshold (0–1) |
+| `REQUEUE_MAX_AGE_MS` | `86400000` | Max issue age for requeue eligibility (24 h) |
+| `WEBHOOK_MAX_RETRIES` | `3` | Atlas callback retry attempts |
+| `WEBHOOK_RETRY_DELAY_MS` | `1000` | Base delay between callback retries (ms) |
+
+## LLM Integration
+
+Bounty-bot uses two AI models via [OpenRouter](https://openrouter.ai) (OpenAI-compatible API):
+
+### Gemini 3.1 Pro Custom Tools — Issue Evaluation
+
+Model: `google/gemini-3.1-pro-preview-customtools`
+
+Used for full issue evaluation and borderline spam scoring. The model receives a system prompt describing the evaluation criteria and is forced to call a `deliver_verdict` function via OpenAI-style tool/function calling. Returns a structured verdict (`valid` / `invalid` / `duplicate`), confidence score, reasoning, and a public-facing recap.
+
+### Qwen3 Embedding 8B — Semantic Duplicate Detection
+
+Model: `qwen/qwen3-embedding-8b`
+
+Generates high-dimensional embedding vectors for issue text. These vectors are stored in SQLite and compared via cosine similarity to detect semantic duplicates that lexical fingerprinting might miss. The final duplicate score is a hybrid: `0.4 × Jaccard + 0.6 × cosine`.
+
+Both models gracefully degrade: if `OPENROUTER_API_KEY` is unset, the system falls back to lexical-only detection and skips LLM scoring.
+
+## Testing
+
+```bash
+npm test           # run all 149 tests (vitest)
+npm run typecheck  # TypeScript type checking
+npm run lint       # ESLint
+```
 
-## Endpoints
+## Further Documentation
 
-- `GET /health` - Health check
-- `POST /api/validate` - Trigger validation (Atlas → Bounty-bot)
-- `GET /api/status/:issue` - Check validation status
-- `POST /api/requeue` - Requeue issue for validation
+- [Architecture](docs/ARCHITECTURE.md) — system design, module graph, sequence diagrams, database schema
+- [API Reference](docs/API.md) — full REST API with request/response schemas
+- [Detection Engine](docs/DETECTION.md) — spam, duplicate, edit-history, and LLM scoring details
+- [Deployment](docs/DEPLOYMENT.md) — Docker, Redis, health checks, Atlas integration
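The README's API section says the `/api/v1/*` endpoints authenticate with HMAC via the `X-Signature` and `X-Timestamp` headers, but this diff does not show how the signature is computed. The sketch below illustrates one common scheme with Node's `crypto` module. It assumes that the signed payload is `<timestamp>.<body>`, that the digest is HMAC-SHA256 in hex, and that requests older than five minutes are rejected. All three are assumptions for illustration, as are the function names `signRequest` and `verifyRequest`. The actual scheme is whatever `docs/API.md` specifies.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// ASSUMPTION: payload layout `${timestamp}.${body}` with HMAC-SHA256, hex-encoded.
export function signRequest(secret: string, timestamp: string, body: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${body}`).digest("hex");
}

// ASSUMPTION: hypothetical 5-minute replay window.
const MAX_SKEW_MS = 5 * 60 * 1000;

// Returns true only if the timestamp is fresh and the signature matches.
export function verifyRequest(
  secret: string,
  timestamp: string, // value of the X-Timestamp header (epoch milliseconds assumed)
  body: string,      // raw request body, exactly as received
  signature: string, // value of the X-Signature header
  nowMs: number = Date.now(),
): boolean {
  const sentAt = Number(timestamp);
  if (!Number.isFinite(sentAt) || Math.abs(nowMs - sentAt) > MAX_SKEW_MS) {
    return false;
  }
  const expected = Buffer.from(signRequest(secret, timestamp, body), "hex");
  const received = Buffer.from(signature, "hex");
  // timingSafeEqual throws on unequal lengths, so check the length first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Comparing with `timingSafeEqual` rather than `===` avoids leaking how many leading characters of a forged signature were correct. Including the timestamp in the signed payload is what makes the replay window enforceable.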

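The README gives the hybrid duplicate score as `0.4 × Jaccard + 0.6 × cosine` with a default `DUPLICATE_THRESHOLD` of `0.75`. The sketch below works through that arithmetic. Only the weights and the threshold come from the README. The word-set tokenisation used for Jaccard, the `>=` comparison against the threshold, and the function names are assumptions. The project's own fingerprinting is described in `docs/DETECTION.md`, which is not part of this diff.

```typescript
// Jaccard similarity over lowercase word sets (assumed tokenisation).
export function jaccard(a: string, b: string): number {
  const tokens = (s: string): Set<string> =>
    new Set(s.toLowerCase().split(/\W+/).filter(Boolean));
  const setA = tokens(a);
  const setB = tokens(b);
  if (setA.size === 0 && setB.size === 0) return 0;
  let shared = 0;
  setA.forEach((t) => {
    if (setB.has(t)) shared++;
  });
  return shared / (setA.size + setB.size - shared);
}

// Cosine similarity between two embedding vectors of equal length.
export function cosine(u: number[], v: number[]): number {
  if (u.length !== v.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normU = 0;
  let normV = 0;
  for (let i = 0; i < u.length; i++) {
    dot += u[i] * v[i];
    normU += u[i] * u[i];
    normV += v[i] * v[i];
  }
  const denom = Math.sqrt(normU) * Math.sqrt(normV);
  return denom === 0 ? 0 : dot / denom;
}

// Weights taken from the README: 0.4 lexical, 0.6 semantic.
export function hybridScore(jaccardScore: number, cosineScore: number): number {
  return 0.4 * jaccardScore + 0.6 * cosineScore;
}

// ASSUMPTION: a score at or above the threshold counts as a duplicate.
export function isDuplicate(score: number, threshold = 0.75): boolean {
  return score >= threshold;
}
```

With these weights, a pair with a Jaccard score of 0.5 and a cosine score of 1.0 comes to 0.4 × 0.5 + 0.6 × 1.0 = 0.8, which is above the default threshold. A pair with the same lexical overlap but a cosine score of 0.8 comes to 0.68, which is below it.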
docker-compose.yml

Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
+# Bounty-bot Service - Docker Compose
+# Port allocation: 3235-3239 (Bounty-bot services)
+# Atlas uses 3230-3234 separately
+#
+# Watchtower polls GHCR every 60s and auto-restarts on new images.
+# CI pushes ghcr.io/platformnetwork/bounty-bot:latest on main.
+
+services:
+  bounty-bot:
+    image: ghcr.io/platformnetwork/bounty-bot:latest
+    build:
+      context: .
+      dockerfile: Dockerfile
+    container_name: bounty-bot
+    restart: unless-stopped
+    ports:
+      - "3235:3235"
+    environment:
+      - NODE_ENV=production
+      - PORT=3235
+      - DATA_DIR=/app/data
+      - REDIS_URL=redis://redis:6379
+      - TZ=UTC
+    volumes:
+      - bounty-data:/app/data
+    depends_on:
+      redis:
+        condition: service_healthy
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:3235/health"]
+      interval: 30s
+      timeout: 10s
+      retries: 3
+      start_period: 10s
+    networks:
+      - bounty-network
+    labels:
+      - "com.centurylinklabs.watchtower.enable=true"
+    deploy:
+      resources:
+        limits:
+          memory: 1G
+        reservations:
+          memory: 256M
+
+  redis:
+    image: redis:7-alpine
+    container_name: bounty-redis
+    restart: unless-stopped
+    command: ["redis-server", "--appendonly", "yes", "--maxmemory", "256mb", "--maxmemory-policy", "allkeys-lru"]
+    volumes:
+      - redis-data:/data
+    healthcheck:
+      test: ["CMD", "redis-cli", "ping"]
+      interval: 10s
+      timeout: 5s
+      retries: 5
+      start_period: 5s
+    networks:
+      - bounty-network
+    deploy:
+      resources:
+        limits:
+          memory: 512M
+        reservations:
+          memory: 64M
+
+  watchtower:
+    image: containrrr/watchtower:latest
+    container_name: bounty-watchtower
+    restart: unless-stopped
+    environment:
+      - WATCHTOWER_POLL_INTERVAL=60
+      - WATCHTOWER_CLEANUP=true
+      - WATCHTOWER_LABEL_ENABLE=true
+      - WATCHTOWER_ROLLING_RESTART=true
+      - WATCHTOWER_INCLUDE_RESTARTING=true
+      - WATCHTOWER_REVIVE_STOPPED=true
+      - WATCHTOWER_NOTIFICATIONS=shoutrrr
+      - WATCHTOWER_LIFECYCLE_HOOKS=true
+    volumes:
+      - /var/run/docker.sock:/var/run/docker.sock
+      - /root/.docker/config.json:/config.json:ro
+    networks:
+      - bounty-network
+    deploy:
+      resources:
+        limits:
+          memory: 128M
+
+networks:
+  bounty-network:
+    driver: bridge
+
+volumes:
+  bounty-data:
+    name: bounty-bot-data
+  redis-data:
+    name: bounty-redis-data
