Stranger Prompts

A continuous evaluation and iteration platform for AI prompt systems. Built with Next.js, Supabase, Prisma, and Inngest.

🚀 Live Demo: https://stranger-prompts.vercel.app

Features

  • Level 0 (Core): Define prompt systems, upload datasets, run evaluations, view results
  • Level 1 (Monitor): Schedule automated runs, detect drastic output changes, Slack DM alerts
  • Level 2 (Experimentation): Cross-model comparison, side-by-side results
  • Level 3 (Optimize): Iterative prompt optimization

Key Logic & Features

  • De-duplication Logic: To maintain a clean history, the system only creates a new version if the template or model configuration actually changes. If you save a prompt that matches an existing version, it simply links back to that one.
  • Genetic Optimization (Level 3): Iterative loop that improves prompts by analyzing failing examples.
    • Optimizer Model: Uses GPT-4o by default as the expert prompt engineer.
    • Process: Analyzes worst K failing rows per iteration, proposes incremental template improvements, and prunes candidates based on aggregate scores.
    • Safety: Automatically detects and rejects "reward-hacking" templates that overfit to specific test strings.
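The de-duplication check described above can be sketched as a content fingerprint over the fields that define a version's identity. This is an illustrative TypeScript sketch, not the repo's actual implementation; the `VersionIdentity` fields and function names are assumptions:

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of the fields that define a prompt version's identity.
interface VersionIdentity {
  template: string;
  model: string;
  temperature: number;
}

// Stable fingerprint of the template + model configuration. Saving a prompt
// whose fingerprint matches the latest version can link back to that version
// instead of creating a new row.
function versionFingerprint(v: VersionIdentity): string {
  // Stringify with an explicit key order so the hash is deterministic.
  const canonical = JSON.stringify({
    template: v.template,
    model: v.model,
    temperature: v.temperature,
  });
  return createHash("sha256").update(canonical).digest("hex");
}

function isDuplicate(a: VersionIdentity, b: VersionIdentity): boolean {
  return versionFingerprint(a) === versionFingerprint(b);
}
```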

Tech Stack

  • Frontend: Next.js 14 (App Router), React 18, TypeScript, Tailwind CSS, shadcn/ui
  • Auth: Supabase Auth (Google SSO)
  • Database: Supabase Postgres + Prisma ORM
  • Background Jobs: Inngest
  • LLM Providers: OpenAI, Anthropic, Google Gemini (BYOK)

Prerequisites

  • Node.js 18+
  • npm or yarn
  • Supabase account (free tier works)
  • Inngest account (free tier works)

Quick Start (Local Setup)

1. Clone and Install

cd /path/to/PromptOps
npm install

2. Set Up Supabase

  1. Create a new Supabase project at https://supabase.com
  2. Go to Authentication → Providers → Enable Google
  3. Configure Google OAuth credentials in your Google Cloud Console
  4. Get your Supabase credentials from Settings → API

3. Configure Environment

cp .env.example .env

Edit .env with your values:

# Supabase
NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key
SUPABASE_SERVICE_ROLE_KEY=your-service-role-key

# Database (from Supabase Settings → Database → Connection string)
DATABASE_URL=postgresql://postgres:password@db.your-project.supabase.co:5432/postgres

# Encryption key for API keys (generate with: openssl rand -hex 32)
ENCRYPTION_KEY=your-64-char-hex-key

# Optional: Slack Bot Token for DM alerts
SLACK_BOT_TOKEN=xoxb-your-slack-bot-token

4. Set Up Database

# Generate Prisma client
npm run db:generate

# Push schema to database
npm run db:push

# Seed with sample data
npm run seed

5. Start Development Server

In one terminal:

npm run dev

In another terminal (for background jobs):

npx inngest-cli@latest dev

6. Access the App

Open http://localhost:3000 in your browser and sign in with Google.

Production Deployment

This section covers deploying Stranger Prompts to production with Vercel, Supabase, and Inngest.

1. Google OAuth Setup

  1. Go to Google Cloud Console
  2. Create a new project (or select existing)
  3. Navigate to APIs & Services → OAuth consent screen
    • Choose "External" user type
    • Fill in app name, support email, and developer contact
    • Add scopes: email, profile, openid
  4. Go to APIs & Services → Credentials → Create Credentials → OAuth 2.0 Client ID
    • Application type: Web application
    • Name: Stranger Prompts
    • Authorized redirect URIs: https://YOUR-PROJECT.supabase.co/auth/v1/callback
  5. Copy the Client ID and Client Secret for Supabase setup

2. Supabase Setup

  1. Create a new project at supabase.com
  2. Configure Google Auth:
    • Go to Authentication → Providers → Google
    • Enable Google provider
    • Paste your Google Client ID and Client Secret
    • Save
  3. Get API Keys from Settings → API:
    NEXT_PUBLIC_SUPABASE_URL=https://YOUR-PROJECT.supabase.co
    NEXT_PUBLIC_SUPABASE_ANON_KEY=eyJ...
    SUPABASE_SERVICE_ROLE_KEY=eyJ...
    
  4. Get Database Connection Strings from Settings → Database → Connection string:
    • Transaction Pooler (for serverless/Vercel):
      DATABASE_URL=postgresql://postgres.YOUR-PROJECT:[PASSWORD]@aws-0-REGION.pooler.supabase.com:6543/postgres?pgbouncer=true
      
    • Direct Connection (for migrations):
      DIRECT_URL=postgresql://postgres.YOUR-PROJECT:[PASSWORD]@aws-0-REGION.pooler.supabase.com:5432/postgres
      

3. Inngest Setup

  1. Create an account at inngest.com
  2. Create a new app in the Inngest Dashboard
  3. Go to Manage → Signing Key to get your keys:
    INNGEST_EVENT_KEY=your-event-key
    INNGEST_SIGNING_KEY=signkey-prod-...
    
  4. Note: You'll configure the app URL after Vercel deployment

4. Vercel Deployment

  1. Deploy to Vercel:

    npm i -g vercel
    vercel

    Or connect your GitHub repo to Vercel for automatic deployments.

  2. Add Environment Variables in Vercel Dashboard → Settings → Environment Variables:

    • All variables from .env.example
    • Make sure DATABASE_URL uses the pooler connection with ?pgbouncer=true
  3. Run Database Migrations:

    # Locally with DIRECT_URL set
    npx prisma migrate deploy
  4. Configure Inngest App URL:

    • In Inngest Dashboard → Your App → App URL
    • Set to: https://your-app.vercel.app/api/inngest

5. Vercel Deployment Protection Bypass (Critical for Inngest)

If you have Vercel deployment protection enabled (preview deployments, password protection, etc.), Inngest won't be able to reach your /api/inngest endpoint. You must configure a bypass:

  1. In Vercel Dashboard:

    • Go to Settings → Deployment Protection
    • Scroll to Protection Bypass for Automation
    • Click Generate Secret
    • Copy the generated secret
  2. In Inngest Dashboard:

    • Go to your App → Settings
    • Find Vercel Protection Bypass
    • Paste the bypass secret
  3. This allows Inngest to invoke your background functions even when deployment protection is enabled.

6. Slack Integration Setup

Slack integration enables DM alerts for output change detection and quality regressions.

  1. Create a Slack App:

    • Go to api.slack.com/apps
    • Click Create New App → From scratch
    • Name: Stranger Prompts Alerts
    • Select your workspace
  2. Configure Bot Token Scopes:

    • Go to OAuth & Permissions → Scopes → Bot Token Scopes
    • Add these scopes:
      • users:read.email (lookup users by email)
      • chat:write (send messages)
      • im:write (open DM channels)
  3. Install to Workspace:

    • Go to OAuth & Permissions → Install to Workspace
    • Authorize the app
    • Copy the Bot User OAuth Token (xoxb-...)
    SLACK_BOT_TOKEN=xoxb-your-token
    
  4. Create Workspace Invite Link (Optional):

    • In Slack, go to your workspace settings
    • Create a shared invite link
    NEXT_PUBLIC_SLACK_INVITE_URL=https://join.slack.com/t/your-workspace/shared_invite/...
    
  5. User Setup:

    • Users must join your Slack workspace
    • In the app Settings page, users enter their Slack email address
    • They can test the integration with "Send Test DM"

Environment Variables Reference

Variable                       Required  Description
NEXT_PUBLIC_SUPABASE_URL       Yes       Supabase project URL
NEXT_PUBLIC_SUPABASE_ANON_KEY  Yes       Supabase anonymous/public key
SUPABASE_SERVICE_ROLE_KEY      Yes       Supabase service role key (server-side only)
DATABASE_URL                   Yes       Postgres connection string (use pooler with ?pgbouncer=true for Vercel)
DIRECT_URL                     Yes       Postgres direct connection (for migrations)
ENCRYPTION_KEY                 Yes       64-character hex key for BYOK encryption (openssl rand -hex 32)
INNGEST_EVENT_KEY              Yes       Inngest event key for sending events
INNGEST_SIGNING_KEY            Yes       Inngest signing key for webhook verification
SLACK_BOT_TOKEN                Optional  Slack bot token for DM alerts (xoxb-...)
NEXT_PUBLIC_SLACK_INVITE_URL   Optional  Slack workspace invite link (shown in Settings UI)
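ENCRYPTION_KEY is 64 hex characters, i.e. 32 bytes, which is the AES-256 key size. The repo's src/lib/crypto.ts is not reproduced here, but a minimal sketch of how such a key can encrypt BYOK provider keys with AES-256-GCM might look like the following (function names and payload format are hypothetical):

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// 64 hex chars = 32 bytes = AES-256 key size. The fallback here is a
// placeholder for illustration only; never use it in production.
const key = Buffer.from(process.env.ENCRYPTION_KEY ?? "00".repeat(32), "hex");

// Encrypts a provider API key; returns "iv:authTag:ciphertext" in hex.
function encryptApiKey(plaintext: string): string {
  const iv = randomBytes(12); // standard 96-bit GCM nonce
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const enc = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return [iv, cipher.getAuthTag(), enc].map((b) => b.toString("hex")).join(":");
}

function decryptApiKey(payload: string): string {
  const [iv, tag, enc] = payload.split(":").map((h) => Buffer.from(h, "hex"));
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag); // GCM authenticates as well as encrypts
  return Buffer.concat([decipher.update(enc), decipher.final()]).toString("utf8");
}
```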

UI Testing Walkthrough (Levels 0–2)

First-Time Setup

  1. Sign in with Google at http://localhost:3000
  2. Go to Settings → Add your OpenAI API key → Click "Save Key"
  3. Click "Test" on the saved key row to verify it works
  4. (Optional) Add your Slack Member ID for output change alerts

Claim Seeded Data

If you ran npm run seed, claim the demo data for your account:

# In the browser console (while signed in):
fetch('/api/dev/claim-demo-data', {method: 'POST'}).then(r => r.json()).then(console.log)

Refresh the dashboard to see "Movie Sentiment Classifier".


Level 0: Core Testing (via UI)

Step UI Action
1 Dashboard → Click "Movie Sentiment Classifier" (or create new system)
2 Datasets tab → Upload a CSV or JSONL file (download samples from the links)
3 Evaluations tab → Create an eval config (e.g., CONTAINS type)
4 Run Test tab → Select dataset + eval → Click "Run Now"
5 Watch "Recent Runs" panel update → Click a run to see row-by-row results

Level 1: Scheduling & Monitoring (via UI)

Step UI Action
1 System page → Run Test tab → Select dataset & eval
2 Set interval (e.g., 60 seconds for testing, or use presets: Hourly/Daily/Weekly)
3 Click "Save Schedule Configuration" to save your settings
4 Toggle the schedule switch ON to enable automatic runs
5 Watch Inngest dashboard (http://localhost:8288) → See system-scheduler function start
6 After 2+ scheduled runs, output-change-alert compares outputs
7 Check Notifications page for alerts (Dashboard → Notifications)
8 Toggle the schedule switch OFF when done

Note: The scheduler is event-driven—it only runs when scheduling is enabled for a system, not polling every minute.

Level 2: Cross-Model Comparison (via UI)

Step UI Action
1 System page → Compare Models tab
2 Select dataset and evaluation config
3 Click "+ Add Model" → Select provider/model (e.g., gpt-4o)
4 Add more models to compare (e.g., gpt-4o-mini, claude-3-haiku)
5 Click "Run Comparison"
6 Results appear below → Click each run to see detailed scores

Level 3: Optimization (Bonus, via UI)

Step UI Action
1 System page → Optimize tab
2 Select dataset and evaluation config
3 Set max iterations (1-10) and target score (0-1)
4 Click "Start Optimization"
5 Check Inngest dashboard for optimize function progress
6 New prompt versions created after each iteration

Sample Datasets

Download from UI or find in /public/sample/:

  • reviews.csv: Movie reviews → sentiment (positive/negative/neutral)
  • toxicity.jsonl: Text → toxicity classification (toxic/not_toxic)

Custom dataset requirements:

  • CSV: columns matching {{variables}} in prompt + expected column
  • JSONL: each line {"inputs": {...}, "expected": "..."}
  • Max 10,000 rows

Dataset Format

CSV

review,expected
"Great movie!",positive
"Terrible film.",negative

JSONL

{"review": "Great movie!", "expected": "positive"}
{"review": "Terrible film.", "expected": "negative"}

Required: Column/field named expected
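A hypothetical validator for the row formats above, accepting both the flat sample lines and the nested {"inputs": ...} shape mentioned earlier, and enforcing the required expected field (this is an illustrative sketch, not the repo's ingestion code):

```typescript
interface DatasetRow {
  inputs: Record<string, unknown>;
  expected: string;
}

// Parses one JSONL dataset line. Every row must be valid JSON and carry a
// string "expected" field; the remaining fields become prompt inputs.
function parseJsonlRow(line: string): DatasetRow {
  const obj = JSON.parse(line) as Record<string, unknown>;
  const { expected, inputs, ...rest } = obj;
  if (typeof expected !== "string") {
    throw new Error('Each row needs a string "expected" field');
  }
  // Accept both shapes: a nested {"inputs": {...}} object or flat fields.
  const resolved =
    inputs && typeof inputs === "object"
      ? (inputs as Record<string, unknown>)
      : rest;
  return { inputs: resolved, expected };
}
```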

Evaluation Types

Type Description
EXACT_MATCH Output must exactly match expected (case-insensitive)
CONTAINS Output must contain expected string
REGEX Output must match regex pattern
JSON_SCHEMA Output JSON must validate against schema
LLM_JUDGE LLM evaluates output quality (strict JSON response)
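The three string-based types above can be sketched as pure functions. This is an illustrative sketch, not the repo's eval/ module (JSON_SCHEMA and LLM_JUDGE are omitted because they need a schema validator and an LLM call, respectively):

```typescript
type EvalType = "EXACT_MATCH" | "CONTAINS" | "REGEX";

// Minimal versions of the string-based evaluation types.
function evaluate(type: EvalType, output: string, expected: string): boolean {
  switch (type) {
    case "EXACT_MATCH":
      // Per the table above, exact match is case-insensitive.
      return output.trim().toLowerCase() === expected.trim().toLowerCase();
    case "CONTAINS":
      return output.includes(expected);
    case "REGEX":
      return new RegExp(expected).test(output);
  }
}
```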

API Endpoints

Method Endpoint Description
GET/POST /api/systems List/create prompt systems
GET/PATCH /api/systems/:id Get/update system
POST /api/systems/:id/versions Create new version
POST /api/systems/:id/schedule Configure scheduling
GET/POST /api/datasets List/upload datasets
GET/POST /api/runs List/create test runs
GET /api/runs/:id/results Get paginated results
GET/POST /api/keys List/save API keys
POST /api/keys/test Test API key validity
POST /api/compare Start cross-model comparison
POST /api/optimize Start optimization loop
GET/PATCH /api/notifications List/mark notifications read
GET/PATCH /api/user Get/update user settings

Architecture

src/
├── app/                    # Next.js App Router pages
│   ├── api/               # API routes
│   ├── dashboard/         # Dashboard page
│   ├── login/             # Login page
│   └── settings/          # Settings page
├── components/ui/         # shadcn/ui components
└── lib/
    ├── inngest/           # Inngest functions
    │   ├── client.ts      # Inngest client + event definitions
    │   ├── index.ts       # Function exports
    │   └── functions/     # Background job handlers
    │       ├── dataset-ingest.ts    # Dataset ingestion
    │       ├── run-execute.ts       # Test run execution
    │       ├── system-scheduler.ts  # Event-driven scheduler (per-system)
    │       ├── output-change-alert.ts # Output change detection
    │       └── optimize.ts          # Prompt optimization loop
    ├── llm/               # LLM provider adapters
    ├── eval/              # Evaluation logic
    ├── supabase/          # Supabase client utilities
    ├── crypto.ts          # Encryption for BYOK
    ├── prisma.ts          # Prisma client
    ├── slack.ts           # Slack DM integration
    └── utils.ts           # Utility functions

Inngest Events

Event Description
dataset/ingest.requested Triggered when a dataset is uploaded
run/execute.requested Triggered to start a test run
run/completed Emitted when a run finishes (triggers output change detection)
system/schedule.started Starts the event-driven scheduler for a system
system/schedule.stopped Cancels the scheduler for a system
optimize/start.requested Starts the optimization loop

Robustness Features

  • Row-level fault tolerance: Individual row failures don't crash the run
  • Run failure threshold: Run marked FAILED only if >50% rows fail
  • Retry with backoff: Transient errors retry with exponential backoff + jitter
  • Idempotent results: Upsert on (runId, rowIndex) prevents duplicates
  • Pileup prevention: Scheduler skips if a run is already QUEUED/RUNNING
  • Concurrency limits: Per-user run limits, per-dataset ingestion limits
  • Event-driven scheduling: Schedulers only run when enabled, not polling every minute
  • Scheduler lifecycle management: Interval changes automatically restart the scheduler with new settings
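The retry-with-backoff behavior listed above can be sketched as a small helper. Attempt counts, delays, and names here are illustrative assumptions, not the repo's values:

```typescript
const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

// Retries fn on failure with exponential backoff and "full jitter":
// a random delay in [0, baseMs * 2^attempt) before each retry.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 4,
  baseMs = 200
): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      if (attempt < maxAttempts - 1) {
        await sleep(Math.random() * baseMs * 2 ** attempt);
      }
    }
  }
  throw lastErr;
}
```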

License

This project is licensed under the Business Source License 1.1 (BSL 1.1).

  • Permitted: Non-production use, internal use, modifications, contributions
  • Not Permitted: Offering as a competing Prompt Evaluation Service
  • Change Date: January 1, 2029 (converts to Apache 2.0)

See LICENSE for full terms. For commercial licensing inquiries, please contact the maintainer.
